jarxiv | Japanese arxiv | ページ 403

Multi-modal Integration Analysis of Alzheimer’s Disease Using Large Language Models and Knowledge Graphs

投稿日: 2025年5月22日作成者: jarxiv

要約

大規模な言語モデル（LLMS）と知識グラフを使用したアルツハイマー病（AD）の研究に断片化されたマルチモーダルデータを統合するための新しいフレームワークを提案します。
従来のマルチモーダル分析では、データセット全体で一致する患者IDが必要ですが、私たちのアプローチでは、MRI、遺伝子発現、バイオマーカー、EEG、および独立コホートからの臨床指標の人口レベルの統合が示されています。
統計分析では、知識グラフのノードとして接続された各モダリティの重要な機能を特定しました。
次に、LLMSはグラフを分析して潜在的な相関を抽出し、自然言語で仮説を生成しました。
このアプローチは、代謝リスク因子を神経炎症（r> 0.6、p <0.001）を介したタウタンパク質の異常を結びつける潜在的な経路、および正面EEGチャネルと特定の遺伝子発現プロファイルとの予期しない相関（r = 0.42-0.58、p <0.01）を含む、いくつかの新しい関係を明らかにしました。独立したデータセットとの相互検証により、主要な発見の堅牢性が確認され、コホート全体で一貫した効果サイズがあります（分散<15％）。これらの調査結果の再現性は、専門家のレビュー（CohenのK = 0.82）および計算検証によってさらにサポートされていました。私たちのフレームワークにより、患者IDマッチングを必要とせずに概念的レベルでクロスモーダル統合が可能になり、断片化されたデータの再利用と将来の研究のためのテスト可能な仮説を生成することにより、広告の病理を理解するための新しい可能性を提供します。

要約(オリジナル)

We propose a novel framework for integrating fragmented multi-modal data in Alzheimer’s disease (AD) research using large language models (LLMs) and knowledge graphs. While traditional multimodal analysis requires matched patient IDs across datasets, our approach demonstrates population-level integration of MRI, gene expression, biomarkers, EEG, and clinical indicators from independent cohorts. Statistical analysis identified significant features in each modality, which were connected as nodes in a knowledge graph. LLMs then analyzed the graph to extract potential correlations and generate hypotheses in natural language. This approach revealed several novel relationships, including a potential pathway linking metabolic risk factors to tau protein abnormalities via neuroinflammation (r>0.6, p<0.001), and unexpected correlations between frontal EEG channels and specific gene expression profiles (r=0.42-0.58, p<0.01). Cross-validation with independent datasets confirmed the robustness of major findings, with consistent effect sizes across cohorts (variance <15%). The reproducibility of these findings was further supported by expert review (Cohen's k=0.82) and computational validation. Our framework enables cross modal integration at a conceptual level without requiring patient ID matching, offering new possibilities for understanding AD pathology through fragmented data reuse and generating testable hypotheses for future research.

arxiv情報

著者	Kanan Kiguchi,Yunhao Tu,Katsuhiro Ajito,Fady Alnajjar,Kazuyuki Murase
発行日	2025-05-21 16:51:49+00:00
arxivサイト	arxiv_id(pdf)

提供元, 利用サービス

arxiv.jp, Google

カテゴリー: cs.AI, cs.LG, I.2.1 | コメントを受け付けていません

dMel: Speech Tokenization made Simple

投稿日: 2025年5月22日作成者: jarxiv

要約

大規模な言語モデルは、膨大なテキストデータに自己監視された事前供与を活用することにより、自然言語処理に革命をもたらしました。
この成功に触発されて、研究者は、連続音声信号を離散化するためのさまざまな圧縮ベースの音声トークン化方法を調査し、言語モデリング手法の個別のトークンへの適用を可能にしました。
ただし、オーディオコンプレッサーは追加の複雑さと計算コストを導入し、多くの場合、ドメイン外のオーディオ信号で失敗します。
この作業では、メルフィルターバンクチャネルを強度ビンに離散化する新しい音声表現（DMEL）を導入し、既存の音声トークン化方法と比較して、よりシンプルなさらに効果的な表現を作成します。
私たちのアプローチは、オーディオコンテンツの保存における優れたパフォーマンス、ドメイン外データへの堅牢性を示し、トレーニングのない自然でストリーミング可能な表現を提供します。
LOG-MELスペクトログラムの高次元性に対処するために、LMスタイルの変圧器アーキテクチャを使用した高次元トークンの効率的な並列エンコードおよびデコード方法を提案します。
このイノベーションにより、RichttsとRichasrを開発することができます。これは、同じアーキテクチャを共有しながら、特殊な既存の方法よりも同等またはより良い結果を達成しながら、同じアーキテクチャを共有しています。
我々の結果は、統一されたフレームワーク内の音声統合と認識タスクの両方で高性能を達成する際のDMELの有効性を示し、音声とテキストの効率的かつ効果的な共同モデリングのための道を開いています。

要約(オリジナル)

Large language models have revolutionized natural language processing by leveraging self-supervised pretraining on vast textual data. Inspired by this success, researchers have investigated various compression-based speech tokenization methods to discretize continuous speech signals, enabling the application of language modeling techniques to discrete tokens. However, audio compressor introduces additional complexity and computational cost, and often fail on out-of-domain audio signals. In this work, we introduce a novel speech representation (dmel) that discretizes mel-filterbank channels into intensity bins, creating a simpler yet more effective representation compared to existing speech tokenization methods. Our approach demonstrates superior performance in preserving audio content, robustness to out-of-domain data, and offers a training-free, natural, and streamable representation. To address the high-dimensional nature of log-mel spectrograms, we propose an efficient parallel encoding and decoding method for high-dimensional tokens using an LM-style transformer architecture. This innovation enables us to develop RichTTS and RichASR, two models sharing the same architecture while achieving comparable or better results than specialized existing methods. Our results demonstrate the effectiveness of dmel in achieving high performance on both speech synthesis and recognition tasks within a unified framework, paving the way for efficient and effective joint modeling of speech and text.

arxiv情報

著者	Richard He Bai,Tatiana Likhomanenko,Ruixiang Zhang,Zijin Gu,Zakaria Aldeneh,Navdeep Jaitly
発行日	2025-05-21 16:55:34+00:00
arxivサイト	arxiv_id(pdf)

提供元, 利用サービス

arxiv.jp, Google

カテゴリー: cs.AI, cs.CL, cs.SD, eess.AS | コメントを受け付けていません

Scalable Defense against In-the-wild Jailbreaking Attacks with Safety Context Retrieval

投稿日: 2025年5月22日作成者: jarxiv

要約

大規模な言語モデル（LLM）は、侵入攻撃に対して脆弱であることが知られており、敵は有害または非倫理的な反応を誘発するために慎重に設計されたプロンプトを活用します。
このような脅威は、実際の展開におけるLLMの安全性と信頼性に関する重要な懸念を提起しました。
既存の防衛メカニズムはそのようなリスクを部分的に軽減しますが、敵対的な技術のその後の進歩により、新しい脱獄方法がこれらの保護を回避し、静的防衛枠組みの制限を明らかにしました。
この作業では、コンテキスト検索のレンズを通じて進化する刑務所の脅威に対する防御を探ります。
第一に、特定の脱獄に対して最小限の安全整列例でさえ、この攻撃パターンに対する堅牢性を大幅に高めることができることを実証する予備研究を実施します。
この洞察に基づいて、私たちはさらに検索された生成（RAG）テクニックを活用し、安全性コンテキスト検索（SCR）を提案します。
当社の包括的な実験は、SCRが確立された脱獄戦術と新興の両方の戦術の両方に対して優れた防御パフォーマンスを達成し、LLMの安全に新しいパラダイムを提供する方法を示しています。
私たちのコードは公開時に利用可能になります。

要約(オリジナル)

Large Language Models (LLMs) are known to be vulnerable to jailbreaking attacks, wherein adversaries exploit carefully engineered prompts to induce harmful or unethical responses. Such threats have raised critical concerns about the safety and reliability of LLMs in real-world deployment. While existing defense mechanisms partially mitigate such risks, subsequent advancements in adversarial techniques have enabled novel jailbreaking methods to circumvent these protections, exposing the limitations of static defense frameworks. In this work, we explore defending against evolving jailbreaking threats through the lens of context retrieval. First, we conduct a preliminary study demonstrating that even a minimal set of safety-aligned examples against a particular jailbreak can significantly enhance robustness against this attack pattern. Building on this insight, we further leverage the retrieval-augmented generation (RAG) techniques and propose Safety Context Retrieval (SCR), a scalable and robust safeguarding paradigm for LLMs against jailbreaking. Our comprehensive experiments demonstrate how SCR achieves superior defensive performance against both established and emerging jailbreaking tactics, contributing a new paradigm to LLM safety. Our code will be available upon publication.

arxiv情報

著者	Taiye Chen,Zeming Wei,Ang Li,Yisen Wang
発行日	2025-05-21 16:58:14+00:00
arxivサイト	arxiv_id(pdf)

提供元, 利用サービス

arxiv.jp, Google

カテゴリー: cs.AI, cs.CL, cs.CR, cs.LG | コメントを受け付けていません

Improving planning and MBRL with temporally-extended actions

投稿日: 2025年5月22日作成者: jarxiv

要約

連続時間システムは、多くの場合、個別のタイムダイナミクスを使用してモデル化されますが、これには精度を維持するために小さなシミュレーションステップが必要です。
次に、これには、計算的に厳しい計画の問題とパフォーマンスの低下につながる大規模な計画地平線が必要です。
モデルの自由補強学習における以前の研究は、離散アクション期間を決定するためにポリシーが学習されるアクションリピートを使用して、この問題に部分的に対処されています。
代わりに、一時的に拡張されたアクションを使用し、プランナーにアクションの持続時間を標準のアクション変数とともに追加の最適化変数として扱うことにより、連続決定タイムスケールを直接制御することを提案します。
この追加構造には複数の利点があります。
軌跡のシミュレーション時間を高速化し、重要なことに、プランナーで浅い検索深さを使用しながら、原始的なアクションの観点から深い地平線検索を可能にすることです。
さらに、モデルベースの強化学習（MBRL）設定では、モデル学習から複合エラーを減らし、モデルのトレーニング時間を改善します。
このアイデアは効果的であり、アクション期間の範囲を、マルチアームの盗賊策定を使用して自動的に選択し、MBRLフレームワークに統合できることを示します。
計画とMBRLの両方での広範な実験的評価は、私たちのアプローチがより速い計画、より良い解決策をもたらし、標準式で解決されていない問題に対する解決策を可能にすることを示しています。

要約(オリジナル)

Continuous time systems are often modeled using discrete time dynamics but this requires a small simulation step to maintain accuracy. In turn, this requires a large planning horizon which leads to computationally demanding planning problems and reduced performance. Previous work in model free reinforcement learning has partially addressed this issue using action repeats where a policy is learned to determine a discrete action duration. Instead we propose to control the continuous decision timescale directly by using temporally-extended actions and letting the planner treat the duration of the action as an additional optimization variable along with the standard action variables. This additional structure has multiple advantages. It speeds up simulation time of trajectories and, importantly, it allows for deep horizon search in terms of primitive actions while using a shallow search depth in the planner. In addition, in the model based reinforcement learning (MBRL) setting, it reduces compounding errors from model learning and improves training time for models. We show that this idea is effective and that the range for action durations can be automatically selected using a multi-armed bandit formulation and integrated into the MBRL framework. An extensive experimental evaluation both in planning and in MBRL, shows that our approach yields faster planning, better solutions, and that it enables solutions to problems that are not solved in the standard formulation.

arxiv情報

著者	Palash Chatterjee,Roni Khardon
発行日	2025-05-21 16:59:32+00:00
arxivサイト	arxiv_id(pdf)

提供元, 利用サービス

arxiv.jp, Google

カテゴリー: cs.AI, cs.LG, cs.RO | コメントを受け付けていません

How Managers Perceive AI-Assisted Conversational Training for Workplace Communication

投稿日: 2025年5月22日作成者: jarxiv

要約

効果的な職場コミュニケーションは管理の成功に不可欠ですが、多くのマネージャーは、テーラードと持続的なトレーニングへのアクセスがありません。
AIアシスト通信システムはスケーラブルなトレーニングソリューションを提供する可能性がありますが、マネージャーがコミュニケーションスキルの向上を支援する上でAIの役割をどのように想定するかについてはほとんど知られていません。
これを調査するために、マネージャーがAIを使用してコミュニケーションスキルを練習することをどのように予想するかを理解するための機能的なプローブとして、会話のロールプレイシステムCommcoachを設計しました。
半構造化されたインタビューを通じて、参加者は、難しい職場での会話を実践するための適応的で低リスクシミュレーションの価値を強調しました。
彼らはまた、人間のチームのチーム化、透明性とコンテキストを意識するフィードバック、AI生成されたペルソナのより大きな制御など、機会を強調しました。
AIアシストされたコミュニケーショントレーニングは、パーソナライズ、構造化された学習目標、およびさまざまなユーザースタイルとコンテキストへの適応性のバランスをとる必要があります。
ただし、これを達成するには、適応的および一貫したAIフィードバック、リアリズムと潜在的なバイアス、およびAIの会話と構造化された職場の談話の自由な性質の間の緊張を慎重にナビゲートする必要があります。

要約(オリジナル)

Effective workplace communication is essential for managerial success, yet many managers lack access to tailored and sustained training. Although AI-assisted communication systems may offer scalable training solutions, little is known about how managers envision the role of AI in helping them improve their communication skills. To investigate this, we designed a conversational role-play system, CommCoach, as a functional probe to understand how managers anticipate using AI to practice their communication skills. Through semi-structured interviews, participants emphasized the value of adaptive, low-risk simulations for practicing difficult workplace conversations. They also highlighted opportunities, including human-AI teaming, transparent and context-aware feedback, and greater control over AI-generated personas. AI-assisted communication training should balance personalization, structured learning objectives, and adaptability to different user styles and contexts. However, achieving this requires carefully navigating tensions between adaptive and consistent AI feedback, realism and potential bias, and the open-ended nature of AI conversations versus structured workplace discourse.

arxiv情報

著者	Lance T. Wilhelm,Xiaohan Ding,Kirk McInnis Knutsen,Buse Carik,Eugenia H. Rho
発行日	2025-05-21 16:59:56+00:00
arxivサイト	arxiv_id(pdf)

提供元, 利用サービス

arxiv.jp, Google

カテゴリー: cs.AI, cs.HC | コメントを受け付けていません

SWE-smith: Scaling Data for Software Engineering Agents

投稿日: 2025年5月22日作成者: jarxiv

要約

ソフトウェアエンジニアリングの言語モデル（LMS）の最近の進歩にもかかわらず、トレーニングデータを収集することは依然として重要な問題です。
既存のデータセットは小さく、11以下のGitHubリポジトリから最大1,000のトレーニングインスタンスがあります。
このようなデータセットをキュレートする手順はしばしば複雑であり、数百時間の人間の労働を必要とします。
コンパニオンの実行環境も、スケーラビリティと使いやすさを厳しく制限し、数テラバイトのストレージを取り上げます。
この問題点に対処するために、SWE-SMITHを紹介します。SWE-Smithは、大規模にソフトウェアエンジニアリングトレーニングデータを生成するための新しいパイプラインを紹介します。
Pythonコードベースを考慮して、SWE-SMITHは対応する実行環境を構築し、コードベースの既存のテストを破る100〜1,000のタスクインスタンスを自動的に合成します。
SWE-SMITHを使用して、128のGitHubリポジトリから供給された50Kインスタンスのデータセットを作成します。
SWE-Agent-LM-32Bをトレーニングし、SWEベンチ検証ベンチマークで40.2％パス@1解決レートを達成しました。これは、オープンソースモデルの最先端です。
自動ソフトウェアエンジニアリングのためのLMシステムでの研究の障壁を下げるために、SWE-SWESMITH（収集手順、タスクインスタンス、軌跡、モデル）をオープンします。
https://swesmith.comで利用可能なすべての資産。

要約(オリジナル)

Despite recent progress in Language Models (LMs) for software engineering, collecting training data remains a significant pain point. Existing datasets are small, with at most 1,000s of training instances from 11 or fewer GitHub repositories. The procedures to curate such datasets are often complex, necessitating hundreds of hours of human labor; companion execution environments also take up several terabytes of storage, severely limiting their scalability and usability. To address this pain point, we introduce SWE-smith, a novel pipeline for generating software engineering training data at scale. Given any Python codebase, SWE-smith constructs a corresponding execution environment, then automatically synthesizes 100s to 1,000s of task instances that break existing test(s) in the codebase. Using SWE-smith, we create a dataset of 50k instances sourced from 128 GitHub repositories, an order of magnitude larger than all previous works. We train SWE-agent-LM-32B, achieving 40.2% Pass@1 resolve rate on the SWE-bench Verified benchmark, state of the art among open source models. We open source SWE-smith (collection procedure, task instances, trajectories, models) to lower the barrier of entry for research in LM systems for automated software engineering. All assets available at https://swesmith.com.

arxiv情報

著者	John Yang,Kilian Leret,Carlos E. Jimenez,Alexander Wettig,Kabir Khandpur,Yanzhe Zhang,Binyuan Hui,Ofir Press,Ludwig Schmidt,Diyi Yang
発行日	2025-05-21 17:21:45+00:00
arxivサイト	arxiv_id(pdf)

提供元, 利用サービス

arxiv.jp, Google

カテゴリー: cs.AI, cs.CL, cs.SE | コメントを受け付けていません

Soft Thinking: Unlocking the Reasoning Potential of LLMs in Continuous Concept Space

投稿日: 2025年5月22日作成者: jarxiv

要約

人間の認知は、通常、個別の言語トークンを厳密に使用するのではなく、抽象的で流動的な概念を通して考えることを伴います。
ただし、現在の推論モデルは、人間の言語の境界内で推論に制約され、セマンティックスペースの固定点を表す離散トークンエンミングを処理します。
この個別の制約は、標準的なチェーンオブシェア（COT）メソッドがステップごとに1つのトークンのサンプリングに依存することに依存しているため、このような推論モデルの表現力と上位の可能性を制限し、推論パスの不完全な調査を引き起こすことがよくあります。
この作業では、連続コンセプト空間でソフトで抽象的な概念トークンを生成することにより、人間のような「ソフト」な推論をエミュレートするトレーニングなしの方法であるソフト思考を紹介します。
これらのコンセプトトークンは、トークン埋め込みの確率加重された混合物によって作成され、連続概念空間を形成し、従来の個別の境界を超越する滑らかな遷移とより豊富な表現を可能にします。
本質的に、生成された各概念トークンは、関連する離散トークンからの複数の意味をカプセル化し、暗黙的にさまざまな推論パスを検討して、正解に効果的に収束します。
多様な数学とコーディングのベンチマークに関する経験的評価は、ソフト思考の有効性と効率性を一貫して実証し、パス@1の精度を最大2.48ポイント改善し、同時に標準のCOTと比較してトークンの使用量を最大22.4％削減します。
定性分析は、ソフト思考の出力が非常に解釈可能で読みやすいままであり、個別の言語ベースの推論の固有のボトルネックを破るソフト思考の可能性を強調していることをさらに明らかにしています。
コードはhttps://github.com/eric-ai-lab/soft-thinkingで入手できます。

要約(オリジナル)

Human cognition typically involves thinking through abstract, fluid concepts rather than strictly using discrete linguistic tokens. Current reasoning models, however, are constrained to reasoning within the boundaries of human language, processing discrete token embeddings that represent fixed points in the semantic space. This discrete constraint restricts the expressive power and upper potential of such reasoning models, often causing incomplete exploration of reasoning paths, as standard Chain-of-Thought (CoT) methods rely on sampling one token per step. In this work, we introduce Soft Thinking, a training-free method that emulates human-like ‘soft’ reasoning by generating soft, abstract concept tokens in a continuous concept space. These concept tokens are created by the probability-weighted mixture of token embeddings, which form the continuous concept space, enabling smooth transitions and richer representations that transcend traditional discrete boundaries. In essence, each generated concept token encapsulates multiple meanings from related discrete tokens, implicitly exploring various reasoning paths to converge effectively toward the correct answer. Empirical evaluations on diverse mathematical and coding benchmarks consistently demonstrate the effectiveness and efficiency of Soft Thinking, improving pass@1 accuracy by up to 2.48 points while simultaneously reducing token usage by up to 22.4% compared to standard CoT. Qualitative analysis further reveals that Soft Thinking outputs remain highly interpretable and readable, highlighting the potential of Soft Thinking to break the inherent bottleneck of discrete language-based reasoning. Code is available at https://github.com/eric-ai-lab/Soft-Thinking.

arxiv情報

著者	Zhen Zhang,Xuehai He,Weixiang Yan,Ao Shen,Chenyang Zhao,Shuohang Wang,Yelong Shen,Xin Eric Wang
発行日	2025-05-21 17:29:15+00:00
arxivサイト	arxiv_id(pdf)

提供元, 利用サービス

arxiv.jp, Google

カテゴリー: cs.AI, cs.CL | コメントを受け付けていません

Large Language Models as Computable Approximations to Solomonoff Induction

投稿日: 2025年5月22日作成者: jarxiv

要約

大規模な言語モデル（LLMS）の急速な進歩は、経験的な成功を説明するために厳格な理論的枠組みを求めています。
LLMの動作を理解する際に大きな進歩がありましたが、既存の理論的フレームワークは、統一された数学レンズを介して緊急現象を説明する際に断片化されたままです。
2つの基本的な結果を証明することにより、LLMアーキテクチャとアルゴリズム情報理論（AIT）の間の最初の正式なつながりを確立します。（1）トレーニングプロセスは、プログラム長の最適化として解釈される損失最小化を通じてソロモノフを計算することを計算します。
AITを活用して、コンテキスト内学習、少数のショット学習、およびスケーリング法の統一された理論的説明を提供します。
さらに、私たちの理論的洞察は、モデルがより低い予測信頼性を示すサンプルに優先順位を付ける、少数のショット例選択の原則的な方法につながります。
多様なテキスト分類ベンチマークに関する実験を通じて、この戦略が、特に高いモデルアーキテクチャに対して、高い自信の例を選択するのと比較して、大幅なパフォーマンスの改善をもたらすことを実証します。
私たちのフレームワークは、理論的基礎と実用的なLLM行動の間のギャップを埋め、将来のモデル開発のための説明力と実用的な洞察の両方を提供します。

要約(オリジナル)

The rapid advancement of large language models (LLMs) calls for a rigorous theoretical framework to explain their empirical success. While significant progress has been made in understanding LLM behaviors, existing theoretical frameworks remain fragmented in explaining emergent phenomena through a unified mathematical lens. We establish the first formal connection between LLM architectures and Algorithmic Information Theory (AIT) by proving two fundamental results: (1) the training process computationally approximates Solomonoff prior through loss minimization interpreted as program length optimization, and (2) next-token prediction implements approximate Solomonoff induction. We leverage AIT to provide a unified theoretical explanation for in-context learning, few-shot learning, and scaling laws. Furthermore, our theoretical insights lead to a principled method for few-shot example selection that prioritizes samples where models exhibit lower predictive confidence. We demonstrate through experiments on diverse text classification benchmarks that this strategy yields significant performance improvements, particularly for smaller model architectures, when compared to selecting high-confidence examples. Our framework bridges the gap between theoretical foundations and practical LLM behaviors, providing both explanatory power and actionable insights for future model development.

arxiv情報

著者	Jun Wan,Lingrui Mei
発行日	2025-05-21 17:35:08+00:00
arxivサイト	arxiv_id(pdf)

提供元, 利用サービス

arxiv.jp, Google

カテゴリー: cs.AI, cs.CL, cs.LG | コメントを受け付けていません

A Modular Approach for Clinical SLMs Driven by Synthetic Data with Pre-Instruction Tuning, Model Merging, and Clinical-Tasks Alignment

投稿日: 2025年5月22日作成者: jarxiv

要約

GPT-4などの大規模な言語モデルの高い計算コストと遅延により、臨床環境での展開が制限されています。
小言語モデル（SLMS）は費用対効果の高い代替品を提供しますが、容量が限られているには生物医学的ドメインの適応が必要であり、依然として困難です。
追加のボトルネックは、臨床データの利用不能と高い感度です。
これらの課題に対処するために、SLMを高性能な臨床モデルに適応させるための新しいフレームワークを提案します。
私たちの新しいフレームワークで開発された3.8BパラメーターSLMのMediphiコレクションを紹介します。関連する医療および臨床コーパス（PMC、医療ガイドライン、Medwikiなど）、モデルのマージ、および臨床タスクの調整に関する専門家のインストラクション前調整です。
ほとんどの臨床タスクをカバーするために、手がかりのベンチマークを手がかり+に拡張し、そのサイズを2倍にしました。
当社のエキスパートモデルは、タスク固有の微調整なしでベースモデル上のこのベンチマークの相対的な改善を提供します：医療機関で64.3％、放射線レポートで49.5％、ICD-10コーディングで44％（GPT-4-0125を14％上回る）。
モデルの合併を介して専門家モデルをMediphiに統一し、ベンチマーク全体で利益を維持します。
さらに、Mediflowコレクション、14の医療NLPタスクに関する250万の高品質の命令の合成データセット、98の微調整されたドキュメントタイプ、およびJSON形式のサポートを構築しました。
監視された微調整と直接選好の最適化を使用したMediphiのアライメントは、平均で18.9％のさらなる利益を達成します。

要約(オリジナル)

High computation costs and latency of large language models such as GPT-4 have limited their deployment in clinical settings. Small language models (SLMs) offer a cost-effective alternative, but their limited capacity requires biomedical domain adaptation, which remains challenging. An additional bottleneck is the unavailability and high sensitivity of clinical data. To address these challenges, we propose a novel framework for adapting SLMs into high-performing clinical models. We introduce the MediPhi collection of 3.8B-parameter SLMs developed with our novel framework: pre-instruction tuning of experts on relevant medical and clinical corpora (PMC, Medical Guideline, MedWiki, etc.), model merging, and clinical-tasks alignment. To cover most clinical tasks, we extended the CLUE benchmark to CLUE+, doubling its size. Our expert models deliver relative improvements on this benchmark over the base model without any task-specific fine-tuning: 64.3% on medical entities, 49.5% on radiology reports, and 44% on ICD-10 coding (outperforming GPT-4-0125 by 14%). We unify the expert models into MediPhi via model merging, preserving gains across benchmarks. Furthermore, we built the MediFlow collection, a synthetic dataset of 2.5 million high-quality instructions on 14 medical NLP tasks, 98 fine-grained document types, and JSON format support. Alignment of MediPhi using supervised fine-tuning and direct preference optimization achieves further gains of 18.9% on average.

arxiv情報

著者	Jean-Philippe Corbeil,Amin Dada,Jean-Michel Attendu,Asma Ben Abacha,Alessandro Sordoni,Lucas Caccia,François Beaulieu,Thomas Lin,Jens Kleesiek,Paul Vozila
発行日	2025-05-21 17:36:21+00:00
arxivサイト	arxiv_id(pdf)

提供元, 利用サービス

arxiv.jp, Google

カテゴリー: cs.AI, cs.CL | コメントを受け付けていません

Spatiotemporal Field Generation Based on Hybrid Mamba-Transformer with Physics-informed Fine-tuning

投稿日: 2025年5月22日作成者: jarxiv

要約

この研究は、データ駆動型の訓練されたモデルを通じて時空間的物理フィールドの生成で遭遇する実質的な物理的方程式の矛盾の課題に直面しています。
HMT-PFという名前の時空の物理フィールド生成モデルは、非構造化グリッド情報を入力として組み込んだハイブリッドMAMBA-TRANSFORMERアーキテクチャに基づいて開発されています。
物理情報で強化された微調整ブロックが、物理的方程式の矛盾を効果的に減らすために導入されます。
物理方程式残差は、効率的な勾配評価のためにポイントクエリメカニズムを通じて計算され、洗練された潜在空間にエンコードされます。
微調整プロセスでは、自己監視された学習アプローチを採用して、必須のフィールド特性を維持しながら、身体的一貫性を実現します。
結果は、ハイブリッドマンバ変換モデルが時空間フィールドの生成において優れたパフォーマンスを達成し、物理学に基づいた微調整メカニズムが重要な物理的エラーを効果的に減少させることを示しています。
MSE-R評価方法は、物理フィールド生成の精度とリアリズムを評価するために開発されています。

要約(オリジナル)

This research confronts the challenge of substantial physical equation discrepancies encountered in the generation of spatiotemporal physical fields through data-driven trained models. A spatiotemporal physical field generation model, named HMT-PF, is developed based on the hybrid Mamba-Transformer architecture, incorporating unstructured grid information as input. A fine-tuning block, enhanced with physical information, is introduced to effectively reduce the physical equation discrepancies. The physical equation residuals are computed through a point query mechanism for efficient gradient evaluation, then encoded into latent space for refinement. The fine-tuning process employs a self-supervised learning approach to achieve physical consistency while maintaining essential field characteristics. Results show that the hybrid Mamba-Transformer model achieves good performance in generating spatiotemporal fields, while the physics-informed fine-tuning mechanism further reduces significant physical errors effectively. A MSE-R evaluation method is developed to assess the accuracy and realism of physical field generation.

arxiv情報

著者	Peimian Du,Jiabin Liu,Xiaowei Jin,Mengwang Zuo,Hui Li
発行日	2025-05-21 17:36:58+00:00
arxivサイト	arxiv_id(pdf)

提供元, 利用サービス

arxiv.jp, Google

カテゴリー: cs.AI, cs.LG, physics.comp-ph | コメントを受け付けていません

要約

要約(オリジナル)

arxiv情報

提供元, 利用サービス

要約

要約(オリジナル)

arxiv情報

提供元, 利用サービス

要約

要約(オリジナル)

arxiv情報

提供元, 利用サービス

要約

要約(オリジナル)

arxiv情報

提供元, 利用サービス

要約

要約(オリジナル)

arxiv情報

提供元, 利用サービス

要約

要約(オリジナル)

arxiv情報

提供元, 利用サービス

要約

要約(オリジナル)

arxiv情報

提供元, 利用サービス

要約

要約(オリジナル)

arxiv情報

提供元, 利用サービス

要約

要約(オリジナル)

arxiv情報

提供元, 利用サービス

要約

要約(オリジナル)

arxiv情報

提供元, 利用サービス

最近の投稿

最近のコメント