jarxiv | Japanese arxiv | Page 1615

FetDTIAlign: A Deep Learning Framework for Affine and Deformable Registration of Fetal Brain dMRI

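Posted: February 25, 2025 | Author: jarxiv

Summary

Diffusion MRI (dMRI) offers unique insight into fetal brain microstructure in utero.
Longitudinal and cross-sectional fetal dMRI studies can reveal important neurodevelopmental changes, but they require accurate spatial alignment across scans and subjects.
This is difficult because of low data quality, rapid brain development, and limited anatomical landmarks.
Existing registration methods, designed for high-quality adult data, struggle with these complexities.
To address this, we introduce FetDTIAlign, a deep learning approach to fetal brain dMRI registration that enables accurate affine and deformable alignment.
FetDTIAlign uses a dual-encoder architecture and iterative feature-based inference to reduce the impact of noise and low resolution.
It optimizes the network configuration and domain-specific features at each registration stage, improving both robustness and accuracy.
We validated FetDTIAlign on data from 23 to 36 weeks of gestation, covering 60 white matter tracts.
It consistently outperformed two classical optimization-based methods and a deep learning pipeline, achieving better anatomical correspondence.
Further validation on external data from the Developing Human Connectome Project confirmed its generalizability across acquisition protocols.
Our results demonstrate the feasibility of deep learning for fetal brain dMRI registration and provide a more accurate and reliable alternative to classical techniques.
By enabling precise cross-subject and tract-specific analyses, FetDTIAlign supports new discoveries in early brain development.

Summary (original)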

Diffusion MRI (dMRI) provides unique insights into fetal brain microstructure in utero. Longitudinal and cross-sectional fetal dMRI studies can reveal crucial neurodevelopmental changes but require precise spatial alignment across scans and subjects. This is challenging due to low data quality, rapid brain development, and limited anatomical landmarks. Existing registration methods, designed for high-quality adult data, struggle with these complexities. To address this, we introduce FetDTIAlign, a deep learning approach for fetal brain dMRI registration, enabling accurate affine and deformable alignment. FetDTIAlign features a dual-encoder architecture and iterative feature-based inference, reducing the impact of noise and low resolution. It optimizes network configurations and domain-specific features at each registration stage, enhancing both robustness and accuracy. We validated FetDTIAlign on data from 23 to 36 weeks gestation, covering 60 white matter tracts. It consistently outperformed two classical optimization-based methods and a deep learning pipeline, achieving superior anatomical correspondence. Further validation on external data from the Developing Human Connectome Project confirmed its generalizability across acquisition protocols. Our results demonstrate the feasibility of deep learning for fetal brain dMRI registration, providing a more accurate and reliable alternative to classical techniques. By enabling precise cross-subject and tract-specific analyses, FetDTIAlign supports new discoveries in early brain development.

arXiv information

Authors	Bo Li, Qi Zeng, Simon K. Warfield, Davood Karimi
Published	2025-02-24 17:55:45+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google

Categories: cs.AI, eess.IV

Score Change of Variables

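Posted: February 25, 2025 | Author: jarxiv

Summary

We derive a general change of variables formula for score functions, showing that for a smooth, invertible transformation $\mathbf{y} = \phi(\mathbf{x})$, the transformed score function $\nabla_{\mathbf{y}} \log q(\mathbf{y})$ can be expressed directly in terms of $\nabla_{\mathbf{x}} \log p(\mathbf{x})$.
Using this result, we develop two applications. First, we establish a reverse-time Itô lemma for score-based diffusion models, which allows $\nabla_{\mathbf{x}} \log p_t(\mathbf{x})$ to be used to reverse an SDE in the transformed space without directly learning $\nabla_{\mathbf{y}} \log q_t(\mathbf{y})$.
This approach makes it possible to train a diffusion model in one space and sample in another, effectively decoupling the forward and reverse processes.
Second, we introduce generalized sliced score matching, which extends traditional sliced score matching from linear projections to arbitrary smooth transformations.
This gives greater flexibility in high-dimensional density estimation.
We demonstrate these theoretical advances through an application to diffusion on the probability simplex, and we empirically compare the generalized score matching approach with traditional sliced score matching methods.

Summary (original)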

We derive a general change of variables formula for score functions, showing that for a smooth, invertible transformation $\mathbf{y} = \phi(\mathbf{x})$, the transformed score function $\nabla_{\mathbf{y}} \log q(\mathbf{y})$ can be expressed directly in terms of $\nabla_{\mathbf{x}} \log p(\mathbf{x})$. Using this result, we develop two applications: First, we establish a reverse-time Itô lemma for score-based diffusion models, allowing the use of $\nabla_{\mathbf{x}} \log p_t(\mathbf{x})$ to reverse an SDE in the transformed space without directly learning $\nabla_{\mathbf{y}} \log q_t(\mathbf{y})$. This approach enables training diffusion models in one space but sampling in another, effectively decoupling the forward and reverse processes. Second, we introduce generalized sliced score matching, extending traditional sliced score matching from linear projections to arbitrary smooth transformations. This provides greater flexibility in high-dimensional density estimation. We demonstrate these theoretical advances through applications to diffusion on the probability simplex and empirically compare our generalized score matching approach against traditional sliced score matching methods.
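As a rough illustration of what a score change of variables looks like, the sketch below checks a one-dimensional case numerically. It relies only on the standard density identity $q(\mathbf{y}) = p(\mathbf{x}) / |\det J_\phi(\mathbf{x})|$, from which $\nabla_{\mathbf{y}} \log q(\mathbf{y}) = J_\phi(\mathbf{x})^{-\top} \left( \nabla_{\mathbf{x}} \log p(\mathbf{x}) - \nabla_{\mathbf{x}} \log |\det J_\phi(\mathbf{x})| \right)$ follows by the chain rule; the paper's own statement may use different notation or a more general form. The choice of $p = \mathcal{N}(0, 1)$ and $\phi = \exp$, and all function names, are assumptions made for this example.

```python
import math


def score_x_standard_normal(x: float) -> float:
    """Score of p = N(0, 1): d/dx log p(x) = -x."""
    return -x


def transformed_score(x: float) -> float:
    """Score of q at y = exp(x), computed from the score of p.

    In 1-D, log q(y) = log p(x) - log|phi'(x)|, so
    d/dy log q(y) = (d/dx log p(x) - d/dx log|phi'(x)|) / phi'(x).
    For phi = exp: phi'(x) = exp(x) and d/dx log|phi'(x)| = 1.
    """
    dphi = math.exp(x)   # Jacobian of phi at x
    dlogdet = 1.0        # derivative of log|phi'(x)| = x
    return (score_x_standard_normal(x) - dlogdet) / dphi


def lognormal_score(y: float) -> float:
    """Closed-form score of the standard log-normal density, for comparison."""
    return -(1.0 + math.log(y)) / y


# The two routes to the score of q should agree at y = exp(x).
for x in (-1.0, 0.0, 0.5, 2.0):
    assert abs(transformed_score(x) - lognormal_score(math.exp(x))) < 1e-12
```

The point of the check is that the score in the transformed space is obtained without ever differentiating $\log q$ directly, which is the property the abstract uses to reverse an SDE in one space with a score learned in another.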

arXiv information

Authors	Stephen Robbins
Published	2025-02-24 17:56:03+00:00
arXiv site	arxiv_id(pdf)

Categories: 68T01, cs.AI, cs.LG, I.2.6, math.PR

Low-Rank and Sparse Model Merging for Multi-Lingual Speech Recognition and Translation

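Posted: February 25, 2025 | Author: jarxiv

Summary

Language diversity poses a major challenge for speech-to-text (S2T) tasks such as automatic speech recognition and translation.
Traditional multi-task training approaches aim to address this by jointly optimizing multiple speech recognition and translation tasks across various languages.
Models such as Whisper, built on these strategies, show strong performance but still face high computational cost, language interference, suboptimal training configurations, and limited extensibility.
To overcome these challenges, we introduce LoRS-Merging (low-rank and sparse model merging), a new technique designed to efficiently integrate models trained on different languages or tasks while preserving performance and reducing computational overhead.
LoRS-Merging combines low-rank and sparse pruning to retain essential structures while eliminating redundant parameters, mitigating language and task interference, and improving extensibility.
Experimental results across a range of languages show that LoRS-Merging significantly outperforms conventional multi-lingual multi-task training baselines.
Our findings suggest that model merging, and LoRS-Merging in particular, is a scalable and effective complement to traditional multi-lingual training strategies for S2T applications.

Summary (original)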

Language diversity presents a significant challenge in speech-to-text (S2T) tasks, such as automatic speech recognition and translation. Traditional multi-task training approaches aim to address this by jointly optimizing multiple speech recognition and translation tasks across various languages. While models like Whisper, built on these strategies, demonstrate strong performance, they still face issues of high computational cost, language interference, suboptimal training configurations, and limited extensibility. To overcome these challenges, we introduce LoRS-Merging (low-rank and sparse model merging), a novel technique designed to efficiently integrate models trained on different languages or tasks while preserving performance and reducing computational overhead. LoRS-Merging combines low-rank and sparse pruning to retain essential structures while eliminating redundant parameters, mitigating language and task interference, and enhancing extensibility. Experimental results across a range of languages demonstrate that LoRS-Merging significantly outperforms conventional multi-lingual multi-task training baselines. Our findings suggest that model merging, particularly LoRS-Merging, is a scalable and effective complement to traditional multi-lingual training strategies for S2T applications.

arXiv information

Authors	Qiuming Zhao, Guangzhi Sun, Chao Zhang, Mingxing Xu, Thomas Fang Zheng
Published	2025-02-24 18:06:57+00:00
arXiv site	arxiv_id(pdf)

Categories: cs.AI, cs.CL, cs.SD, eess.AS

Big-Math: A Large-Scale, High-Quality Math Dataset for Reinforcement Learning in Language Models

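Posted: February 25, 2025 | Author: jarxiv

Summary

Growing interest in reasoning models has made mathematics a prominent testing ground for algorithmic and methodological improvements.
However, existing open math datasets contain either a small collection of high-quality, human-written problems or a large corpus of machine-generated problems of uncertain quality, forcing researchers to choose between quality and quantity.
In this work, we present Big-Math, a dataset of over 250,000 high-quality math questions with verifiable answers, purpose-built for reinforcement learning (RL).
To create Big-Math, we rigorously filter, clean, and curate openly available datasets, extracting questions that satisfy three desiderata: (1) problems with uniquely verifiable solutions, (2) problems that are open-ended, and (3) problems with a closed-form solution.
To ensure the quality of Big-Math, we manually verify each step of the filtering process.
Based on findings from the filtering process, we introduce Big-Math-Reformulated, 47,000 new questions with verified answers: closed-ended questions (i.e., multiple-choice questions) that have been reformulated as open-ended questions through a systematic reformulation algorithm.
Compared with GSM8k and MATH, the most commonly used existing open-source datasets for math reasoning, Big-Math is an order of magnitude larger, while rigorous filtering ensures that it retains the questions most suitable for RL.
We also provide a rigorous analysis of the dataset, finding that Big-Math has a high degree of diversity across problem domains and covers a wide range of problem difficulties, enabling a wide range of downstream uses for models of varying capabilities and training requirements.
By bridging the gap between data quality and quantity, Big-Math establishes a robust foundation for advancing reasoning in LLMs.

Summary (original)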

Increasing interest in reasoning models has led math to become a prominent testing ground for algorithmic and methodological improvements. However, existing open math datasets either contain a small collection of high-quality, human-written problems or a large corpus of machine-generated problems of uncertain quality, forcing researchers to choose between quality and quantity. In this work, we present Big-Math, a dataset of over 250,000 high-quality math questions with verifiable answers, purposefully made for reinforcement learning (RL). To create Big-Math, we rigorously filter, clean, and curate openly available datasets, extracting questions that satisfy our three desiderata: (1) problems with uniquely verifiable solutions, (2) problems that are open-ended, and (3) problems with a closed-form solution. To ensure the quality of Big-Math, we manually verify each step in our filtering process. Based on the findings from our filtering process, we introduce 47,000 new questions with verified answers, Big-Math-Reformulated: closed-ended questions (i.e. multiple choice questions) that have been reformulated as open-ended questions through a systematic reformulation algorithm. Compared to the most commonly used existing open-source datasets for math reasoning, GSM8k and MATH, Big-Math is an order of magnitude larger, while our rigorous filtering ensures that we maintain the questions most suitable for RL. We also provide a rigorous analysis of the dataset, finding that Big-Math contains a high degree of diversity across problem domains, and incorporates a wide range of problem difficulties, enabling a wide range of downstream uses for models of varying capabilities and training requirements. By bridging the gap between data quality and quantity, Big-Math establishes a robust foundation for advancing reasoning in LLMs.

arXiv information

Authors	Alon Albalak, Duy Phung, Nathan Lile, Rafael Rafailov, Kanishk Gandhi, Louis Castricato, Anikait Singh, Chase Blagden, Violet Xiang, Dakota Mahan, Nick Haber
Published	2025-02-24 18:14:01+00:00
arXiv site	arxiv_id(pdf)

Categories: cs.AI, cs.CL, cs.LG

Learning to Reason at the Frontier of Learnability

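Posted: February 25, 2025 | Author: jarxiv

Summary

Reinforcement learning is now widely adopted as the final stage of large language model training, especially for reasoning-style tasks such as math problems.
Typically, a model attempts each question many times during a single training step and learns from its successes and failures.
However, we show that throughout training with two popular algorithms (PPO and VinePPO) on two widely used datasets, many questions are either solved by all attempts, meaning they are already learned, or by none, providing no meaningful training signal.
To address this, we adapt a method from the reinforcement learning literature, sampling for learnability, and apply it to the reinforcement learning stage of LLM training.
Our curriculum prioritizes questions with high variance of success, i.e., those where the agent sometimes succeeds but not always.
Our findings show that this curriculum consistently improves training performance across multiple algorithms and datasets, paving the way for more efficient and effective reinforcement learning with LLMs.

Summary (original)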

Reinforcement learning is now widely adopted as the final stage of large language model training, especially for reasoning-style tasks such as maths problems. Typically, models attempt each question many times during a single training step and attempt to learn from their successes and failures. However, we demonstrate that throughout training with two popular algorithms (PPO and VinePPO) on two widely used datasets, many questions are either solved by all attempts – meaning they are already learned – or by none – providing no meaningful training signal. To address this, we adapt a method from the reinforcement learning literature – sampling for learnability – and apply it to the reinforcement learning stage of LLM training. Our curriculum prioritises questions with high variance of success, i.e. those where the agent sometimes succeeds, but not always. Our findings demonstrate that this curriculum consistently boosts training performance across multiple algorithms and datasets, paving the way for more efficient and effective reinforcement learning with LLMs.
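The idea of prioritising questions with high variance of success can be sketched in a few lines. The abstract does not give the exact scoring or sampling rule, so the sketch assumes the simplest reading: each attempt is a Bernoulli trial with success rate p, its variance p(1 - p) serves as the learnability score, and the top-k questions are kept. The function names and the toy attempt data are hypothetical.

```python
from typing import Dict, List


def learnability(successes: List[bool]) -> float:
    """Bernoulli variance p * (1 - p) of the observed success rate.

    It is 0 when every attempt succeeds or every attempt fails,
    and largest when the success rate is 0.5.
    """
    p = sum(successes) / len(successes)
    return p * (1.0 - p)


def select_questions(attempts: Dict[str, List[bool]], k: int) -> List[str]:
    """Return the k question ids with the highest learnability score."""
    ranked = sorted(attempts, key=lambda q: learnability(attempts[q]), reverse=True)
    return ranked[:k]


# Toy rollout results: question id -> outcome of each attempt (assumed data).
attempts = {
    "q_easy": [True, True, True, True],      # already learned, score 0
    "q_hard": [False, False, False, False],  # never solved, score 0
    "q_mid": [True, False, True, False],     # p = 0.5
    "q_rare": [False, False, False, True],   # p = 0.25
}

chosen = select_questions(attempts, k=2)
print(chosen)
```

Questions solved by all attempts or by none receive a score of zero, which matches the abstract's observation that such questions carry no useful training signal. A real curriculum might sample in proportion to the score rather than take a hard top-k.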

arXiv information

Authors	Thomas Foster, Jakob Foerster
Published	2025-02-24 18:15:02+00:00
arXiv site	arxiv_id(pdf)

Categories: cs.AI, cs.CL, cs.LG

The Empirical Impact of Reducing Symmetries on the Performance of Deep Ensembles and MoE

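Posted: February 25, 2025 | Author: jarxiv

Summary

Recent studies have shown that reducing symmetries in neural networks improves linear mode connectivity between networks without requiring parameter-space alignment, which in turn improves the performance of linearly interpolated neural networks.
In practical applications, however, neural network interpolation is rarely used.
Ensembles of networks are more common instead.
In this paper, we empirically investigate the impact of reducing symmetries on the performance of deep ensembles and Mixture of Experts (MoE) across five datasets.
In addition, to explore deeper linear mode connectivity, we introduce the Mixture of Interpolated Experts (MoIE).
Our results show that deep ensembles built on asymmetric neural networks achieve significantly better performance than their symmetric counterparts as ensemble size increases.
In contrast, our experiments do not provide conclusive evidence on whether reducing symmetries affects MoE and MoIE architectures.

Summary (original)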

Recent studies have shown that reducing symmetries in neural networks enhances linear mode connectivity between networks without requiring parameter space alignment, leading to improved performance in linearly interpolated neural networks. However, in practical applications, neural network interpolation is rarely used; instead, ensembles of networks are more common. In this paper, we empirically investigate the impact of reducing symmetries on the performance of deep ensembles and Mixture of Experts (MoE) across five datasets. Additionally, to explore deeper linear mode connectivity, we introduce the Mixture of Interpolated Experts (MoIE). Our results show that deep ensembles built on asymmetric neural networks achieve significantly better performance as ensemble size increases compared to their symmetric counterparts. In contrast, our experiments do not provide conclusive evidence on whether reducing symmetries affects both MoE and MoIE architectures.
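To make the distinction between interpolating networks and ensembling them concrete, here is a toy sketch, not the paper's experimental setup. It assumes a tiny two-hidden-unit network and two parameter sets that differ only by a permutation of the hidden units, which is one of the symmetries the abstract refers to. All names and values are illustrative.

```python
import math
from typing import List, Tuple

Params = List[Tuple[float, float]]  # one (input weight, output weight) per hidden unit


def two_unit_net(params: Params, x: float) -> float:
    """f(x) = sum_i v_i * tanh(w_i * x) over hidden units (w_i, v_i)."""
    return sum(v * math.tanh(w * x) for w, v in params)


def interpolate(p: Params, q: Params, alpha: float = 0.5) -> Params:
    """Linear interpolation in parameter space, unit by unit."""
    return [((1 - alpha) * w1 + alpha * w2, (1 - alpha) * v1 + alpha * v2)
            for (w1, v1), (w2, v2) in zip(p, q)]


def ensemble(nets: List[Params], x: float) -> float:
    """Average the outputs of several networks."""
    return sum(two_unit_net(n, x) for n in nets) / len(nets)


net_a = [(1.0, 2.0), (3.0, -1.0)]
net_b = [(3.0, -1.0), (1.0, 2.0)]  # net_a with its hidden units swapped

x = 0.5
same_function = two_unit_net(net_a, x)                  # equals two_unit_net(net_b, x)
ensembled = ensemble([net_a, net_b], x)                 # output averaging
midpoint = two_unit_net(interpolate(net_a, net_b), x)   # weight averaging

print(same_function, ensembled, midpoint)
```

The two parameter sets compute the same function, so averaging their outputs leaves it unchanged, while averaging their weights collapses both hidden units to the same values and yields a different function. This is only meant to show why permutation symmetry matters for interpolation; how symmetry reduction affects ensembles is the empirical question the paper studies.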

arXiv information

Authors	Andrei Chernov, Oleg Novitskij
Published	2025-02-24 18:16:23+00:00
arXiv site	arxiv_id(pdf)

Categories: cs.AI, cs.LG

Emoti-Attack: Zero-Perturbation Adversarial Attacks on NLP Systems via Emoji Sequences

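Posted: February 25, 2025 | Author: jarxiv

Summary

Deep neural networks (DNNs) have achieved remarkable success in natural language processing (NLP), leading to widely recognized applications such as ChatGPT.
However, the vulnerability of these models to adversarial attacks remains a significant concern.
Unlike continuous domains such as images, text exists in a discrete space, so even minor alterations at the sentence, word, or character level are easily perceptible to humans.
This inherent discreteness also complicates the use of conventional optimization techniques, because text is non-differentiable.
Previous research on textual adversarial attacks has focused on character-level, word-level, sentence-level, and multi-level approaches, all of which suffer from inefficiency or perceptibility issues because they need multiple queries or cause significant semantic shifts.
In this work, we introduce Emoji-Attack, a new adversarial attack method that manipulates emojis to create subtle yet effective perturbations.
Unlike character- and word-level strategies, Emoji-Attack targets emojis as a distinct layer of attack, producing less noticeable changes with minimal disruption to the text.
This approach has been largely unexplored in previous research, which typically treats emoji insertion as an extension of character-level attacks.
Our experiments show that Emoji-Attack achieves strong attack performance on both large and small models, making it a promising technique for enhancing adversarial robustness in NLP systems.

Summary (original)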

Deep neural networks (DNNs) have achieved remarkable success in the field of natural language processing (NLP), leading to widely recognized applications such as ChatGPT. However, the vulnerability of these models to adversarial attacks remains a significant concern. Unlike continuous domains like images, text exists in a discrete space, making even minor alterations at the sentence, word, or character level easily perceptible to humans. This inherent discreteness also complicates the use of conventional optimization techniques, as text is non-differentiable. Previous research on adversarial attacks in text has focused on character-level, word-level, sentence-level, and multi-level approaches, all of which suffer from inefficiency or perceptibility issues due to the need for multiple queries or significant semantic shifts. In this work, we introduce a novel adversarial attack method, Emoji-Attack, which leverages the manipulation of emojis to create subtle, yet effective, perturbations. Unlike character- and word-level strategies, Emoji-Attack targets emojis as a distinct layer of attack, resulting in less noticeable changes with minimal disruption to the text. This approach has been largely unexplored in previous research, which typically focuses on emoji insertion as an extension of character-level attacks. Our experiments demonstrate that Emoji-Attack achieves strong attack performance on both large and small models, making it a promising technique for enhancing adversarial robustness in NLP systems.

arXiv information

Authors	Yangshijie Zhang
Published	2025-02-24 18:20:18+00:00
arXiv site	arxiv_id(pdf)

Categories: cs.AI, cs.CL, cs.CR

FIG: Forward-Inverse Generation for Low-Resource Domain-specific Event Detection

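Posted: February 25, 2025 | Author: jarxiv

Summary

Event Detection (ED) is the task of identifying typed event mentions of interest in natural language text, which benefits domain-specific reasoning in the biomedical, legal, and epidemiological domains.
However, procuring supervised data for thousands of events across various domains is laborious and expensive.
To this end, existing work has explored synthetic data generation via forward generation (generating labels for unlabeled sentences) and inverse generation (generating sentences from generated labels).
However, forward generation often produces noisy labels, while inverse generation struggles with domain drift and incomplete event annotations.
To address these challenges, we introduce FIG, a hybrid approach that uses inverse generation for high-quality data synthesis while anchoring it to domain-specific cues extracted via forward generation on unlabeled target data.
FIG further improves its synthetic data by adding missing annotations through forward generation-based refinement.
Experiments on three ED datasets from diverse domains show that FIG outperforms the best baseline, with average gains of 3.3% F1 and 5.4% F1 in the zero-shot and few-shot settings, respectively.
Analysis of the generated trigger hit rate and human evaluation substantiates FIG's superior domain alignment and data quality compared with existing baselines.

Summary (original)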

Event Detection (ED) is the task of identifying typed event mentions of interest from natural language text, which benefits domain-specific reasoning in biomedical, legal, and epidemiological domains. However, procuring supervised data for thousands of events for various domains is a laborious and expensive task. To this end, existing works have explored synthetic data generation via forward (generating labels for unlabeled sentences) and inverse (generating sentences from generated labels) generations. However, forward generation often produces noisy labels, while inverse generation struggles with domain drift and incomplete event annotations. To address these challenges, we introduce FIG, a hybrid approach that leverages inverse generation for high-quality data synthesis while anchoring it to domain-specific cues extracted via forward generation on unlabeled target data. FIG further enhances its synthetic data by adding missing annotations through forward generation-based refinement. Experimentation on three ED datasets from diverse domains reveals that FIG outperforms the best baseline achieving average gains of 3.3% F1 and 5.4% F1 in the zero-shot and few-shot settings respectively. Analyzing the generated trigger hit rate and human evaluation substantiates FIG’s superior domain alignment and data quality compared to existing baselines.

arXiv information

Authors	Tanmay Parekh, Yuxuan Dong, Lucas Bandarkar, Artin Kim, I-Hung Hsu, Kai-Wei Chang, Nanyun Peng
Published	2025-02-24 18:20:42+00:00
arXiv site	arxiv_id(pdf)

Categories: cs.AI, cs.CL

Large Language Models are Powerful EHR Encoders

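Posted: February 25, 2025 | Author: jarxiv

Summary

Electronic Health Records (EHRs) offer rich potential for clinical prediction, but their inherent complexity and heterogeneity pose significant challenges for traditional machine learning approaches.
Domain-specific EHR foundation models trained on large collections of unlabeled EHR data have shown promising improvements in predictive accuracy and generalization.
However, their training is constrained by limited access to diverse, high-quality datasets and by inconsistencies in coding standards and healthcare practices.
In this study, we explore the possibility of using embedding methods based on general-purpose Large Language Models (LLMs) as EHR encoders.
By serializing patient records into structured Markdown text and transforming codes into human-readable descriptors, we leverage the extensive generalization capabilities of LLMs pretrained on vast public corpora, bypassing the need for proprietary medical datasets.
We systematically evaluate two state-of-the-art LLM embedding models, GTE-Qwen2-7B-Instruct and LLM2Vec-Llama3.1-8B-Instruct, on 15 diverse clinical prediction tasks from the EHRSHOT benchmark, comparing their performance with an EHR-specific foundation model, CLIMBR-T-Base, and with traditional machine learning baselines.
Our results show that LLM-based embeddings frequently match or exceed the performance of specialized models, even in few-shot settings, and that their effectiveness scales with the size of the underlying LLM and the available context window.
Overall, our findings show that repurposing LLMs for EHR encoding offers a scalable and effective approach to clinical prediction, capable of overcoming the limitations of traditional EHR modeling and facilitating more interoperable and generalizable healthcare applications.

Summary (original)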

Electronic Health Records (EHRs) offer rich potential for clinical prediction, yet their inherent complexity and heterogeneity pose significant challenges for traditional machine learning approaches. Domain-specific EHR foundation models trained on large collections of unlabeled EHR data have demonstrated promising improvements in predictive accuracy and generalization; however, their training is constrained by limited access to diverse, high-quality datasets and inconsistencies in coding standards and healthcare practices. In this study, we explore the possibility of using general-purpose Large Language Models (LLMs) based embedding methods as EHR encoders. By serializing patient records into structured Markdown text, transforming codes into human-readable descriptors, we leverage the extensive generalization capabilities of LLMs pretrained on vast public corpora, thereby bypassing the need for proprietary medical datasets. We systematically evaluate two state-of-the-art LLM-embedding models, GTE-Qwen2-7B-Instruct and LLM2Vec-Llama3.1-8B-Instruct, across 15 diverse clinical prediction tasks from the EHRSHOT benchmark, comparing their performance to an EHR-specific foundation model, CLIMBR-T-Base, and traditional machine learning baselines. Our results demonstrate that LLM-based embeddings frequently match or exceed the performance of specialized models, even in few-shot settings, and that their effectiveness scales with the size of the underlying LLM and the available context window. Overall, our findings demonstrate that repurposing LLMs for EHR encoding offers a scalable and effective approach for clinical prediction, capable of overcoming the limitations of traditional EHR modeling and facilitating more interoperable and generalizable healthcare applications.
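The serialization step described above might look roughly like the following sketch. The record layout, the function name, and the small code-to-descriptor table are all assumptions made for illustration; the abstract does not specify the actual Markdown template or code vocabulary used in the paper.

```python
from typing import Dict, List

# Illustrative lookup table (assumed); a real system would use a full terminology.
CODE_DESCRIPTIONS: Dict[str, str] = {
    "ICD10/I10": "Essential (primary) hypertension",
    "ICD10/E11.9": "Type 2 diabetes mellitus without complications",
}


def serialize_patient(record: dict) -> str:
    """Render a patient record as Markdown, replacing codes with descriptors.

    Codes missing from the lookup table are kept verbatim rather than dropped.
    """
    lines: List[str] = [f"# Patient {record['patient_id']}", ""]
    for visit in record["visits"]:
        lines.append(f"## Visit {visit['date']}")
        for code in visit["codes"]:
            lines.append(f"- {CODE_DESCRIPTIONS.get(code, code)}")
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


# Hypothetical record used only to exercise the function.
record = {
    "patient_id": "p001",
    "visits": [
        {"date": "2024-01-15", "codes": ["ICD10/I10"]},
        {"date": "2024-03-02", "codes": ["ICD10/E11.9", "LOCAL/XYZ"]},
    ],
}

text = serialize_patient(record)
print(text)
```

The resulting text would then be passed to an embedding model, and the embedding used as input features for a downstream classifier. That step is omitted here because it depends on the specific model and library.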

arXiv information

Authors	Stefan Hegselmann, Georg von Arnim, Tillmann Rheude, Noel Kronenberg, David Sontag, Gerhard Hindricks, Roland Eils, Benjamin Wild
Published	2025-02-24 18:30:36+00:00
arXiv site	arxiv_id(pdf)

Categories: cs.AI, cs.CL, cs.LG

Reasoning with Latent Thoughts: On the Power of Looped Transformers

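Posted: February 25, 2025 | Author: jarxiv

Summary

Large language models have shown remarkable reasoning abilities, and scaling laws suggest that a large parameter count, especially along the depth axis, is the primary driver.
In this work, we make a stronger claim: many reasoning problems require large depth but not necessarily many parameters.
This unlocks a novel application of looped models for reasoning.
First, we show that for many synthetic reasoning problems such as addition, $p$-hop induction, and math problems, a $k$-layer transformer looped $L$ times nearly matches the performance of a $kL$-layer non-looped model and is significantly better than a $k$-layer model.
This is further corroborated by theoretical results showing that many such reasoning problems can be solved via iterative algorithms and can therefore be solved effectively by looped models with nearly optimal depth.
Perhaps surprisingly, these benefits also carry over to practical language modeling settings: on many downstream reasoning tasks, a language model with $k$ layers looped $L$ times can be competitive with, if not better than, a $kL$-layer language model.
In fact, our empirical analysis reveals an intriguing phenomenon: looped and non-looped models exhibit scaling behavior that depends on their effective depth, akin to the inference-time scaling of chain-of-thought (CoT) reasoning.
We further elucidate the connection to CoT reasoning by proving that looped models implicitly generate latent thoughts and can simulate $T$ steps of CoT with $T$ loops.
Inspired by these findings, we also present an interesting dichotomy between reasoning and memorization, and design a looping-based regularization that is effective on both fronts.

Summary (original)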

Large language models have shown remarkable reasoning abilities and scaling laws suggest that large parameter count, especially along the depth axis, is the primary driver. In this work, we make a stronger claim — many reasoning problems require a large depth but not necessarily many parameters. This unlocks a novel application of looped models for reasoning. Firstly, we show that for many synthetic reasoning problems like addition, $p$-hop induction, and math problems, a $k$-layer transformer looped $L$ times nearly matches the performance of a $kL$-layer non-looped model, and is significantly better than a $k$-layer model. This is further corroborated by theoretical results showing that many such reasoning problems can be solved via iterative algorithms, and thus, can be solved effectively using looped models with nearly optimal depth. Perhaps surprisingly, these benefits also translate to practical settings of language modeling — on many downstream reasoning tasks, a language model with $k$-layers looped $L$ times can be competitive to, if not better than, a $kL$-layer language model. In fact, our empirical analysis reveals an intriguing phenomenon: looped and non-looped models exhibit scaling behavior that depends on their effective depth, akin to the inference-time scaling of chain-of-thought (CoT) reasoning. We further elucidate the connection to CoT reasoning by proving that looped models implicitly generate latent thoughts and can simulate $T$ steps of CoT with $T$ loops. Inspired by these findings, we also present an interesting dichotomy between reasoning and memorization, and design a looping-based regularization that is effective on both fronts.
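The distinction between depth and parameter count can be illustrated with a toy sketch. Here a "layer" is just a scalar function standing in for a transformer layer, which is a deliberate simplification and an assumption of this example, not the paper's architecture. Looping a block of $k$ layers $L$ times gives an effective depth of $kL$ while storing only $k$ layers' worth of parameters.

```python
import math
from typing import Callable, List

Layer = Callable[[float], float]


def make_layer(weight: float, bias: float) -> Layer:
    """Toy stand-in for a layer: h -> tanh(weight * h + bias)."""
    return lambda h: math.tanh(weight * h + bias)


def run_stacked(layers: List[Layer], h: float) -> float:
    """Apply each layer once, in order (an ordinary non-looped model)."""
    for layer in layers:
        h = layer(h)
    return h


def run_looped(block: List[Layer], loops: int, h: float) -> float:
    """Apply the same k-layer block `loops` times, reusing its parameters."""
    for _ in range(loops):
        h = run_stacked(block, h)
    return h


k, L = 2, 3
block = [make_layer(0.8, 0.1), make_layer(-1.2, 0.3)]  # k = 2 distinct layers

looped_out = run_looped(block, L, h=0.3)
# A looped model is equivalent to a kL-layer stack whose weights are tied.
tied_out = run_stacked(block * L, h=0.3)

effective_depth = k * L      # layers applied per forward pass
distinct_layers = len(block)  # layers whose parameters are actually stored
print(looped_out, tied_out, effective_depth, distinct_layers)
```

An untied $kL$-layer model would instead hold $kL$ independent layers. The abstract's claim is that, for many reasoning problems, the tied version nearly matches it despite having a factor of $L$ fewer parameters.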

arXiv information

Authors	Nikunj Saunshi, Nishanth Dikkala, Zhiyuan Li, Sanjiv Kumar, Sashank J. Reddi
Published	2025-02-24 18:49:05+00:00
arXiv site	arxiv_id(pdf)

Categories: cs.AI, cs.CL, cs.LG
