jarxiv | Japanese arxiv | Page 397

UrbanMind: Urban Dynamics Prediction with Multifaceted Spatial-Temporal Large Language Models

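Posted: May 22, 2025 | Author: jarxiv

Summary

Understanding and predicting urban dynamics is essential for managing transportation systems, optimizing urban planning, and enhancing public services.
Neural network-based approaches have been successful, but they often depend on task-specific architectures and large volumes of data, which limits their ability to generalize across diverse urban scenarios.
Large language models (LLMs), meanwhile, offer strong reasoning and generalization capabilities, yet their application to spatial-temporal urban dynamics remains underexplored.
Existing LLM-based methods struggle to integrate multifaceted spatial-temporal data effectively and fail to address distribution shifts between training and test data, which limits their predictive reliability in real-world applications.
To bridge this gap, we propose UrbanMind, a novel spatial-temporal LLM framework for multifaceted urban dynamics prediction that ensures both accurate forecasting and robust generalization.
At its core, UrbanMind introduces Muffin-MAE, a multifaceted fusion masked autoencoder with specialized masking strategies that capture intricate spatial-temporal dependencies and the intercorrelations among multifaceted urban dynamics.
In addition, we design a semantic-aware prompting and fine-tuning strategy that encodes spatial-temporal contextual details into prompts, strengthening the ability of LLMs to reason over spatial-temporal patterns.
To further improve generalization, we introduce a test-time adaptation mechanism with a test data reconstructor, which lets UrbanMind adjust dynamically to unseen test data by reconstructing LLM-generated embeddings.
Extensive experiments on real-world urban datasets from multiple cities show that UrbanMind consistently outperforms state-of-the-art baselines, achieving high accuracy and robust generalization even in zero-shot settings.

Abstract (original)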

Understanding and predicting urban dynamics is crucial for managing transportation systems, optimizing urban planning, and enhancing public services. While neural network-based approaches have achieved success, they often rely on task-specific architectures and large volumes of data, limiting their ability to generalize across diverse urban scenarios. Meanwhile, Large Language Models (LLMs) offer strong reasoning and generalization capabilities, yet their application to spatial-temporal urban dynamics remains underexplored. Existing LLM-based methods struggle to effectively integrate multifaceted spatial-temporal data and fail to address distributional shifts between training and testing data, limiting their predictive reliability in real-world applications. To bridge this gap, we propose UrbanMind, a novel spatial-temporal LLM framework for multifaceted urban dynamics prediction that ensures both accurate forecasting and robust generalization. At its core, UrbanMind introduces Muffin-MAE, a multifaceted fusion masked autoencoder with specialized masking strategies that capture intricate spatial-temporal dependencies and intercorrelations among multifaceted urban dynamics. Additionally, we design a semantic-aware prompting and fine-tuning strategy that encodes spatial-temporal contextual details into prompts, enhancing LLMs’ ability to reason over spatial-temporal patterns. To further improve generalization, we introduce a test time adaptation mechanism with a test data reconstructor, enabling UrbanMind to dynamically adjust to unseen test data by reconstructing LLM-generated embeddings. Extensive experiments on real-world urban datasets across multiple cities demonstrate that UrbanMind consistently outperforms state-of-the-art baselines, achieving high accuracy and robust generalization, even in zero-shot settings.

arXiv information

Authors	Yuhang Liu, Yingxue Zhang, Xin Zhang, Ling Tian, Yanhua Li, Jun Luo
Published	2025-05-21 16:56:05+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google

Categories: cs.LG

Projection-Based Correction for Enhancing Deep Inverse Networks

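Posted: May 22, 2025 | Author: jarxiv

Summary

Deep learning-based models have shown remarkable success in solving ill-posed inverse problems.
However, many fail to strictly adhere to the physical constraints imposed by the measurement process.
In this work, we introduce a projection-based correction method that enhances the inference of deep inverse networks by ensuring consistency with the forward model.
Specifically, given an initial estimate from a learned reconstruction network, we apply a projection step that constrains the solution to lie within the valid solution space of the inverse problem.
We theoretically demonstrate that when the recovery model is a well-trained deep inverse network, the solution can be decomposed into range-space and null-space components, in which case the projection-based correction reduces to an identity transformation.
Extensive simulations and experiments validate the proposed method, showing improved reconstruction accuracy across diverse inverse problems and deep network architectures.

Abstract (original)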

Deep learning-based models have demonstrated remarkable success in solving ill-posed inverse problems; however, many fail to strictly adhere to the physical constraints imposed by the measurement process. In this work, we introduce a projection-based correction method to enhance the inference of deep inverse networks by ensuring consistency with the forward model. Specifically, given an initial estimate from a learned reconstruction network, we apply a projection step that constrains the solution to lie within the valid solution space of the inverse problem. We theoretically demonstrate that if the recovery model is a well-trained deep inverse network, the solution can be decomposed into range-space and null-space components, where the projection-based correction reduces to an identity transformation. Extensive simulations and experiments validate the proposed method, demonstrating improved reconstruction accuracy across diverse inverse problems and deep network architectures.

arXiv information

Authors	Jorge Bacca
Published	2025-05-21 17:28:14+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google

Categories: cs.LG, physics.comp-ph
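
Editor's sketch for the entry above. The abstract describes a projection step that forces an initial network estimate to agree with the forward model. One common way to write such a step, for a linear forward model y = A x, is the orthogonal projection onto the set {x : A x = y}. The code below is a minimal illustration under stated assumptions and is not taken from the paper: the operator A is assumed to have orthonormal rows (so that A A^T = I and no matrix inverse is needed), and the names `matvec`, `rmatvec`, and `project_onto_measurements` are hypothetical.

```python
def matvec(A, x):
    """Compute A x for a matrix stored as a list of rows."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]


def rmatvec(A, r):
    """Compute A^T r for a matrix stored as a list of rows."""
    n_cols = len(A[0])
    return [sum(A[i][j] * r[i] for i in range(len(A))) for j in range(n_cols)]


def project_onto_measurements(A, y, x0):
    """Orthogonal projection of x0 onto {x : A x = y}.

    Assumption: the rows of A are orthonormal (A A^T = I), so the
    projection is x0 - A^T (A x0 - y). Only the range-space component of
    x0 changes; the null-space component is left untouched.
    """
    residual = [ax - yi for ax, yi in zip(matvec(A, x0), y)]
    correction = rmatvec(A, residual)
    return [xi - ci for xi, ci in zip(x0, correction)]


# Toy forward model: observe only the first two of four unknowns.
A = [[1.0, 0.0, 0.0, 0.0],
     [0.0, 1.0, 0.0, 0.0]]
x_true = [1.0, 2.0, 3.0, 4.0]
y = matvec(A, x_true)

# Stand-in for the output of a learned reconstruction network.
x0 = [0.5, 2.5, 2.9, 4.2]
x_corrected = project_onto_measurements(A, y, x0)

# Projecting an already consistent estimate changes nothing.
x_again = project_onto_measurements(A, y, x_corrected)
```

After the correction the estimate reproduces the measurements exactly, while the unobserved coordinates keep the values proposed by the network. The second call shows the identity behaviour the abstract mentions for an estimate that is already measurement-consistent.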

Solving General-Utility Markov Decision Processes in the Single-Trial Regime with Online Planning

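Posted: May 22, 2025 | Author: jarxiv

Summary

In this work, we contribute the first approach to solving infinite-horizon discounted general-utility Markov decision processes (GUMDPs) in the single-trial regime, that is, when the agent's performance is evaluated on a single trajectory.
First, we provide fundamental results on policy optimization in the single-trial regime: we investigate which class of policies suffices for optimality, cast the problem as a particular MDP that is equivalent to the original problem, and study the computational hardness of policy optimization in this regime.
Second, we show how online planning techniques, in particular a Monte-Carlo tree search algorithm, can be leveraged to solve GUMDPs in the single-trial regime.
Third, we provide experimental results showing the superior performance of our approach compared with relevant baselines.

Abstract (original)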

In this work, we contribute the first approach to solve infinite-horizon discounted general-utility Markov decision processes (GUMDPs) in the single-trial regime, i.e., when the agent’s performance is evaluated based on a single trajectory. First, we provide some fundamental results regarding policy optimization in the single-trial regime, investigating which class of policies suffices for optimality, casting our problem as a particular MDP that is equivalent to our original problem, as well as studying the computational hardness of policy optimization in the single-trial regime. Second, we show how we can leverage online planning techniques, in particular a Monte-Carlo tree search algorithm, to solve GUMDPs in the single-trial regime. Third, we provide experimental results showcasing the superior performance of our approach in comparison to relevant baselines.

arXiv information

Authors	Pedro P. Santos, Alberto Sardinha, Francisco S. Melo
Published	2025-05-21 17:32:23+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google

Categories: cs.LG
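
Editor's sketch for the entry above. In a general-utility MDP the objective is a function of the discounted state occupancy rather than a sum of rewards, and the single-trial regime scores the occupancy of one trajectory. The toy example below illustrates why that matters for a nonlinear utility. It is a sketch under assumptions, not the paper's method: entropy is chosen as the utility purely for illustration, the infinite horizon is truncated and renormalized, and the function names are hypothetical.

```python
import math


def discounted_occupancy(trajectory, n_states, gamma):
    """Empirical discounted state occupancy of ONE finite trajectory.

    Assumption: the infinite horizon is truncated at len(trajectory) and
    the weights gamma**t are renormalized so the result sums to 1.
    """
    weights = [gamma ** t for t in range(len(trajectory))]
    total = sum(weights)
    occupancy = [0.0] * n_states
    for state, weight in zip(trajectory, weights):
        occupancy[state] += weight / total
    return occupancy


def entropy(occupancy):
    """Example utility: Shannon entropy of an occupancy vector."""
    return -sum(p * math.log(p) for p in occupancy if p > 0.0)


gamma = 0.9
# A policy that flips a fair coin once and then stays in state 0 or state 1.
traj_a = [0] * 50
traj_b = [1] * 50
d_a = discounted_occupancy(traj_a, 2, gamma)
d_b = discounted_occupancy(traj_b, 2, gamma)

# Single-trial view: average the utility of each individual trajectory.
single_trial_value = 0.5 * entropy(d_a) + 0.5 * entropy(d_b)

# Expected-occupancy view: apply the utility to the averaged occupancy.
d_mean = [0.5 * (pa + pb) for pa, pb in zip(d_a, d_b)]
expected_occupancy_value = entropy(d_mean)
```

Each individual trajectory visits a single state, so its entropy is essentially zero, whereas the averaged occupancy is uniform and has entropy log 2. The gap between the two numbers is the reason a policy evaluated on one trajectory has to be optimized differently.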

A Generative Diffusion Model to Solve Inverse Problems for Robust in-NICU Neonatal MRI

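Posted: May 22, 2025 | Author: jarxiv

Summary

We present the first acquisition-agnostic diffusion generative model for magnetic resonance imaging (MRI) in the neonatal intensive care unit (NICU), which solves a range of inverse problems to shorten scan time and improve motion robustness.
In-NICU MRI scanners use permanent magnets at lower field strengths (i.e., below 1.5 Tesla) for non-invasive assessment of potential brain abnormalities during the critical phase of early life development, but they suffer from long scan times and motion artifacts.
In this setting, training datasets are small and inherently suffer from a low signal-to-noise ratio (SNR).
This work trains a diffusion probabilistic generative model on such a real-world dataset of clinical neonatal MRI, applying several novel signal processing and machine learning methods to handle the low SNR and the small quantity of data.
The model is then used as a statistical image prior to solve various inverse problems at inference time without any retraining.
Experiments demonstrate the generative model's utility for three real-world applications of neonatal MRI: accelerated reconstruction, motion correction, and super-resolution.

Abstract (original)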

We present the first acquisition-agnostic diffusion generative model for Magnetic Resonance Imaging (MRI) in the neonatal intensive care unit (NICU) to solve a range of inverse problems for shortening scan time and improving motion robustness. In-NICU MRI scanners leverage permanent magnets at lower field-strengths (i.e., below 1.5 Tesla) for non-invasive assessment of potential brain abnormalities during the critical phase of early life development, but suffer from long scan times and motion artifacts. In this setting, training data sizes are small and intrinsically suffer from low signal-to-noise ratio (SNR). This work trains a diffusion probabilistic generative model using such a real-world training dataset of clinical neonatal MRI by applying several novel signal processing and machine learning methods to handle the low SNR and low quantity of data. The model is then used as a statistical image prior to solve various inverse problems at inference time without requiring any retraining. Experiments demonstrate the generative model’s utility for three real-world applications of neonatal MRI: accelerated reconstruction, motion correction, and super-resolution.

arXiv information

Authors	Yamin Arefeen, Brett Levac, Jonathan I. Tamir
Published	2025-05-21 17:36:11+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google

Categories: cs.LG, eess.IV, physics.med-ph

Fair Supervised Learning Through Constraints on Smooth Nonconvex Unfairness-Measure Surrogates

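Posted: May 22, 2025 | Author: jarxiv

Summary

A new strategy for fair supervised machine learning is proposed.
The main advantages of the proposed strategy over others in the literature are as follows.
(a) We introduce a new smooth nonconvex surrogate to approximate the Heaviside functions involved in discontinuous unfairness measures.
The surrogate is based on smoothing methods from the optimization literature and is new to the fair supervised learning literature.
The surrogate is a tight approximation that ensures the trained prediction models are fair, in contrast to other (e.g., convex) surrogates that can fail to yield a fair prediction model in practice.
(b) Rather than relying on regularizers (which lead to optimization problems that are difficult to solve) and corresponding regularization parameters (which can be expensive to tune), we propose a strategy that employs hard constraints, so that specific tolerances for unfairness can be enforced without the complications associated with regularization.
(c) Our proposed strategy readily allows constraints on multiple (potentially conflicting) unfairness measures at the same time.
Multiple measures can be considered with a regularization approach, but at the cost of even more difficult optimization problems and further tuning expense.
By contrast, through hard constraints, our strategy leads to optimization models that can be solved tractably with minimal tuning.

Abstract (original)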

A new strategy for fair supervised machine learning is proposed. The main advantages of the proposed strategy as compared to others in the literature are as follows. (a) We introduce a new smooth nonconvex surrogate to approximate the Heaviside functions involved in discontinuous unfairness measures. The surrogate is based on smoothing methods from the optimization literature, and is new for the fair supervised learning literature. The surrogate is a tight approximation which ensures the trained prediction models are fair, as opposed to other (e.g., convex) surrogates that can fail to lead to a fair prediction model in practice. (b) Rather than rely on regularizers (that lead to optimization problems that are difficult to solve) and corresponding regularization parameters (that can be expensive to tune), we propose a strategy that employs hard constraints so that specific tolerances for unfairness can be enforced without the complications associated with the use of regularization. (c) Our proposed strategy readily allows for constraints on multiple (potentially conflicting) unfairness measures at the same time. Multiple measures can be considered with a regularization approach, but at the cost of having even more difficult optimization problems to solve and further expense for tuning. By contrast, through hard constraints, our strategy leads to optimization models that can be solved tractably with minimal tuning.

arXiv information

Authors	Zahra Khatti, Daniel P. Robinson, Frank E. Curtis
Published	2025-05-21 17:41:06+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google

Categories: cs.LG, math.OC
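
Editor's sketch for the entry above. Many unfairness measures count how often a score crosses a threshold, which involves the discontinuous Heaviside step. The abstract says the authors replace that step with a tight smooth surrogate and then bound the measure with a hard constraint. The example below only illustrates the general idea. The abstract does not give the surrogate's formula, so a logistic function with smoothing parameter `mu` is used here as a stand-in, the demographic-parity gap is chosen as an example measure, and the data, tolerance, and function names are all hypothetical.

```python
import math


def heaviside(z):
    """Exact step: 1 for a positive score, 0 otherwise."""
    return 1.0 if z > 0 else 0.0


def smooth_step(z, mu):
    """Logistic stand-in for a smooth Heaviside surrogate (an assumption).

    Smaller mu gives a tighter approximation of the exact step. The two
    branches avoid overflow in math.exp for large |z / mu|.
    """
    t = z / mu
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def parity_gap(scores, groups, step):
    """|rate of positives in group 0 - rate of positives in group 1|."""
    rates = []
    for g in (0, 1):
        values = [step(s) for s, gi in zip(scores, groups) if gi == g]
        rates.append(sum(values) / len(values))
    return abs(rates[0] - rates[1])


# Hypothetical classifier scores and binary group labels.
scores = [0.8, 0.3, -0.2, 0.5, -0.6, -0.1]
groups = [0, 0, 0, 1, 1, 1]

exact_gap = parity_gap(scores, groups, heaviside)
loose_gap = parity_gap(scores, groups, lambda z: smooth_step(z, 1.0))
tight_gap = parity_gap(scores, groups, lambda z: smooth_step(z, 0.01))

# A hard constraint bounds the surrogate by a chosen tolerance.
tolerance = 0.1
constraint_satisfied = tight_gap <= tolerance
```

With `mu = 1.0` the surrogate badly underestimates the true gap, which is the failure mode the abstract attributes to loose surrogates. With `mu = 0.01` it tracks the exact value closely, so a constraint on the surrogate is a meaningful constraint on the measure itself.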

HCRMP: A LLM-Hinted Contextual Reinforcement Learning Framework for Autonomous Driving

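Posted: May 22, 2025 | Author: jarxiv

Summary

Integrating large language models (LLMs) with reinforcement learning (RL) can enhance autonomous driving (AD) performance in complex scenarios.
However, current LLM-dominated RL methods over-rely on LLM outputs, which are prone to hallucinations. Evaluations show that a state-of-the-art LLM achieves a non-hallucination rate of only about 57.95% on essential driving-related tasks.
In these methods, hallucinations from the LLM can therefore directly jeopardize the performance of driving policies.
This paper argues that maintaining relative independence between the LLM and the RL agent is vital for solving the hallucination problem.
Consequently, the paper is devoted to proposing a novel LLM-hinted RL paradigm.
The LLM generates semantic hints for state augmentation and policy optimization to assist the RL agent in motion planning, while the RL agent counteracts potentially erroneous semantic indications through policy learning to achieve excellent driving performance.
Based on this paradigm, we propose the HCRMP (LLM-Hinted Contextual Reinforcement Learning Motion Planner) architecture, which includes an Augmented Semantic Representation Module to extend the state space.
A Contextual Stability Anchor Module enhances the reliability of multi-critic weight hints by using information from the knowledge base.
A Semantic Cache Module is employed to seamlessly integrate low-frequency LLM guidance with high-frequency RL control.
Extensive experiments in CARLA validate HCRMP's strong overall driving performance.
HCRMP achieves a task success rate of up to 80.3% under diverse driving conditions with different traffic densities.
Under safety-critical driving conditions, HCRMP reduces the collision rate by 11.4%, effectively improving driving performance in complex scenarios.

Abstract (original)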

Integrating Large Language Models (LLMs) with Reinforcement Learning (RL) can enhance autonomous driving (AD) performance in complex scenarios. However, current LLM-Dominated RL methods over-rely on LLM outputs, which are prone to hallucinations. Evaluations show that a state-of-the-art LLM achieves a non-hallucination rate of only approximately 57.95% when assessed on essential driving-related tasks. Thus, in these methods, hallucinations from the LLM can directly jeopardize the performance of driving policies. This paper argues that maintaining relative independence between the LLM and the RL is vital for solving the hallucination problem. Consequently, this paper is devoted to proposing a novel LLM-Hinted RL paradigm. The LLM is used to generate semantic hints for state augmentation and policy optimization to assist the RL agent in motion planning, while the RL agent counteracts potential erroneous semantic indications through policy learning to achieve excellent driving performance. Based on this paradigm, we propose the HCRMP (LLM-Hinted Contextual Reinforcement Learning Motion Planner) architecture, which includes an Augmented Semantic Representation Module to extend the state space. A Contextual Stability Anchor Module enhances the reliability of multi-critic weight hints by utilizing information from the knowledge base. A Semantic Cache Module is employed to seamlessly integrate LLM low-frequency guidance with RL high-frequency control. Extensive experiments in CARLA validate HCRMP’s strong overall driving performance. HCRMP achieves a task success rate of up to 80.3% under diverse driving conditions with different traffic densities. Under safety-critical driving conditions, HCRMP significantly reduces the collision rate by 11.4%, which effectively improves the driving performance in complex scenarios.

arXiv information

Authors	Zhiwen Chen, Bo Leng, Zhuoren Li, Hanming Deng, Guizhe Jin, Ran Yu, Huanxi Wen
Published	2025-05-21 17:47:24+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google

Categories: cs.LG, cs.RO
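
Editor's sketch for the entry above. The abstract mentions a Semantic Cache Module that combines low-frequency LLM guidance with high-frequency RL control, without describing its internals. One simple way to bridge two such rates is to query the slow component every few control steps and reuse its latest answer in between. The code below is only an illustration of that rate-bridging idea. The class `SemanticCache`, the `refresh_every` parameter, the observation format, and the stub `fake_llm` are all hypothetical and are not the paper's design.

```python
class SemanticCache:
    """Reuse the most recent hint until it is `refresh_every` steps old."""

    def __init__(self, query_llm, refresh_every):
        self.query_llm = query_llm          # slow call, e.g. an LLM request
        self.refresh_every = refresh_every  # control steps between queries
        self._hint = None
        self._last_query_step = None

    def get_hint(self, step, observation):
        stale = (self._last_query_step is None
                 or step - self._last_query_step >= self.refresh_every)
        if stale:
            self._hint = self.query_llm(observation)
            self._last_query_step = step
        return self._hint


llm_calls = []


def fake_llm(observation):
    """Stub standing in for an LLM: returns a coarse driving hint."""
    llm_calls.append(observation)
    return {"hint": "slow_down" if observation["gap_m"] < 10.0 else "keep_lane"}


cache = SemanticCache(fake_llm, refresh_every=5)
hints = []
for step in range(10):
    # Hypothetical observation: the gap to the lead vehicle shrinks over time.
    observation = {"gap_m": 20.0 - 3.0 * step}
    # A real RL policy would receive the hint as part of its input state.
    hints.append(cache.get_hint(step, observation)["hint"])
```

Ten control steps trigger only two stub queries. The cached hint is already out of date at step 4, where the gap has dropped below 10 m but the hint still says to keep the lane. That staleness is one reason a design like the one in the abstract lets the RL agent counteract hints through policy learning instead of obeying them blindly.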

Model Merging is Secretly Certifiable: Non-Vacuous Generalisation Bounds for Low-Shot Learning

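Posted: May 22, 2025 | Author: jarxiv

Summary

Certifying the IID generalisation ability of deep networks is the first of many requirements for trusting AI in high-stakes applications from medicine to security.
However, when instantiating generalisation bounds for deep networks, it remains challenging to obtain non-vacuous guarantees, especially when applying contemporary large models to the small-scale data prevalent in such high-stakes fields.
In this paper, we draw a novel connection between a family of learning methods based on model fusion and generalisation certificates, and show, surprisingly, that with minor adjustment several existing learning strategies already provide non-trivial generalisation guarantees.
Essentially, by focusing on data-driven learning of downstream tasks through fusion rather than fine-tuning, the certified generalisation gap becomes tiny and independent of the base network size, which facilitates certification.
Our results show, for the first time, non-trivial generalisation guarantees for learning with as few as 100 examples while using vision models such as ViT-B and language models such as Mistral-7B.
This observation is significant because it has immediate implications for certifying existing systems as trustworthy and opens new directions for research at the intersection of practice and theory.

Abstract (original)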

Certifying the IID generalisation ability of deep networks is the first of many requirements for trusting AI in high-stakes applications from medicine to security. However, when instantiating generalisation bounds for deep networks it remains challenging to obtain non-vacuous guarantees, especially when applying contemporary large models on the small-scale data prevalent in such high-stakes fields. In this paper, we draw a novel connection between a family of learning methods based on model fusion and generalisation certificates, and surprisingly show that with minor adjustment several existing learning strategies already provide non-trivial generalisation guarantees. Essentially, by focusing on data-driven learning of downstream tasks by fusion rather than fine-tuning, the certified generalisation gap becomes tiny and independent of the base network size, facilitating its certification. Our results show for the first time non-trivial generalisation guarantees for learning with as few as 100 examples, while using vision models such as ViT-B and language models such as Mistral-7B. This observation is significant as it has immediate implications for facilitating the certification of existing systems as trustworthy, and opens up new directions for research at the intersection of practice and theory.

arXiv information

Authors	Taehoon Kim, Henry Gouk, Minyoung Kim, Timothy Hospedales
Published	2025-05-21 17:51:05+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google

Categories: cs.LG

A Deep Learning Framework for Two-Dimensional, Multi-Frequency Propagation Factor Estimation

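Posted: May 22, 2025 | Author: jarxiv

Summary

Accurately estimating the refractive environment over multiple frequencies within the marine atmospheric boundary layer is crucial for the effective deployment of radar technologies.
Traditional parabolic equation simulations, while effective, can be computationally expensive and time-intensive, which limits their practical application.
This communication explores a novel approach that uses deep neural networks to estimate the pattern propagation factor, a critical parameter for characterizing environmental impacts on signal propagation.
Image-to-image translation generators were developed that ingest modified refractivity data and generate predictions of pattern propagation factors over the same domain.
The findings show that deep neural networks can be trained to analyze multiple frequencies and reasonably predict the pattern propagation factor, offering an alternative to traditional methods.

Abstract (original)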

Accurately estimating the refractive environment over multiple frequencies within the marine atmospheric boundary layer is crucial for the effective deployment of radar technologies. Traditional parabolic equation simulations, while effective, can be computationally expensive and time-intensive, limiting their practical application. This communication explores a novel approach using deep neural networks to estimate the pattern propagation factor, a critical parameter for characterizing environmental impacts on signal propagation. Image-to-image translation generators designed to ingest modified refractivity data and generate predictions of pattern propagation factors over the same domain were developed. Findings demonstrate that deep neural networks can be trained to analyze multiple frequencies and reasonably predict the pattern propagation factor, offering an alternative to traditional methods.

arXiv information

Authors	Sarah E. Wessinger, Leslie N. Smith, Jacob Gull, Jonathan Gehman, Zachary Beever, Andrew J. Kammerer
Published	2025-05-21 17:56:02+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google

Categories: cs.LG, eess.SP, physics.ao-ph

Adaptive Estimation and Learning under Temporal Distribution Shift

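Posted: May 22, 2025 | Author: jarxiv

Summary

In this paper, we study the problem of estimation and learning under temporal distribution shift.
Consider an observation sequence of length $n$, which is a noisy realization of a time-varying ground-truth sequence.
Our focus is to develop methods that estimate the ground truth at the final time step while providing sharp point-wise estimation error rates.
We show that, without prior knowledge of the level of temporal shift, a wavelet soft-thresholding estimator provides an optimal estimation error bound for the ground truth.
The proposed estimation method generalizes the existing work of Mazzetto and Upfal (2023) by establishing a connection between the sequence's non-stationarity level and the sparsity in the wavelet-transformed domain.
The theoretical findings are validated by numerical experiments.
In addition, we apply the estimator to derive sparsity-aware excess risk bounds for binary classification under distribution shift and to develop computationally efficient training objectives.
As a final contribution, we draw parallels between our results and the classical signal processing problem of total-variation denoising (Mammen and van de Geer, 1997; Tibshirani, 2014), uncovering novel optimal algorithms for such tasks.

Abstract (original)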

In this paper, we study the problem of estimation and learning under temporal distribution shift. Consider an observation sequence of length $n$, which is a noisy realization of a time-varying groundtruth sequence. Our focus is to develop methods to estimate the groundtruth at the final time-step while providing sharp point-wise estimation error rates. We show that, without prior knowledge on the level of temporal shift, a wavelet soft-thresholding estimator provides an optimal estimation error bound for the groundtruth. Our proposed estimation method generalizes the existing work of Mazzetto and Upfal (2023) by establishing a connection between the sequence’s non-stationarity level and the sparsity in the wavelet-transformed domain. Our theoretical findings are validated by numerical experiments. Additionally, we apply the estimator to derive sparsity-aware excess risk bounds for binary classification under distribution shift and to develop computationally efficient training objectives. As a final contribution, we draw parallels between our results and the classical signal processing problem of total-variation denoising (Mammen and van de Geer, 1997; Tibshirani, 2014), uncovering novel optimal algorithms for such tasks.

arXiv information

Authors	Dheeraj Baby, Yifei Tang, Hieu Duy Nguyen, Yu-Xiang Wang, Rohit Pyati
Published	2025-05-21 17:56:07+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google

Categories: cs.LG
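
Editor's sketch for the entry above. The abstract relies on a wavelet soft-thresholding estimator to recover the ground truth at the final time step of a drifting sequence. The code below shows the generic textbook version of that idea, not the paper's exact estimator: it uses an orthonormal Haar transform, assumes the sequence length is a power of two, and assumes a known noise level `sigma` with the universal threshold sigma * sqrt(2 log n). All function names are hypothetical.

```python
import math

SQRT2 = math.sqrt(2.0)


def haar_forward(x):
    """Orthonormal Haar transform: returns (coarsest approx, detail levels)."""
    approx = list(x)
    details = []  # finest level first
    while len(approx) > 1:
        pairs = range(0, len(approx), 2)
        details.append([(approx[i] - approx[i + 1]) / SQRT2 for i in pairs])
        approx = [(approx[i] + approx[i + 1]) / SQRT2 for i in pairs]
    return approx, details


def haar_inverse(approx, details):
    """Invert haar_forward."""
    signal = list(approx)
    for level in reversed(details):
        rebuilt = []
        for a, d in zip(signal, level):
            rebuilt.append((a + d) / SQRT2)
            rebuilt.append((a - d) / SQRT2)
        signal = rebuilt
    return signal


def soft_threshold(c, lam):
    """Shrink a coefficient toward zero by lam, zeroing small ones."""
    return math.copysign(max(abs(c) - lam, 0.0), c)


def estimate_final_value(y, sigma):
    """Denoise y by soft-thresholding Haar details, return the last entry."""
    n = len(y)
    if n < 2 or n & (n - 1):
        raise ValueError("length must be a power of two")
    lam = sigma * math.sqrt(2.0 * math.log(n))  # universal threshold (assumed)
    approx, details = haar_forward(y)
    shrunk = [[soft_threshold(c, lam) for c in level] for level in details]
    return haar_inverse(approx, shrunk)[-1]


# Nearly stationary sequence: every detail is below the threshold, so the
# estimate falls back to the average of all observations.
steady = [1.0, 1.2, 0.8, 1.0, 1.1, 0.9, 1.0, 1.0]
steady_estimate = estimate_final_value(steady, sigma=1.0)

# Sequence with a level shift: the large coarse detail survives shrinkage,
# so the estimate follows the recent level rather than the global mean.
shifted = [0.0, 0.0, 0.0, 0.0, 4.0, 4.0, 4.0, 4.0]
shifted_estimate = estimate_final_value(shifted, sigma=0.5)
global_mean = sum(shifted) / len(shifted)
```

The two toy sequences show the adaptivity the abstract describes. When the sequence barely moves, thresholding removes all detail coefficients and the estimator pools every observation. When the sequence shifts, the few large coefficients that encode the shift are kept, and the estimate stays close to the most recent level without any prior knowledge of how much shift occurred.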

On the creation of narrow AI: hierarchy and nonlocality of neural network skills

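Posted: May 22, 2025 | Author: jarxiv

Summary

We study the problem of creating strong, yet narrow, AI systems.
While recent AI progress has been driven by the training of large general-purpose foundation models, creating smaller models specialized for narrow domains could be valuable for both efficiency and safety.
In this work, we explore two challenges involved in creating such systems, both related to basic properties of how neural networks learn and structure their representations.
The first challenge concerns when it is possible to train narrow models from scratch.
Through experiments on a synthetic task, we find that it is sometimes necessary to train networks on a wide distribution of data in order to learn certain narrow skills within that distribution.
This effect arises when skills depend on each other hierarchically, and training on a broad distribution introduces a curriculum that substantially accelerates learning.
The second challenge concerns how to transfer particular skills from large general models into small specialized models.
We find that model skills are often not perfectly localized to a particular set of prunable components.
However, we find that methods based on pruning can still outperform distillation.
We investigate the use of a regularization objective to align desired skills with prunable components while unlearning unnecessary skills.

Abstract (original)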

We study the problem of creating strong, yet narrow, AI systems. While recent AI progress has been driven by the training of large general-purpose foundation models, the creation of smaller models specialized for narrow domains could be valuable for both efficiency and safety. In this work, we explore two challenges involved in creating such systems, having to do with basic properties of how neural networks learn and structure their representations. The first challenge regards when it is possible to train narrow models from scratch. Through experiments on a synthetic task, we find that it is sometimes necessary to train networks on a wide distribution of data to learn certain narrow skills within that distribution. This effect arises when skills depend on each other hierarchically, and training on a broad distribution introduces a curriculum which substantially accelerates learning. The second challenge regards how to transfer particular skills from large general models into small specialized models. We find that model skills are often not perfectly localized to a particular set of prunable components. However, we find that methods based on pruning can still outperform distillation. We investigate the use of a regularization objective to align desired skills with prunable components while unlearning unnecessary skills.

arXiv information

Authors	Eric J. Michaud, Asher Parker-Sartori, Max Tegmark
Published	2025-05-21 17:59:21+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google

Categories: cs.LG
