jarxiv | Japanese arxiv | Page 594

Disjunctive Branch-And-Bound for Certifiably Optimal Low-Rank Matrix Completion

Posted: May 8, 2025 | Author: jarxiv

Abstract

Low-rank matrix completion consists of computing a matrix of minimal complexity that recovers a given set of observations as accurately as possible. Unfortunately, existing methods for matrix completion are heuristics that, while highly scalable and often identifying high-quality solutions, do not possess any optimality guarantees. We reexamine matrix completion with an optimality-oriented eye. We reformulate low-rank matrix completion problems as convex problems over the non-convex set of projection matrices and implement a disjunctive branch-and-bound scheme that solves them to certifiable optimality. Further, we derive a novel and often near-exact class of convex relaxations by decomposing a low-rank matrix as a sum of rank-one matrices and incentivizing that two-by-two minors in each rank-one matrix have determinant zero. In numerical experiments, our new convex relaxations decrease the optimality gap by two orders of magnitude compared to existing attempts, and our disjunctive branch-and-bound scheme solves $n \times m$ rank-$r$ matrix completion problems to certifiable optimality or near optimality in hours for $\max \{m, n\} \leq 2500$ and $r \leq 5$. Moreover, this improvement in the training error translates into an average $2\%$–$50\%$ improvement in the test set error.
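The rank-one property mentioned in the abstract can be checked numerically. If X is the outer product of vectors u and v, then every two-by-two minor X[i][j] * X[k][l] - X[i][l] * X[k][j] equals u_i u_k v_j v_l - u_i u_k v_l v_j = 0. The Python sketch below only verifies this property on small examples; it is not the authors' relaxation or branch-and-bound scheme, and the helper names `outer` and `max_abs_minor` are hypothetical.

```python
from itertools import combinations


def outer(u, v):
    """Rank-one matrix u v^T as a list of rows."""
    return [[ui * vj for vj in v] for ui in u]


def max_abs_minor(X):
    """Largest absolute determinant over all 2x2 minors of X."""
    n, m = len(X), len(X[0])
    worst = 0.0
    for i, k in combinations(range(n), 2):
        for j, l in combinations(range(m), 2):
            det = X[i][j] * X[k][l] - X[i][l] * X[k][j]
            worst = max(worst, abs(det))
    return worst


# A rank-one matrix: all 2x2 minors vanish.
rank_one = outer([1.0, 2.0, -3.0], [4.0, 0.5, 2.0])

# Adding a second rank-one term generally breaks the property.
second = outer([1.0, 0.0, 1.0], [0.0, 1.0, 1.0])
rank_two = [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(rank_one, second)]

print(max_abs_minor(rank_one))
print(max_abs_minor(rank_two))
```

The first value is zero (up to floating-point rounding) and the second is strictly positive, which is why penalizing nonzero minors in each rank-one component is a natural way to encourage the decomposition described above.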

arXiv information

Authors: Dimitris Bertsimas, Ryan Cory-Wright, Sean Lo, Jean Pauphilet
Published: 2025-05-07 15:40:10+00:00
arXiv page: arxiv_id(pdf)

Source and services used

arxiv.jp, Google

Categories: cs.LG, math.OC, stat.ML

Communication-Efficient Federated Fine-Tuning of Language Models via Dynamic Update Schedules

Posted: May 8, 2025 | Author: jarxiv

Abstract

Federated learning (FL) makes it possible to train models on data that would otherwise remain untapped and inaccessible. Simultaneously, pre-trained language models (LMs) have emerged as indispensable tools in modern workflows. These models exhibit extraordinary capabilities and are easily adapted to downstream tasks. This opens one of the most exciting frontiers in FL: fine-tuning LMs. However, a persistent challenge in FL is the frequent, rigid communication of parameters, a problem which is magnified by the sheer size of these modern models. Currently, the FedOpt family of algorithms is the prevailing approach in FL, though it relies on fixed, heuristic intervals for model synchronization. Recently, the FDA algorithm introduced a dynamic alternative by monitoring training progress, but it came with its own drawbacks; namely, a hard-to-tune threshold parameter and a rigid synchronization scheme. In this work, we introduce the FDA-Opt family of algorithms — a unified generalization that extends the principles behind both FDA and FedOpt, while resolving their core limitations. We evaluate our approach on fine-tuning LMs across a range of downstream NLP tasks, and demonstrate that it consistently outperforms FedOpt — even when FDA-Opt operates under hyper-parameter settings originally optimized for its competitors. In other words, we show that FDA-Opt is a practical, drop-in replacement for FedOpt in modern FL libraries and systems: it requires no additional configuration and delivers superior performance out of the box.

arXiv information

Authors: Michail Theologitis, Vasilis Samoladas, Antonios Deligiannakis
Published: 2025-05-07 16:13:21+00:00
arXiv page: arxiv_id(pdf)

Categories: cs.LG

ABKD: Pursuing a Proper Allocation of the Probability Mass in Knowledge Distillation via $α$-$β$-Divergence

Posted: May 8, 2025 | Author: jarxiv

Abstract

Knowledge Distillation (KD) transfers knowledge from a large teacher model to a smaller student model by minimizing the divergence between their output distributions, typically using forward Kullback-Leibler divergence (FKLD) or reverse KLD (RKLD). It has become an effective training paradigm due to the broader supervision information provided by the teacher distribution compared to one-hot labels. We identify that the core challenge in KD lies in balancing two mode-concentration effects: the Hardness-Concentration effect, which refers to focusing on modes with large errors, and the Confidence-Concentration effect, which refers to focusing on modes with high student confidence. Through an analysis of how probabilities are reassigned during gradient updates, we observe that these two effects are entangled in FKLD and RKLD, but in extreme forms. Specifically, both are too weak in FKLD, causing the student to fail to concentrate on the target class. In contrast, both are too strong in RKLD, causing the student to overly emphasize the target class while ignoring the broader distributional information from the teacher. To address this imbalance, we propose ABKD, a generic framework with $\alpha$-$\beta$-divergence. Our theoretical results show that ABKD offers a smooth interpolation between FKLD and RKLD, achieving an effective trade-off between these effects. Extensive experiments on 17 language/vision datasets with 12 teacher-student settings confirm its efficacy. The code is available at https://github.com/ghwang-s/abkd.
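For readers unfamiliar with the two divergences the abstract contrasts, the Python sketch below computes forward KL, KL(teacher || student), and reverse KL, KL(student || teacher), for a toy pair of distributions. It does not implement the $\alpha$-$\beta$-divergence or ABKD itself; the probability values and the helper name `kl` are illustrative assumptions.

```python
import math


def kl(p, q):
    """KL(p || q) for discrete distributions with strictly positive q."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Toy output distributions over three classes (assumed values).
teacher = [0.7, 0.2, 0.1]
student = [0.4, 0.3, 0.3]

fkld = kl(teacher, student)  # forward KL: expectation under the teacher
rkld = kl(student, teacher)  # reverse KL: expectation under the student

print(f"FKLD = {fkld:.4f}, RKLD = {rkld:.4f}")
```

The two values differ because KL divergence is asymmetric: the forward form weights each class by the teacher's probability, the reverse form by the student's, which is the starting point for the different concentration behaviour discussed in the abstract.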

arXiv information

Authors: Guanghui Wang, Zhiyong Yang, Zitai Wang, Shi Wang, Qianqian Xu, Qingming Huang
Published: 2025-05-07 16:48:49+00:00
arXiv page: arxiv_id(pdf)

Categories: cs.LG

Smooth Quadratic Prediction Markets

Posted: May 8, 2025 | Author: jarxiv

Abstract

When agents trade in a Duality-based Cost Function prediction market, they collectively implement the learning algorithm Follow-The-Regularized-Leader. We ask whether other learning algorithms could be used to inspire the design of prediction markets. By decomposing and modifying the Duality-based Cost Function Market Maker’s (DCFMM) pricing mechanism, we propose a new prediction market, called the Smooth Quadratic Prediction Market, that incentivizes agents to collectively implement general steepest gradient descent. Relative to the DCFMM, the Smooth Quadratic Prediction Market has a better worst-case monetary loss for AD securities while preserving axiom guarantees such as the existence of instantaneous price, information incorporation, expressiveness, no arbitrage, and a form of incentive compatibility. To motivate the application of the Smooth Quadratic Prediction Market, we independently examine agents’ trading behavior under two realistic constraints: bounded budgets and buy-only securities. Finally, we provide an introductory analysis of an approach to facilitate adaptive liquidity using the Smooth Quadratic Prediction Market. Our results suggest future designs where the price update rule is separate from the fee structure, yet guarantees are preserved.

arXiv information

Authors: Enrique Nueve, Bo Waggoner
Published: 2025-05-07 16:53:25+00:00
arXiv page: arxiv_id(pdf)

Categories: cs.GT, cs.LG

Contaminated Multivariate Time-Series Anomaly Detection with Spatio-Temporal Graph Conditional Diffusion Models

Posted: May 8, 2025 | Author: jarxiv

Abstract

Mainstream unsupervised anomaly detection algorithms often excel in academic datasets, yet their real-world performance is restricted due to the controlled experimental conditions involving clean training data. Addressing the challenge of training with noise, a prevalent issue in practical anomaly detection, is frequently overlooked. In a pioneering endeavor, this study delves into the realm of label-level noise within sensory time-series anomaly detection (TSAD). This paper presents a novel and practical end-to-end unsupervised TSAD method for the case when the training data is contaminated with anomalies. The introduced approach, called TSAD-C, has no access to abnormality labels during the training phase. TSAD-C encompasses three core modules: a Decontaminator to rectify anomalies (aka noise) present during training, a Long-range Variable Dependency Modeling module to capture long-term intra- and inter-variable dependencies within the decontaminated data, which is treated as a surrogate of the pure normal data, and an Anomaly Scoring module to detect anomalies of all types. Our extensive experiments conducted on four reliable and diverse datasets conclusively demonstrate that TSAD-C surpasses existing methodologies, thus establishing a new state-of-the-art in the TSAD field.

arXiv information

Authors: Thi Kieu Khanh Ho, Narges Armanfard
Published: 2025-05-07 16:56:51+00:00
arXiv page: arxiv_id(pdf)

Categories: cs.LG, eess.SP

Multitask LSTM for Arboviral Outbreak Prediction Using Public Health Data

Posted: May 8, 2025 | Author: jarxiv

Abstract

This paper presents a multitask learning approach based on long short-term memory (LSTM) networks for the joint prediction of arboviral outbreaks and case counts of dengue, chikungunya, and Zika in Recife, Brazil. Leveraging historical public health data from DataSUS (2017-2023), the proposed model concurrently performs binary classification (outbreak detection) and regression (case forecasting) tasks. A sliding window strategy was adopted to construct temporal features using varying input lengths (60, 90, and 120 days), with hyperparameter optimization carried out using Keras Tuner. Model evaluation used time series cross-validation for robustness and a held-out test set from 2023 for generalization assessment. The results show that longer windows improve dengue regression accuracy, while classification performance peaked at intermediate windows, suggesting an optimal trade-off between sequence length and generalization. The multitask architecture delivers competitive performance across diseases and tasks, demonstrating the feasibility and advantages of unified modeling strategies for scalable epidemic forecasting in data-limited public health scenarios.
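The sliding window strategy mentioned in the abstract can be sketched as follows. This is a minimal illustration in plain Python, not the authors' pipeline: the function names `sliding_windows` and `outbreak_labels`, the toy series, the one-step horizon, and the fixed outbreak threshold are all assumptions, since the abstract does not state how outbreaks are labelled.

```python
def sliding_windows(series, window, horizon=1):
    """Split a series into (input window, future value) pairs.

    Each input holds `window` consecutive observations; the target is the
    value `horizon` steps after the end of that window.
    """
    inputs, targets = [], []
    for start in range(len(series) - window - horizon + 1):
        inputs.append(series[start:start + window])
        targets.append(series[start + window + horizon - 1])
    return inputs, targets


def outbreak_labels(targets, threshold):
    """Binary labels for the classification task (assumed rule: count > threshold)."""
    return [int(value > threshold) for value in targets]


# Toy daily case counts (made-up numbers, for illustration only).
series = [3, 5, 8, 13, 21, 34, 55]
X, y = sliding_windows(series, window=3)
labels = outbreak_labels(y, threshold=20)

for window_values, target, label in zip(X, y, labels):
    print(window_values, "->", target, "outbreak" if label else "no outbreak")
```

A multitask model would consume each window once and predict both the regression target `y` and the binary label, which is why the same windows serve both tasks. In the setting of the abstract, `window` would be 60, 90, or 120 days.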

arXiv information

Authors: Lucas R. C. Farias, Talita P. Silva, Pedro H. M. Araujo
Published: 2025-05-07 16:58:18+00:00
arXiv page: arxiv_id(pdf)

Categories: cs.LG

Federated Generalised Variational Inference: A Robust Probabilistic Federated Learning Framework

Posted: May 8, 2025 | Author: jarxiv

Abstract

We introduce FedGVI, a probabilistic Federated Learning (FL) framework that is robust to both prior and likelihood misspecification. FedGVI addresses limitations in both frequentist and Bayesian FL by providing unbiased predictions under model misspecification, with calibrated uncertainty quantification. Our approach generalises previous FL approaches, specifically Partitioned Variational Inference (Ashman et al., 2022), by allowing robust and conjugate updates, decreasing computational complexity at the clients. We offer theoretical analysis in terms of fixed-point convergence, optimality of the cavity distribution, and provable robustness to likelihood misspecification. Further, we empirically demonstrate the effectiveness of FedGVI in terms of improved robustness and predictive performance on multiple synthetic and real world classification data sets.

arXiv information

Authors: Terje Mildner, Oliver Hamelijnck, Paris Giampouras, Theodoros Damoulas
Published: 2025-05-07 17:06:46+00:00
arXiv page: arxiv_id(pdf)

Categories: cs.LG, stat.ML

Hard-Negative Sampling for Contrastive Learning: Optimal Representation Geometry and Neural- vs Dimensional-Collapse

Posted: May 8, 2025 | Author: jarxiv

Abstract

For a widely-studied data model and general loss and sample-hardening functions we prove that the losses of Supervised Contrastive Learning (SCL), Hard-SCL (HSCL), and Unsupervised Contrastive Learning (UCL) are minimized by representations that exhibit Neural-Collapse (NC), i.e., the class means form an Equiangular Tight Frame (ETF) and data from the same class are mapped to the same representation. We also prove that for any representation mapping, the HSCL and Hard-UCL (HUCL) losses are lower bounded by the corresponding SCL and UCL losses. In contrast to existing literature, our theoretical results for SCL do not require class-conditional independence of augmented views and work for a general loss function class that includes the widely used InfoNCE loss function. Moreover, our proofs are simpler, compact, and transparent. Similar to existing literature, our theoretical claims also hold for the practical scenario where batching is used for optimization. We empirically demonstrate, for the first time, that Adam optimization (with batching) of HSCL and HUCL losses with random initialization and suitable hardness levels can indeed converge to the NC-geometry if we incorporate unit-ball or unit-sphere feature normalization. Without incorporating hard-negatives or feature normalization, however, the representations learned via Adam suffer from Dimensional-Collapse (DC) and fail to attain the NC-geometry. These results exemplify the role of hard-negative sampling in contrastive representation learning and we conclude with several open theoretical problems for future work. The code can be found at https://github.com/rjiang03/HCL/tree/main
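As a concrete picture of the Neural-Collapse geometry described above, the Python sketch below builds a simplex Equiangular Tight Frame, one standard example of an ETF, for K classes and checks that the class-mean vectors have unit norm and identical pairwise inner products of -1/(K-1). The construction sqrt(K/(K-1)) * (I - (1/K) 1 1^T) is a textbook one and is not taken from the paper's code; the helper names are hypothetical.

```python
import math


def simplex_etf(k):
    """Rows are k unit vectors in R^k with pairwise inner product -1/(k-1)."""
    scale = math.sqrt(k / (k - 1))
    return [
        [scale * ((1.0 if i == j else 0.0) - 1.0 / k) for j in range(k)]
        for i in range(k)
    ]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


K = 4
means = simplex_etf(K)

# Under Neural-Collapse, every sample of class c maps to means[c].
norms = [math.sqrt(dot(m, m)) for m in means]
cosines = [dot(means[i], means[j]) for i in range(K) for j in range(i + 1, K)]

print(norms)
print(cosines)
```

All norms are 1 and all pairwise inner products equal -1/3 for K = 4 (up to rounding), so the class means are maximally and equally separated. Dimensional-Collapse, by contrast, would show up as representations concentrating in a lower-dimensional subspace.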

arXiv information

Authors: Ruijie Jiang, Thuan Nguyen, Shuchin Aeron, Prakash Ishwar
Published: 2025-05-07 17:12:02+00:00
arXiv page: arxiv_id(pdf)

Categories: cs.LG

TransAxx: Efficient Transformers with Approximate Computing

Posted: May 8, 2025 | Author: jarxiv

Abstract

Vision Transformer (ViT) models, recently introduced on the basis of the transformer architecture, have been shown to be very competitive and have often become a popular alternative to Convolutional Neural Networks (CNNs). However, the high computational requirements of these models limit their practical applicability, especially on low-power devices. The current state of the art employs approximate multipliers to address the highly increased compute demands of DNN accelerators, but no prior research has explored their use on ViT models. In this work we propose TransAxx, a framework based on the popular PyTorch library that enables fast inherent support for approximate arithmetic to seamlessly evaluate the impact of approximate computing on DNNs such as ViT models. Using TransAxx, we analyze the sensitivity of transformer models on the ImageNet dataset to approximate multiplications and perform approximate-aware finetuning to regain accuracy. Furthermore, we propose a methodology to generate approximate accelerators for ViT models. Our approach uses a Monte Carlo Tree Search (MCTS) algorithm to efficiently search the space of possible configurations using a hardware-driven hand-crafted policy. Our evaluation demonstrates the efficacy of our methodology in achieving significant trade-offs between accuracy and power, resulting in substantial gains without compromising on performance.
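To give a feel for what an approximate multiplier trades away, the Python sketch below emulates one simple scheme: zeroing the lowest bits of each unsigned operand before multiplying, and measuring the resulting mean relative error over all nonzero 8-bit operand pairs. This operand-truncation scheme is chosen purely for illustration; it is an assumption, not one of the multipliers evaluated with TransAxx, and `truncated_multiply` and `mean_relative_error` are hypothetical names rather than part of the framework's API.

```python
def truncated_multiply(a, b, drop_bits):
    """Multiply two non-negative ints after clearing their lowest `drop_bits` bits."""
    mask = ~((1 << drop_bits) - 1)
    return (a & mask) * (b & mask)


def mean_relative_error(drop_bits, bits=8):
    """Average relative error over all nonzero operand pairs of the given width."""
    total, count = 0.0, 0
    for a in range(1, 1 << bits):
        for b in range(1, 1 << bits):
            exact = a * b
            total += (exact - truncated_multiply(a, b, drop_bits)) / exact
            count += 1
    return total / count


for drop in (0, 1, 2, 4):
    print(drop, round(mean_relative_error(drop), 4))
```

Dropping more bits means less arithmetic per product and a larger error, which is the accuracy-versus-power trade-off the abstract refers to. A sensitivity analysis of the kind described would swap such a multiplier into a model's layers and measure the change in accuracy.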

arXiv information

Authors: Dimitrios Danopoulos, Georgios Zervakis, Dimitrios Soudris, Jörg Henkel
Published: 2025-05-07 17:13:53+00:00
arXiv page: arxiv_id(pdf)

Categories: cs.AR, cs.LG

Implicitly Aligning Humans and Autonomous Agents through Shared Task Abstractions

Posted: May 8, 2025 | Author: jarxiv

Abstract

In collaborative tasks, autonomous agents fall short of humans in their capability to quickly adapt to new and unfamiliar teammates. We posit that a limiting factor for zero-shot coordination is the lack of shared task abstractions, a mechanism humans rely on to implicitly align with teammates. To address this gap, we introduce HA$^2$: Hierarchical Ad Hoc Agents, a framework leveraging hierarchical reinforcement learning to mimic the structured approach humans use in collaboration. We evaluate HA$^2$ in the Overcooked environment, demonstrating statistically significant improvement over existing baselines when paired with both unseen agents and humans, providing better resilience to environmental shifts, and outperforming all state-of-the-art methods.

arXiv information

Authors: Stéphane Aroca-Ouellette, Miguel Aroca-Ouellette, Katharina von der Wense, Alessandro Roncone
Published: 2025-05-07 17:19:17+00:00
arXiv page: arxiv_id(pdf)

Categories: cs.LG, cs.MA
