jarxiv | Japanese arxiv | Page 1853

More Experts Than Galaxies: Conditionally-overlapping Experts With Biologically-Inspired Fixed Routing

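Posted: February 6, 2025 | Author: jarxiv

Summary

The evolution of biological neural systems has given rise to both modularity and sparse coding, which enable energy efficiency and robustness across the diverse tasks of a lifespan.
In contrast, standard neural networks rely on dense, non-specialized architectures in which all model parameters are updated simultaneously to learn multiple tasks, which leads to interference.
Current sparse neural network approaches aim to alleviate this issue but are hindered by limitations such as 1) trainable gating functions that cause representation collapse, 2) disjoint experts that result in redundant computation and slow learning, and 3) reliance on explicit input or task IDs, which limits flexibility and scalability.
This paper proposes Conditionally Overlapping Mixture of ExperTs (COMET), a general deep learning method that addresses these challenges by inducing a modular, sparse architecture with an exponential number of overlapping experts.
COMET replaces the trainable gating function used in Sparse Mixture of Experts with a fixed, biologically inspired random projection applied to individual input representations.
With this design, the degree of expert overlap depends on input similarity, so similar inputs tend to share more parameters.
This yields faster learning per update step and improved out-of-sample generalization.
We demonstrate the effectiveness of COMET on a range of tasks, including image classification, language modeling, and regression, using several popular deep learning architectures.

Summary (original)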

The evolution of biological neural systems has led to both modularity and sparse coding, which enables energy efficiency and robustness across the diversity of tasks in the lifespan. In contrast, standard neural networks rely on dense, non-specialized architectures, where all model parameters are simultaneously updated to learn multiple tasks, leading to interference. Current sparse neural network approaches aim to alleviate this issue but are hindered by limitations such as 1) trainable gating functions that cause representation collapse, 2) disjoint experts that result in redundant computation and slow learning, and 3) reliance on explicit input or task IDs that limit flexibility and scalability. In this paper we propose Conditionally Overlapping Mixture of ExperTs (COMET), a general deep learning method that addresses these challenges by inducing a modular, sparse architecture with an exponential number of overlapping experts. COMET replaces the trainable gating function used in Sparse Mixture of Experts with a fixed, biologically inspired random projection applied to individual input representations. This design causes the degree of expert overlap to depend on input similarity, so that similar inputs tend to share more parameters. This results in faster learning per update step and improved out-of-sample generalization. We demonstrate the effectiveness of COMET on a range of tasks, including image classification, language modeling, and regression, using several popular deep learning architectures.
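
The fixed-routing idea can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the Gaussian projection, the top-k selection rule, and the names `make_fixed_projection`, `routing_mask`, and `overlap` are assumptions introduced here. It shows that an untrained random projection maps each input to a mask over hidden units, that the mask depends only on the direction of the input, and that the number of possible masks grows combinatorially with layer width.

```python
import math
import random

def make_fixed_projection(in_dim, hidden_dim, seed=0):
    """Random projection drawn once and never trained (assumed Gaussian)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(hidden_dim)] for _ in range(in_dim)]

def routing_mask(x, projection, k):
    """Indices of the k hidden units with the largest projected score (assumed top-k rule)."""
    hidden_dim = len(projection[0])
    scores = [sum(x[i] * projection[i][j] for i in range(len(x))) for j in range(hidden_dim)]
    ranked = sorted(range(hidden_dim), key=lambda j: scores[j], reverse=True)
    return set(ranked[:k])

def overlap(mask_a, mask_b):
    """Fraction of active units shared by two masks of equal size."""
    return len(mask_a & mask_b) / len(mask_a)

IN_DIM, HIDDEN_DIM, K = 8, 32, 8
projection = make_fixed_projection(IN_DIM, HIDDEN_DIM)

rng = random.Random(1)
x = [rng.gauss(0.0, 1.0) for _ in range(IN_DIM)]
x_scaled = [2.0 * v for v in x]   # same direction as x
x_opposite = [-v for v in x]      # opposite direction to x

mask_x = routing_mask(x, projection, K)
same_direction_overlap = overlap(mask_x, routing_mask(x_scaled, projection, K))
opposite_overlap = overlap(mask_x, routing_mask(x_opposite, projection, K))

# Each distinct mask selects a different sub-network (an "expert" in this sketch).
num_possible_masks = math.comb(HIDDEN_DIM, K)
```

In this sketch an input and its scaled copy activate exactly the same units, while an input and its negation share none, which is the extreme case of overlap tracking input similarity.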

arXiv information

Authors	Sagi Shaier, Francisco Pereira, Katharina von der Wense, Lawrence E Hunter, Matt Jones
Published	2025-02-05 16:57:22+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google

Categories: cs.LG

Scaling laws in wearable human activity recognition

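Posted: February 6, 2025 | Author: jarxiv

Summary

Many deep architectures and self-supervised pre-training techniques have been proposed for human activity recognition (HAR) from wearable multimodal sensors.
Scaling laws could support a move toward more principled design by linking model capacity to pre-training data volume.
However, scaling laws have not been established for HAR to the same extent as in language and vision.
By conducting an exhaustive grid search over both the amount of pre-training data and Transformer architectures, we establish the first known scaling laws for HAR.
We show that pre-training loss follows a power-law relationship with the amount of data and the parameter count, and that increasing the number of users in a dataset improves performance more steeply than increasing the data per user.
This indicates that diversity of pre-training data is important, in contrast to some previously reported findings in self-supervised HAR.
We show that these scaling laws translate to downstream performance improvements on three HAR benchmark datasets covering postures, modes of locomotion, and activities of daily living: UCI HAR, WISDM Phone, and WISDM Watch.
Finally, we suggest that some previously published works be revisited in light of these scaling laws, using more adequate model capacities.

Summary (original)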

Many deep architectures and self-supervised pre-training techniques have been proposed for human activity recognition (HAR) from wearable multimodal sensors. Scaling laws have the potential to help move towards more principled design by linking model capacity with pre-training data volume. Yet, scaling laws have not been established for HAR to the same extent as in language and vision. By conducting an exhaustive grid search on both amount of pre-training data and Transformer architectures, we establish the first known scaling laws for HAR. We show that pre-training loss scales with a power law relationship to amount of data and parameter count and that increasing the number of users in a dataset results in a steeper improvement in performance than increasing data per user, indicating that diversity of pre-training data is important, which contrasts to some previously reported findings in self-supervised HAR. We show that these scaling laws translate to downstream performance improvements on three HAR benchmark datasets of postures, modes of locomotion and activities of daily living: UCI HAR and WISDM Phone and WISDM Watch. Finally, we suggest some previously published works should be revisited in light of these scaling laws with more adequate model capacities.
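
A power law such as L = a * N^(-b) becomes a straight line in log-log space, so its exponent can be estimated by ordinary least squares on the logarithms. The sketch below assumes a single-variable law and uses synthetic numbers, not results from the paper (whose law involves both data volume and parameter count); the function name `fit_power_law` is hypothetical.

```python
import math

def fit_power_law(xs, ys):
    """Fit y = a * x**(-b) by least squares on (log x, log y); returns (a, b)."""
    log_x = [math.log(x) for x in xs]
    log_y = [math.log(y) for y in ys]
    n = len(xs)
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n
    slope = (sum((u - mean_x) * (v - mean_y) for u, v in zip(log_x, log_y))
             / sum((u - mean_x) ** 2 for u in log_x))
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope

# Synthetic example: loss generated from a known law, then recovered by the fit.
param_counts = [1e5, 1e6, 1e7, 1e8]
losses = [5.0 * n ** -0.1 for n in param_counts]
a_hat, b_hat = fit_power_law(param_counts, losses)
```

With real measurements the points will not lie exactly on a line, and the same fit can be repeated per axis (for example, users versus data per user) to compare the slopes.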

arXiv information

Authors	Tom Hoddes, Alex Bijamov, Saket Joshi, Daniel Roggen, Ali Etemad, Robert Harle, David Racz
Published	2025-02-05 17:00:08+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google

Categories: cs.LG

A Match Made in Heaven? Matching Test Cases and Vulnerabilities With the VUTECO Approach

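Posted: February 6, 2025 | Author: jarxiv

Summary

Software vulnerabilities are commonly detected via static analysis, penetration testing, and fuzzing.
They can also be found by running unit tests (so-called vulnerability-witnessing tests) that stimulate security-sensitive behavior with crafted inputs.
Developing such tests is difficult and time-consuming.
Automated data-driven approaches could therefore help developers intercept vulnerabilities earlier.
However, training and validating such approaches requires a lot of data, which is currently scarce.
This paper introduces VUTECO, a deep learning-based approach for collecting instances of vulnerability-witnessing tests from Java repositories.
VUTECO carries out two tasks: (1) the "Finding" task, which determines whether a test case is security-related, and (2) the "Matching" task, which relates a test case to the exact vulnerability it is witnessing.
VUTECO successfully addresses the Finding task, achieving perfect precision and a 0.83 F0.5 score on validated test cases in VUL4J and returning 102 out of 145 (70%) correct security-related test cases from 244 open-source Java projects.
Despite showing sufficiently good performance on the Matching task (0.86 precision and a 0.68 F0.5 score), VUTECO failed to retrieve any valid match in the wild.
Nevertheless, we observed that in almost all of the matches the test case was still security-related, even though it was matched to the wrong vulnerability.
In the end, VUTECO can help find vulnerability-witnessing tests, although matching them with the right vulnerability is yet to be solved.
The findings obtained lay a stepping stone for future research on the matter.

Summary (original)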

Software vulnerabilities are commonly detected via static analysis, penetration testing, and fuzzing. They can also be found by running unit tests – so-called vulnerability-witnessing tests – that stimulate the security-sensitive behavior with crafted inputs. Developing such tests is difficult and time-consuming; thus, automated data-driven approaches could help developers intercept vulnerabilities earlier. However, training and validating such approaches require a lot of data, which is currently scarce. This paper introduces VUTECO, a deep learning-based approach for collecting instances of vulnerability-witnessing tests from Java repositories. VUTECO carries out two tasks: (1) the ‘Finding’ task to determine whether a test case is security-related, and (2) the ‘Matching’ task to relate a test case to the exact vulnerability it is witnessing. VUTECO successfully addresses the Finding task, achieving perfect precision and 0.83 F0.5 score on validated test cases in VUL4J and returning 102 out of 145 (70%) correct security-related test cases from 244 open-source Java projects. Despite showing sufficiently good performance for the Matching task – i.e., 0.86 precision and 0.68 F0.5 score – VUTECO failed to retrieve any valid match in the wild. Nevertheless, we observed that in almost all of the matches, the test case was still security-related despite being matched to the wrong vulnerability. In the end, VUTECO can help find vulnerability-witnessing tests, though the matching with the right vulnerability is yet to be solved; the findings obtained lay the stepping stone for future research on the matter.
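
The F0.5 score weights precision more heavily than recall. The sketch below uses the standard F-beta definition (an assumption about how the reported figures were computed) to show how the numbers relate; the helper names are hypothetical. Under that definition, perfect precision together with F0.5 = 0.83 corresponds to a recall of about 0.49.

```python
def f_beta(precision, recall, beta=0.5):
    """Standard F-beta score; beta < 1 favours precision over recall."""
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1.0 + b2) * precision * recall / (b2 * precision + recall)

def recall_from_f_beta(precision, score, beta=0.5):
    """Invert the F-beta formula to get the recall implied by a precision and a score."""
    b2 = beta * beta
    return score * b2 * precision / ((1.0 + b2) * precision - score)

# Recall implied by the Finding-task figures, assuming the standard definition.
implied_recall = recall_from_f_beta(precision=1.0, score=0.83)

# Share of correct test cases among those returned in the wild (102 of 145).
in_the_wild_share = 102 / 145
```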

arXiv information

Authors	Emanuele Iannone, Quang-Cuong Bui, Riccardo Scandariato
Published	2025-02-05 17:02:42+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google

Categories: cs.CR, cs.LG, cs.SE, D.2.5

Rethinking Approximate Gaussian Inference in Classification

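Posted: February 6, 2025 | Author: jarxiv

Summary

In classification tasks, softmax functions are ubiquitously used as output activations to produce predictive probabilities.
Such outputs capture only aleatoric uncertainty.
To capture epistemic uncertainty, approximate Gaussian inference methods have been proposed, which output Gaussian distributions over the logit space.
Predictives are then obtained as the expectations of the Gaussian distributions pushed forward through the softmax.
However, such softmax Gaussian integrals cannot be solved analytically, and Monte Carlo (MC) approximations can be costly and noisy.
We propose a simple change in the learning objective that allows exact computation of predictives and enjoys improved training dynamics, with no runtime or memory overhead.
This framework is compatible with a family of output activation functions that includes the softmax as well as element-wise normCDF and sigmoid.
Moreover, it allows the Gaussian pushforwards to be approximated with Dirichlet distributions by analytic moment matching.
We evaluate our approach combined with several approximate Gaussian inference methods (Laplace, HET, SNGP) on large- and small-scale datasets (ImageNet, CIFAR-10), demonstrating improved uncertainty quantification capabilities compared to softmax MC sampling.
Code is available at https://github.com/bmucsanyi/probit.

Summary (original)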

In classification tasks, softmax functions are ubiquitously used as output activations to produce predictive probabilities. Such outputs only capture aleatoric uncertainty. To capture epistemic uncertainty, approximate Gaussian inference methods have been proposed, which output Gaussian distributions over the logit space. Predictives are then obtained as the expectations of the Gaussian distributions pushed forward through the softmax. However, such softmax Gaussian integrals cannot be solved analytically, and Monte Carlo (MC) approximations can be costly and noisy. We propose a simple change in the learning objective which allows the exact computation of predictives and enjoys improved training dynamics, with no runtime or memory overhead. This framework is compatible with a family of output activation functions that includes the softmax, as well as element-wise normCDF and sigmoid. Moreover, it allows for approximating the Gaussian pushforwards with Dirichlet distributions by analytic moment matching. We evaluate our approach combined with several approximate Gaussian inference methods (Laplace, HET, SNGP) on large- and small-scale datasets (ImageNet, CIFAR-10), demonstrating improved uncertainty quantification capabilities compared to softmax MC sampling. Code is available at https://github.com/bmucsanyi/probit.
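
To see why a normCDF activation admits exact predictives, recall the classical scalar identity E[Phi(z)] = Phi(mu / sqrt(1 + sigma^2)) for z ~ N(mu, sigma^2). The sketch below compares that closed form with a Monte Carlo estimate. It illustrates only this textbook identity for a single logit, not the paper's multi-class framework or its learning objective; the function names are hypothetical.

```python
import math
import random

def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def probit_predictive(mu, sigma):
    """Closed-form E[Phi(z)] for z ~ N(mu, sigma**2)."""
    return norm_cdf(mu / math.sqrt(1.0 + sigma ** 2))

def probit_predictive_mc(mu, sigma, n_samples=20000, seed=0):
    """Monte Carlo estimate of the same expectation (noisy, costs n_samples evaluations)."""
    rng = random.Random(seed)
    return sum(norm_cdf(rng.gauss(mu, sigma)) for _ in range(n_samples)) / n_samples

exact = probit_predictive(0.7, 1.5)
estimate = probit_predictive_mc(0.7, 1.5)
```

The closed form costs one function evaluation, while the sampled estimate needs many and still carries sampling noise.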

arXiv information

Authors	Bálint Mucsányi, Nathaël Da Costa, Philipp Hennig
Published	2025-02-05 17:03:49+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google

Categories: cs.LG, stat.ML

SyMANTIC: An Efficient Symbolic Regression Method for Interpretable and Parsimonious Model Discovery in Science and Beyond

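Posted: February 6, 2025 | Author: jarxiv

Summary

Symbolic regression (SR) is an emerging branch of machine learning focused on discovering simple and interpretable mathematical expressions from data.
Although a wide variety of SR methods have been developed, they often face challenges such as high computational cost, poor scalability with respect to the number of input dimensions, fragility to noise, and an inability to balance accuracy and complexity.
This work introduces SyMANTIC, a novel SR algorithm that addresses these challenges.
SyMANTIC efficiently identifies (potentially several) low-dimensional descriptors from a large set of candidates (from $\sim 10^5$ to $\sim 10^{10}$ or more) through a unique combination of mutual information-based feature selection, adaptive feature expansion, and recursively applied $\ell_0$-based sparse regression.
In addition, it employs an information-theoretic measure to produce an approximate set of Pareto-optimal equations, each offering the best-found accuracy for a given complexity.
Furthermore, our open-source implementation of SyMANTIC, built on the PyTorch ecosystem, facilitates easy installation and GPU acceleration.
We demonstrate the effectiveness of SyMANTIC across a range of problems, including synthetic examples, scientific benchmarks, real-world material property predictions, and chaotic dynamical system identification from small datasets.
Extensive comparisons show that SyMANTIC uncovers similar or more accurate models at a fraction of the cost of existing SR methods.

Summary (original)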

Symbolic regression (SR) is an emerging branch of machine learning focused on discovering simple and interpretable mathematical expressions from data. Although a wide-variety of SR methods have been developed, they often face challenges such as high computational cost, poor scalability with respect to the number of input dimensions, fragility to noise, and an inability to balance accuracy and complexity. This work introduces SyMANTIC, a novel SR algorithm that addresses these challenges. SyMANTIC efficiently identifies (potentially several) low-dimensional descriptors from a large set of candidates (from $\sim 10^5$ to $\sim 10^{10}$ or more) through a unique combination of mutual information-based feature selection, adaptive feature expansion, and recursively applied $\ell_0$-based sparse regression. In addition, it employs an information-theoretic measure to produce an approximate set of Pareto-optimal equations, each offering the best-found accuracy for a given complexity. Furthermore, our open-source implementation of SyMANTIC, built on the PyTorch ecosystem, facilitates easy installation and GPU acceleration. We demonstrate the effectiveness of SyMANTIC across a range of problems, including synthetic examples, scientific benchmarks, real-world material property predictions, and chaotic dynamical system identification from small datasets. Extensive comparisons show that SyMANTIC uncovers similar or more accurate models at a fraction of the cost of existing SR methods.
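
The notion of a Pareto-optimal set of equations can be illustrated as follows. This is a generic sketch, not SyMANTIC's implementation: the candidate expressions, their complexity scores, and their errors are made-up values, and `pareto_front` is a hypothetical helper. A candidate is kept only if no simpler or equally simple candidate is at least as accurate.

```python
def pareto_front(candidates):
    """Keep the candidates not dominated in (complexity, error); both are minimised.

    candidates: iterable of (expression, complexity, error) tuples.
    """
    front = []
    best_error = float("inf")
    # Visit candidates from simplest to most complex, most accurate first within a tie.
    for expression, complexity, error in sorted(candidates, key=lambda c: (c[1], c[2])):
        if error < best_error:  # strictly better than everything simpler
            front.append((expression, complexity, error))
            best_error = error
    return front

# Hypothetical candidates: (expression, complexity, validation error).
candidates = [
    ("x", 1, 0.9),
    ("x**2", 2, 0.5),
    ("sin(x)", 2, 0.7),
    ("x**2 + x", 4, 0.5),
    ("x**2 + exp(x)", 5, 0.1),
]
front = pareto_front(candidates)
```

Here `x**2 + x` is dropped because `x**2` reaches the same error with lower complexity, and `sin(x)` is dropped because `x**2` is more accurate at equal complexity.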

arXiv information

Authors	Madhav R. Muthyala, Farshud Sorourifar, You Peng, Joel A. Paulson
Published	2025-02-05 17:05:25+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google

Categories: cs.LG

Energy-Efficient Flying LoRa Gateways: A Multi-Agent Reinforcement Learning Approach

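Posted: February 6, 2025 | Author: jarxiv

Summary

With the rapid development of next-generation Internet of Things (NG-IoT) networks, the increasing number of connected devices has led to a surge in power consumption.
This rise in energy demand poses significant challenges to resource availability and raises sustainability concerns for large-scale IoT deployments.
Efficient energy utilization in communication networks, particularly for power-constrained IoT devices, has thus become a critical area of research.
In this paper, we deploy flying LoRa gateways (GWs) mounted on unmanned aerial vehicles (UAVs) to collect data from LoRa end devices (EDs) and transmit it to a central server.
Our primary objective is to maximize the global system energy efficiency (EE) of wireless LoRa networks by jointly optimizing transmission power (TP), spreading factor (SF), bandwidth (W), and ED association.
To solve this challenging problem, we model it as a partially observable Markov decision process (POMDP) in which each flying LoRa GW acts as a learning agent, using a cooperative multi-agent reinforcement learning (MARL) approach under centralized training and decentralized execution (CTDE).
Simulation results show that the proposed method, based on the multi-agent proximal policy optimization (MAPPO) algorithm, significantly improves the global system EE and surpasses conventional MARL schemes.

Summary (original)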

With the rapid development of next-generation Internet of Things (NG-IoT) networks, the increasing number of connected devices has led to a surge in power consumption. This rise in energy demand poses significant challenges to resource availability and raises sustainability concerns for large-scale IoT deployments. Efficient energy utilization in communication networks, particularly for power-constrained IoT devices, has thus become a critical area of research. In this paper, we deployed flying LoRa gateways (GWs) mounted on unmanned aerial vehicles (UAVs) to collect data from LoRa end devices (EDs) and transmit it to a central server. Our primary objective is to maximize the global system energy efficiency (EE) of wireless LoRa networks by joint optimization of transmission power (TP), spreading factor (SF), bandwidth (W), and ED association. To solve this challenging problem, we model the problem as a partially observable Markov decision process (POMDP), where each flying LoRa GW acts as a learning agent using a cooperative Multi-Agent Reinforcement Learning (MARL) approach under centralized training and decentralized execution (CTDE). Simulation results demonstrate that our proposed method, based on the multi-agent proximal policy optimization (MAPPO) algorithm, significantly improves the global system EE and surpasses the conventional MARL schemes.
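
The trade-off among the optimized variables can be illustrated with the nominal LoRa bit-rate formula, Rb = SF * (BW / 2^SF) * 4 / (4 + CR). The sketch below treats energy efficiency as bits per joule of transmit energy on an idealized link with no path loss, collisions, or retransmissions. That definition, the function names, and the parameter values are assumptions made for illustration; the paper's system model is not reproduced here.

```python
def lora_bit_rate(sf, bandwidth_hz, coding_rate_index=1):
    """Nominal LoRa bit rate in bit/s: SF * (BW / 2**SF) * 4 / (4 + CR)."""
    return sf * (bandwidth_hz / 2 ** sf) * 4.0 / (4.0 + coding_rate_index)

def dbm_to_watts(power_dbm):
    """Convert a power level from dBm to watts."""
    return 10 ** ((power_dbm - 30.0) / 10.0)

def energy_efficiency(sf, bandwidth_hz, tx_power_dbm, coding_rate_index=1):
    """Bits per joule of transmit energy (idealised: every transmitted bit is delivered)."""
    return lora_bit_rate(sf, bandwidth_hz, coding_rate_index) / dbm_to_watts(tx_power_dbm)

# Assumed settings: 125 kHz bandwidth, 14 dBm transmit power, coding rate 4/5.
ee_by_sf = {sf: energy_efficiency(sf, 125e3, 14.0) for sf in range(7, 13)}
```

In this idealized view a lower spreading factor always looks more efficient; in practice higher spreading factors extend range, which is one reason the choice is posed as a joint optimization with power, bandwidth, and device association.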

arXiv information

Authors	Abdullahi Isa Ahmed, El Mehdi Amhoud
Published	2025-02-05 17:16:40+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google

Categories: cs.LG, cs.NI

A Structured Reasoning Framework for Unbalanced Data Classification Using Probabilistic Models

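Posted: February 6, 2025 | Author: jarxiv

Summary

This paper studies a Markov network model for unbalanced data, aiming to solve the problems of classification bias and insufficient minority-class recognition that traditional machine learning models exhibit in environments with uneven class distributions.
By constructing a joint probability distribution and conditional dependencies, the model can achieve global modeling and reasoning optimization of sample categories.
The study introduces marginal probability estimation and weighted loss optimization strategies, combined with regularization constraints and structured reasoning methods, which effectively improve the generalization ability and robustness of the model.
In the experimental stage, a real credit card fraud detection dataset was selected, and the model was compared with logistic regression, support vector machine, random forest, and XGBoost.
The experimental results show that the Markov network performs well on indicators such as weighted accuracy, F1 score, and AUC-ROC, significantly outperforming traditional classification models and demonstrating its strong decision-making ability and applicability in unbalanced data scenarios.
Future research can focus on efficient model training, structural optimization, and deep learning integration in large-scale unbalanced data environments, and promote wide use in practical applications such as financial risk control, medical diagnosis, and intelligent monitoring.

Summary (original)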

This paper studies a Markov network model for unbalanced data, aiming to solve the problems of classification bias and insufficient minority class recognition ability of traditional machine learning models in environments with uneven class distribution. By constructing joint probability distribution and conditional dependency, the model can achieve global modeling and reasoning optimization of sample categories. The study introduced marginal probability estimation and weighted loss optimization strategies, combined with regularization constraints and structured reasoning methods, effectively improving the generalization ability and robustness of the model. In the experimental stage, a real credit card fraud detection dataset was selected and compared with models such as logistic regression, support vector machine, random forest and XGBoost. The experimental results show that the Markov network performs well in indicators such as weighted accuracy, F1 score, and AUC-ROC, significantly outperforming traditional classification models, demonstrating its strong decision-making ability and applicability in unbalanced data scenarios. Future research can focus on efficient model training, structural optimization, and deep learning integration in large-scale unbalanced data environments and promote its wide application in practical applications such as financial risk control, medical diagnosis, and intelligent monitoring.
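
The abstract does not specify its weighted loss, so the following is only a generic illustration of the idea: a class-weighted binary cross-entropy in which each class is weighted by n_samples / (n_classes * class_count), a common balancing heuristic assumed here. The function names and the synthetic 9:1 data are hypothetical.

```python
import math
from collections import Counter

def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}

def weighted_log_loss(labels, probs_of_positive, weights, eps=1e-12):
    """Class-weighted binary cross-entropy, normalised by the total weight."""
    total, weight_sum = 0.0, 0.0
    for y, p in zip(labels, probs_of_positive):
        p = min(max(p, eps), 1.0 - eps)  # avoid log(0)
        w = weights[y]
        total += -w * (math.log(p) if y == 1 else math.log(1.0 - p))
        weight_sum += w
    return total / weight_sum

labels = [0] * 9 + [1]            # synthetic 9:1 imbalance
weights = balanced_class_weights(labels)

always_negative = [0.01] * 10     # a classifier that ignores the minority class
unweighted = weighted_log_loss(labels, always_negative, {0: 1.0, 1: 1.0})
weighted = weighted_log_loss(labels, always_negative, weights)
```

With the weights applied, a classifier that ignores the minority class is penalized much more heavily than under the unweighted loss.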

arXiv information

Authors	Junliang Du, Shiyu Dou, Bohuan Yang, Jiacheng Hu, Tai An
Published	2025-02-05 17:20:47+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google

Categories: cs.LG

Explain Yourself, Briefly! Self-Explaining Neural Networks with Concise Sufficient Reasons

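Posted: February 6, 2025 | Author: jarxiv

Summary

Minimal sufficient reasons are a prevalent form of explanation: the smallest subset of input features that, when held constant at their corresponding values, ensures that the prediction remains unchanged.
Previous post-hoc methods attempt to obtain such explanations but face two main limitations: (1) obtaining these subsets poses a computational challenge, leading most scalable methods to converge towards suboptimal, less meaningful subsets.
(2) These methods rely heavily on sampling out-of-distribution input assignments, potentially resulting in counterintuitive behaviors.
To tackle these limitations, we propose in this work a self-supervised training approach, which we term *sufficient subset training* (SST).
Using SST, we train models to generate concise sufficient reasons for their predictions as an integral part of their output.
Our results indicate that our framework produces succinct and faithful subsets substantially more efficiently than competing post-hoc methods, while maintaining comparable predictive performance.

Summary (original)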

Minimal sufficient reasons represent a prevalent form of explanation – the smallest subset of input features which, when held constant at their corresponding values, ensure that the prediction remains unchanged. Previous post-hoc methods attempt to obtain such explanations but face two main limitations: (1) Obtaining these subsets poses a computational challenge, leading most scalable methods to converge towards suboptimal, less meaningful subsets; (2) These methods heavily rely on sampling out-of-distribution input assignments, potentially resulting in counterintuitive behaviors. To tackle these limitations, we propose in this work a self-supervised training approach, which we term *sufficient subset training* (SST). Using SST, we train models to generate concise sufficient reasons for their predictions as an integral part of their output. Our results indicate that our framework produces succinct and faithful subsets substantially more efficiently than competing post-hoc methods, while maintaining comparable predictive performance.
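
The definition of a minimal sufficient reason can be made concrete with a brute-force search on a toy model. This sketch is the naive post-hoc approach, not SST: it enumerates every completion of the free features, which is exponential in the input size and is exactly the cost that motivates other methods. The binary feature domain, the majority-vote model, and the function names are assumptions made for illustration.

```python
from itertools import combinations, product

def is_sufficient(model, x, subset, domain=(0, 1)):
    """True if fixing the features in `subset` to x's values pins the prediction."""
    free = [i for i in range(len(x)) if i not in subset]
    target = model(x)
    for values in product(domain, repeat=len(free)):
        z = list(x)
        for i, v in zip(free, values):
            z[i] = v
        if model(z) != target:
            return False
    return True

def minimal_sufficient_reason(model, x, domain=(0, 1)):
    """Smallest sufficient subset, found by exhaustive search over subset sizes."""
    for size in range(len(x) + 1):
        for subset in combinations(range(len(x)), size):
            if is_sufficient(model, x, set(subset), domain):
                return subset
    return tuple(range(len(x)))

def majority(z):
    """Toy model: predicts 1 when at least two of three binary features are 1."""
    return int(sum(z) >= 2)

reason = minimal_sufficient_reason(majority, (1, 1, 0))
```

For the input (1, 1, 0) the first two features already force the majority vote to 1, so they form a minimal sufficient reason and the third feature is irrelevant to this prediction.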

arXiv information

Authors	Shahaf Bassan, Shlomit Gur, Ron Eliav
Published	2025-02-05 17:29:12+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google

Categories: cs.LG, cs.LO

CAPE: Covariate-Adjusted Pre-Training for Epidemic Time Series Forecasting

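Posted: February 6, 2025 | Author: jarxiv

Summary

Accurate forecasting of epidemic infection trajectories is crucial for safeguarding public health.
However, limited data availability during emerging outbreaks and the complex interaction between environmental factors and disease dynamics pose significant challenges for effective forecasting.
In response, we introduce CAPE, a novel epidemic pre-training framework designed to harness extensive disease datasets from diverse regions and to integrate environmental factors directly into the modeling process, enabling more informed decision-making on downstream diseases.
Based on a covariate adjustment framework, CAPE uses pre-training combined with hierarchical environment contrasting to identify universal patterns across diseases while estimating latent environmental influences.
We have compiled a diverse collection of epidemic time series datasets and validated the effectiveness of CAPE under various evaluation scenarios, including full-shot, few-shot, zero-shot, cross-location, and cross-disease settings, where it outperforms the leading baseline by an average of 9.9% in full-shot and 14.3% in zero-shot settings.
The code will be released upon acceptance.

Summary (original)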

Accurate forecasting of epidemic infection trajectories is crucial for safeguarding public health. However, limited data availability during emerging outbreaks and the complex interaction between environmental factors and disease dynamics present significant challenges for effective forecasting. In response, we introduce CAPE, a novel epidemic pre-training framework designed to harness extensive disease datasets from diverse regions and integrate environmental factors directly into the modeling process for more informed decision-making on downstream diseases. Based on a covariate adjustment framework, CAPE utilizes pre-training combined with hierarchical environment contrasting to identify universal patterns across diseases while estimating latent environmental influences. We have compiled a diverse collection of epidemic time series datasets and validated the effectiveness of CAPE under various evaluation scenarios, including full-shot, few-shot, zero-shot, cross-location, and cross-disease settings, where it outperforms the leading baseline by an average of 9.9% in full-shot and 14.3% in zero-shot settings. The code will be released upon acceptance.

arXiv information

Authors	Zewen Liu, Juntong Ni, Max S. Y. Lau, Wei Jin
Published	2025-02-05 17:29:36+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google

Categories: cs.LG

Detecting Strategic Deception Using Linear Probes

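Posted: February 6, 2025 | Author: jarxiv

Summary

AI models might use deceptive strategies as part of scheming or misaligned behaviour.
Monitoring outputs alone is insufficient, since an AI might produce seemingly benign outputs while its internal reasoning is misaligned.
We therefore evaluate whether linear probes can robustly detect deception by monitoring model activations.
We test two probe-training datasets: one with contrasting instructions to be honest or deceptive (following Zou et al., 2023) and one of responses to simple roleplaying scenarios.
We test whether these probes generalize to realistic settings in which Llama-3.3-70B-Instruct behaves deceptively, such as concealing insider trading (Scheurer et al., 2023) and purposely underperforming on safety evaluations (Benton et al., 2024).
We find that our probe distinguishes honest and deceptive responses with AUROCs between 0.96 and 0.999 on our evaluation datasets.
If we set the decision threshold to have a 1% false positive rate on chat data unrelated to deception, our probe catches 95-99% of the deceptive responses.
Overall, we think white-box probes are promising for future monitoring systems, but current performance is insufficient as a robust defence against deception.
Our probes' outputs can be viewed at data.apolloresearch.ai/dd and our code at github.com/ApolloResearch/deception-detection.

Summary (original)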

AI models might use deceptive strategies as part of scheming or misaligned behaviour. Monitoring outputs alone is insufficient, since the AI might produce seemingly benign outputs while their internal reasoning is misaligned. We thus evaluate if linear probes can robustly detect deception by monitoring model activations. We test two probe-training datasets, one with contrasting instructions to be honest or deceptive (following Zou et al., 2023) and one of responses to simple roleplaying scenarios. We test whether these probes generalize to realistic settings where Llama-3.3-70B-Instruct behaves deceptively, such as concealing insider trading (Scheurer et al., 2023) and purposely underperforming on safety evaluations (Benton et al., 2024). We find that our probe distinguishes honest and deceptive responses with AUROCs between 0.96 and 0.999 on our evaluation datasets. If we set the decision threshold to have a 1% false positive rate on chat data not related to deception, our probe catches 95-99% of the deceptive responses. Overall we think white-box probes are promising for future monitoring systems, but current performance is insufficient as a robust defence against deception. Our probes’ outputs can be viewed at data.apolloresearch.ai/dd and our code at github.com/ApolloResearch/deception-detection.
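
The evaluation protocol described here (AUROC, plus recall at a threshold calibrated to a 1% false positive rate on unrelated control data) can be sketched as follows. The probe scores are synthetic Gaussian draws, not outputs of the authors' probes, and the function names and distribution parameters are assumptions made for illustration.

```python
import random

def threshold_at_fpr(control_scores, fpr=0.01):
    """Score such that about `fpr` of the control examples lie strictly above it."""
    ordered = sorted(control_scores, reverse=True)
    n_allowed = min(round(fpr * len(ordered)), len(ordered) - 1)
    return ordered[n_allowed]

def rate_above(scores, threshold):
    """Fraction of scores flagged, i.e. strictly above the threshold."""
    return sum(s > threshold for s in scores) / len(scores)

def auroc(negative_scores, positive_scores):
    """Probability that a random positive outscores a random negative (ties count half)."""
    wins = sum((p > n) + 0.5 * (p == n) for p in positive_scores for n in negative_scores)
    return wins / (len(positive_scores) * len(negative_scores))

rng = random.Random(0)
control = [rng.gauss(0.0, 1.0) for _ in range(1000)]    # chat data unrelated to deception
honest = [rng.gauss(0.0, 1.0) for _ in range(500)]      # honest responses
deceptive = [rng.gauss(4.0, 1.0) for _ in range(500)]   # deceptive responses

threshold = threshold_at_fpr(control, fpr=0.01)
control_fpr = rate_above(control, threshold)
recall = rate_above(deceptive, threshold)
separation = auroc(honest, deceptive)
```

Calibrating the threshold on control data rather than on the evaluation set keeps the false positive budget tied to ordinary traffic, which is the setting a deployed monitor would face.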

arXiv information

Authors	Nicholas Goldowsky-Dill, Bilal Chughtai, Stefan Heimersheim, Marius Hobbhahn
Published	2025-02-05 17:49:40+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google

Categories: cs.LG
