jarxiv | Japanese arxiv | Page 461

Explaining Strategic Decisions in Multi-Agent Reinforcement Learning for Aerial Combat Tactics

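Posted on May 19, 2025 by jarxiv

Summary

Artificial intelligence (AI) is reshaping strategic planning through Multi-Agent Reinforcement Learning (MARL), which enables coordination among autonomous agents in complex scenarios.
However, practical deployment in sensitive military contexts is constrained by a lack of explainability, a key factor for trust, safety, and alignment with human strategies.
This work reviews and evaluates current advances in explainability methods for MARL, focusing on simulated air combat scenarios.
We then adapt a range of explainability techniques to different air combat scenarios to gain explanatory insight into model behavior.
By linking AI-generated tactics to human-understandable reasoning, we emphasize the need for transparency to ensure reliable deployment and meaningful human-machine interaction.
By highlighting the importance of explainability in advancing MARL for operational defense, our work supports not only strategic planning but also the training of military personnel through insightful and comprehensible analyses.

Summary (original)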

Artificial intelligence (AI) is reshaping strategic planning, with Multi-Agent Reinforcement Learning (MARL) enabling coordination among autonomous agents in complex scenarios. However, its practical deployment in sensitive military contexts is constrained by the lack of explainability, which is an essential factor for trust, safety, and alignment with human strategies. This work reviews and assesses current advances in explainability methods for MARL with a focus on simulated air combat scenarios. We proceed by adapting various explainability techniques to different aerial combat scenarios to gain explanatory insights about the model behavior. By linking AI-generated tactics with human-understandable reasoning, we emphasize the need for transparency to ensure reliable deployment and meaningful human-machine interaction. By illuminating the crucial importance of explainability in advancing MARL for operational defense, our work supports not only strategic planning but also the training of military personnel with insightful and comprehensible analyses.

arXiv information

Authors	Ardian Selmonaj, Alessandro Antonucci, Adrian Schneider, Michael Rüegsegger, Matthias Sommer
Published	2025-05-16 14:36:30+00:00
arXiv site	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google

Categories: cs.AI, cs.LG, cs.MA | Comments are closed

On the Feasibility of Using LLMs to Autonomously Execute Multi-host Network Attacks

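Posted on May 19, 2025 by jarxiv

Summary

LLMs have shown preliminary promise on some security tasks and CTF challenges.
Real cyberattacks are often multi-host network attacks, which involve executing many steps across multiple hosts, such as conducting reconnaissance, exploiting vulnerabilities, and using compromised hosts to exfiltrate data.
To date, the extent to which LLMs can autonomously execute multi-host network attacks is not well understood.
To this end, our first contribution is MHBench, an open-source multi-host attack benchmark with 10 realistic emulated networks (25 to 50 hosts).
We find that popular LLMs, including modern reasoning models (e.g., GPT4o, Gemini 2.5 Pro, Sonnet 3.7 Thinking), combined with state-of-the-art security-relevant prompting strategies (e.g., PentestGPT, CyberSecEval3), cannot autonomously execute multi-host network attacks.
To enable LLMs to execute such attacks autonomously, our second contribution is Incalmo, a high-level abstraction layer.
Incalmo lets LLMs specify high-level actions (e.g., infect a host, scan a network).
Incalmo's translation layer converts these actions into lower-level primitives (e.g., commands for exploit tools) through expert agents.
In 9 of the 10 MHBench networks, LLMs using Incalmo achieve at least some of the attack goals.
Even smaller LLMs (e.g., Haiku 3.5, Gemini 2 Flash) equipped with Incalmo achieve all goals in 5 of the 10 environments.
We also validate the key role that high-level actions in Incalmo's abstraction play in enabling LLMs to execute such attacks autonomously.

Summary (original)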

LLMs have shown preliminary promise in some security tasks and CTF challenges. Real cyberattacks are often multi-host network attacks, which involve executing a number of steps across multiple hosts such as conducting reconnaissance, exploiting vulnerabilities, and using compromised hosts to exfiltrate data. To date, the extent to which LLMs can autonomously execute multi-host network attacks is not well understood. To this end, our first contribution is MHBench, an open-source multi-host attack benchmark with 10 realistic emulated networks (from 25 to 50 hosts). We find that popular LLMs including modern reasoning models (e.g., GPT4o, Gemini 2.5 Pro, Sonnet 3.7 Thinking) with state-of-the-art security-relevant prompting strategies (e.g., PentestGPT, CyberSecEval3) cannot autonomously execute multi-host network attacks. To enable LLMs to autonomously execute such attacks, our second contribution is Incalmo, a high-level abstraction layer. Incalmo enables LLMs to specify high-level actions (e.g., infect a host, scan a network). Incalmo's translation layer converts these actions into lower-level primitives (e.g., commands to exploit tools) through expert agents. In 9 out of 10 networks in MHBench, LLMs using Incalmo achieve at least some of the attack goals. Even smaller LLMs (e.g., Haiku 3.5, Gemini 2 Flash) equipped with Incalmo achieve all goals in 5 of 10 environments. We also validate the key role of high-level actions in Incalmo's abstraction in enabling LLMs to autonomously execute such attacks.

arXiv information

Authors	Brian Singer, Keane Lucas, Lakshmi Adiga, Meghna Jain, Lujo Bauer, Vyas Sekar
Published	2025-05-16 14:55:52+00:00
arXiv site	arxiv_id(pdf)

Categories: cs.AI, cs.CR

ImprovNet — Generating Controllable Musical Improvisations with Iterative Corruption Refinement

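Posted on May 19, 2025 by jarxiv

Summary

Despite deep learning's remarkable advances in style transfer across various domains, generating controllable performance-level musical style transfer for complete, symbolically represented musical works remains a challenging research area.
Much of this is due to limited datasets, especially for genres such as jazz, and to the lack of unified models that can handle multiple music generation tasks.
This paper presents ImprovNet, a transformer-based architecture that generates expressive and controllable musical improvisations through a self-supervised corruption-refinement training strategy.
Improvisational style transfer aims to make meaningful modifications to one or more musical elements (melody, harmony, or rhythm) of the original composition with respect to the target genre.
ImprovNet unifies multiple capabilities within a single model: it performs cross-genre and intra-genre improvisation, harmonizes melodies in genre-specific styles, and carries out short prompt continuation and infilling tasks.
The model's iterative generation framework lets users control the degree of style transfer and the structural similarity to the original composition.
Objective and subjective evaluations demonstrate ImprovNet's effectiveness in generating musically coherent improvisations while maintaining structural relationships with the original pieces.
The model outperforms the Anticipatory Music Transformer on short continuation and infilling tasks and achieves recognizable genre conversion, with 79% of participants correctly identifying jazz-style improvisations of classical pieces.
Our code and demo page can be found at https://github.com/keshavbhandari/improvnet.

Summary (original)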

Despite deep learning's remarkable advances in style transfer across various domains, generating controllable performance-level musical style transfer for complete symbolically represented musical works remains a challenging area of research. Much of this is owed to limited datasets, especially for genres such as jazz, and the lack of unified models that can handle multiple music generation tasks. This paper presents ImprovNet, a transformer-based architecture that generates expressive and controllable musical improvisations through a self-supervised corruption-refinement training strategy. The improvisational style transfer is aimed at making meaningful modifications to one or more musical elements (melody, harmony, or rhythm) of the original composition with respect to the target genre. ImprovNet unifies multiple capabilities within a single model: it can perform cross-genre and intra-genre improvisations, harmonize melodies with genre-specific styles, and execute short prompt continuation and infilling tasks. The model's iterative generation framework allows users to control the degree of style transfer and structural similarity to the original composition. Objective and subjective evaluations demonstrate ImprovNet's effectiveness in generating musically coherent improvisations while maintaining structural relationships with the original pieces. The model outperforms Anticipatory Music Transformer in short continuation and infilling tasks and successfully achieves recognizable genre conversion, with 79% of participants correctly identifying jazz-style improvisations of classical pieces. Our code and demo page can be found at https://github.com/keshavbhandari/improvnet.

arXiv information

Authors	Keshav Bhandari, Sungkyun Chang, Tongyu Lu, Fareza R. Enus, Louis B. Bradshaw, Dorien Herremans, Simon Colton
Published	2025-05-16 14:56:53+00:00
arXiv site	arxiv_id(pdf)

Categories: cs.AI, cs.SD, eess.AS

A Radon-Nikodým Perspective on Anomaly Detection: Theory and Implications

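Posted on May 19, 2025 by jarxiv

Summary

Which principle underpins the design of an effective anomaly detection loss function?
The answer lies in the Radon-Nikodým theorem, a fundamental concept in measure theory.
The key insight of this article is that multiplying the vanilla loss function by the Radon-Nikodým derivative improves performance across the board.
We refer to this as RN-Loss.
We prove this in the setting of PAC (Probably Approximately Correct) learnability.
Depending on the context, the Radon-Nikodým derivative takes different forms.
In the simplest case of supervised anomaly detection, the Radon-Nikodým derivative takes the form of a simple weighted loss.
In the case of unsupervised anomaly detection (with distributional assumptions), the Radon-Nikodým derivative takes the form of the popular cluster-based local outlier factor.
We evaluate our algorithm on 96 datasets, including univariate and multivariate data from diverse domains such as healthcare, cybersecurity, and finance.
We show that RN-Derivative algorithms outperform state-of-the-art methods on 68% of multivariate datasets (based on F1 scores) and achieve peak F1 scores on 72% of time-series (univariate) datasets.

Summary (original)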

Which principle underpins the design of an effective anomaly detection loss function? The answer lies in the Radon-Nikodým theorem, a fundamental concept in measure theory. The key insight from this article is: multiplying the vanilla loss function with the Radon-Nikodým derivative improves the performance across the board. We refer to this as RN-Loss. We prove this using the setting of PAC (Probably Approximately Correct) learnability. Depending on the context, a Radon-Nikodým derivative takes different forms. In the simplest case of supervised anomaly detection, the Radon-Nikodým derivative takes the form of a simple weighted loss. In the case of unsupervised anomaly detection (with distributional assumptions), the Radon-Nikodým derivative takes the form of the popular cluster-based local outlier factor. We evaluate our algorithm on 96 datasets, including univariate and multivariate data from diverse domains, including healthcare, cybersecurity, and finance. We show that RN-Derivative algorithms outperform state-of-the-art methods on 68% of multivariate datasets (based on F1 scores) and also achieve peak F1 scores on 72% of time-series (univariate) datasets.
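
The supervised case described above, where the reweighting reduces to a simple weighted loss, can be illustrated with a minimal sketch. The abstract does not say how the weights are computed, so the inverse-class-frequency weights below are an assumption standing in for the Radon-Nikodým derivative, and the names `bce`, `class_weights`, `plain_loss`, and `weighted_loss` are hypothetical.

```python
import math


def bce(p, y, eps=1e-12):
    """Binary cross-entropy for one predicted anomaly probability p and label y in {0, 1}."""
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))


def class_weights(labels):
    """Hypothetical per-class weights: inverse class frequency (an assumption,
    not the weighting derived in the paper)."""
    n = len(labels)
    n_pos = sum(labels)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes must be present")
    return {0: n / (2.0 * n_neg), 1: n / (2.0 * n_pos)}


def plain_loss(probs, labels):
    """Unweighted ("vanilla") mean loss."""
    return sum(bce(p, y) for p, y in zip(probs, labels)) / len(labels)


def weighted_loss(probs, labels):
    """Vanilla per-sample loss multiplied by a class-dependent weight, then averaged."""
    w = class_weights(labels)
    return sum(w[y] * bce(p, y) for p, y in zip(probs, labels)) / len(labels)


# Nine normal points and one anomaly that the model scores too low.
labels = [0] * 9 + [1]
probs = [0.1] * 9 + [0.4]
print(plain_loss(probs, labels), weighted_loss(probs, labels))
```

In this toy data the single under-scored anomaly dominates the weighted loss, whereas the unweighted mean is dominated by the nine normal points.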

arXiv information

Authors	Shlok Mehendale, Aditya Challa, Rahul Yedida, Sravan Danda, Santonu Sarkar, Snehanshu Saha
Published	2025-05-16 15:04:58+00:00
arXiv site	arxiv_id(pdf)

Categories: cs.AI, cs.LG

DecompileBench: A Comprehensive Benchmark for Evaluating Decompilers in Real-World Scenarios

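Posted on May 19, 2025 by jarxiv

Summary

Decompilers are fundamental tools for critical security tasks, from vulnerability discovery to malware analysis, yet their evaluation remains fragmented.
Existing approaches focus mainly on syntactic correctness, via synthetic micro-benchmarks or subjective human ratings, and fail to address the real-world requirements of semantic fidelity and analyst usability.
We present DecompileBench, the first comprehensive framework that enables effective evaluation of decompilers in reverse engineering workflows through three key components: real-world function extraction (comprising 23,400 functions from 130 real-world programs), runtime-aware validation, and automated human-centric assessment that uses LLM-as-Judge to quantify the effectiveness of decompilers in reverse engineering workflows.
Through a systematic comparison of six industrial-strength decompilers and six recent LLM-powered approaches, we demonstrate that LLM-based methods surpass commercial tools in code understandability despite 52.2% lower functionality correctness.
These findings highlight the potential of LLM-based approaches to transform human-centric reverse engineering.
We open-source DecompileBench (https://github.com/Jennieett/DecompileBench) to provide a framework that advances research on decompilers and helps security experts make informed tool selections based on their specific requirements.

Summary (original)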

Decompilers are fundamental tools for critical security tasks, from vulnerability discovery to malware analysis, yet their evaluation remains fragmented. Existing approaches primarily focus on syntactic correctness through synthetic micro-benchmarks or subjective human ratings, failing to address real-world requirements for semantic fidelity and analyst usability. We present DecompileBench, the first comprehensive framework that enables effective evaluation of decompilers in reverse engineering workflows through three key components: real-world function extraction (comprising 23,400 functions from 130 real-world programs), runtime-aware validation, and automated human-centric assessment using LLM-as-Judge to quantify the effectiveness of decompilers in reverse engineering workflows. Through a systematic comparison between six industrial-strength decompilers and six recent LLM-powered approaches, we demonstrate that LLM-based methods surpass commercial tools in code understandability despite 52.2% lower functionality correctness. These findings highlight the potential of LLM-based approaches to transform human-centric reverse engineering. We open-source DecompileBench (https://github.com/Jennieett/DecompileBench) to provide a framework to advance research on decompilers and assist security experts in making informed tool selections based on their specific requirements.

arXiv information

Authors	Zeyu Gao, Yuxin Cui, Hao Wang, Siliang Qin, Yuanda Wang, Bolun Zhang, Chao Zhang
Published	2025-05-16 15:07:43+00:00
arXiv site	arxiv_id(pdf)

Categories: cs.AI, cs.SE

Flex: End-to-End Text-Instructed Visual Navigation from Foundation Model Features

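Posted on May 19, 2025 by jarxiv

Summary

End-to-end learning maps sensory inputs directly to actions, creating highly integrated and efficient policies for complex robotics tasks.
However, such models often struggle to generalize beyond their training scenarios, which limits their adaptability to new environments, tasks, and concepts.
In this work, we investigate the minimal data requirements and architectural adaptations needed to achieve robust closed-loop performance with vision-based control policies under unseen text instructions and visual distribution shifts.
Our findings are synthesized in Flex (Fly lexically), a framework that uses pre-trained Vision Language Models (VLMs) as frozen patch-wise feature extractors, generating spatially aware embeddings that integrate semantic and visual information.
We demonstrate the effectiveness of this approach on a quadrotor fly-to-target task, where agents trained via behavior cloning on a small simulated dataset successfully generalize to real-world scenes with diverse novel goals and command formulations.

Summary (original)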

End-to-end learning directly maps sensory inputs to actions, creating highly integrated and efficient policies for complex robotics tasks. However, such models often struggle to generalize beyond their training scenarios, limiting adaptability to new environments, tasks, and concepts. In this work, we investigate the minimal data requirements and architectural adaptations necessary to achieve robust closed-loop performance with vision-based control policies under unseen text instructions and visual distribution shifts. Our findings are synthesized in Flex (Fly lexically), a framework that uses pre-trained Vision Language Models (VLMs) as frozen patch-wise feature extractors, generating spatially aware embeddings that integrate semantic and visual information. We demonstrate the effectiveness of this approach on a quadrotor fly-to-target task, where agents trained via behavior cloning on a small simulated dataset successfully generalize to real-world scenes with diverse novel goals and command formulations.

arXiv information

Authors	Makram Chahine, Alex Quach, Alaa Maalouf, Tsun-Hsuan Wang, Daniela Rus
Published	2025-05-16 15:13:26+00:00
arXiv site	arxiv_id(pdf)

Categories: 68T05, 68T40, 68T50, cs.AI, cs.RO, I.2.10

Uncertainty Quantification for LLM-Based Survey Simulations

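Posted on May 19, 2025 by jarxiv

Summary

We investigate the use of large language models (LLMs) to simulate human responses to survey questions and perform uncertainty quantification to gain reliable insights.
Our approach converts imperfect LLM-simulated responses into confidence sets for population parameters of human responses, addressing the distribution shift between the simulated and real populations.
A key innovation lies in determining the optimal number of simulated responses: too many produce overly narrow confidence sets with poor coverage, while too few yield excessively loose estimates.
To resolve this, our method adaptively selects the simulation sample size, ensuring valid average-case coverage guarantees.
It is broadly applicable to any LLM, irrespective of its fidelity, and to any procedure for constructing confidence sets.
In addition, the selected sample size quantifies the degree of misalignment between the LLM and the target human population.
We illustrate our method on real datasets and LLMs.

Summary (original)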

We investigate the use of large language models (LLMs) to simulate human responses to survey questions, and perform uncertainty quantification to gain reliable insights. Our approach converts imperfect LLM-simulated responses into confidence sets for population parameters of human responses, addressing the distribution shift between the simulated and real populations. A key innovation lies in determining the optimal number of simulated responses: too many produce overly narrow confidence sets with poor coverage, while too few yield excessively loose estimates. To resolve this, our method adaptively selects the simulation sample size, ensuring valid average-case coverage guarantees. It is broadly applicable to any LLM, irrespective of its fidelity, and any procedure for constructing confidence sets. Additionally, the selected sample size quantifies the degree of misalignment between the LLM and the target human population. We illustrate our method on real datasets and LLMs.
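
The trade-off described above (more simulated responses narrow the confidence set but can break coverage when the LLM is biased) can be illustrated with a small sketch. This is not the paper's procedure: it assumes a calibration set of questions with known human means, normal-approximation intervals, and a rule that picks the largest candidate size whose empirical coverage stays at or above 1 - alpha. The names `confidence_interval`, `coverage`, and `select_sample_size` are hypothetical.

```python
import math
import statistics


def confidence_interval(samples, z=1.96):
    """Normal-approximation interval for the mean of the simulated responses."""
    mean = statistics.mean(samples)
    half_width = z * statistics.stdev(samples) / math.sqrt(len(samples))
    return mean - half_width, mean + half_width


def coverage(simulated, human_means, n):
    """Fraction of calibration questions whose interval (built from the first n
    simulated responses) contains the known human mean."""
    hits = 0
    for responses, target in zip(simulated, human_means):
        low, high = confidence_interval(responses[:n])
        hits += low <= target <= high
    return hits / len(human_means)


def select_sample_size(simulated, human_means, candidates, alpha=0.05):
    """Largest candidate size whose calibration coverage is at least 1 - alpha."""
    valid = [n for n in candidates
             if coverage(simulated, human_means, n) >= 1.0 - alpha]
    return max(valid) if valid else None


# Toy data: the simulator answers "yes" 60% of the time on both questions,
# while the human means are 0.5 (biased simulation) and 0.6 (unbiased).
pattern = [1, 1, 1, 0, 0]
simulated = [pattern * 100, pattern * 100]
human_means = [0.5, 0.6]
print(select_sample_size(simulated, human_means, [10, 50, 100, 500]))
```

With this toy data, intervals built from 100 or more simulated responses become too narrow to contain the human mean of the biased question, so the sketch settles on 50. A smaller selected size signals a larger mismatch between simulator and population, in line with the abstract's remark that the selected sample size quantifies misalignment.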

arXiv information

Authors	Chengpiao Huang, Yuhang Wu, Kaizheng Wang
Published	2025-05-16 15:19:23+00:00
arXiv site	arxiv_id(pdf)

Categories: cs.AI, cs.LG, stat.ME

Head-Tail-Aware KL Divergence in Knowledge Distillation for Spiking Neural Networks

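Posted on May 19, 2025 by jarxiv

Summary

Spiking Neural Networks (SNNs) have emerged as a promising approach to energy-efficient and biologically plausible computation.
However, because of limitations in existing training methods and inherent model constraints, SNNs often show a performance gap compared with Artificial Neural Networks (ANNs).
Knowledge distillation (KD) has been explored as a technique for transferring knowledge from ANN teacher models to SNN student models to narrow this gap.
Traditional KD methods typically use Kullback-Leibler (KL) divergence to align output distributions.
However, conventional KL-based approaches fail to fully exploit the unique characteristics of SNNs, because they tend to overemphasize high-probability predictions while neglecting low-probability ones, which leads to suboptimal generalization.
To address this, we propose Head-Tail Aware Kullback-Leibler (HTA-KL) divergence, a novel KD method for SNNs.
HTA-KL introduces a cumulative-probability-based mask to dynamically distinguish between high- and low-probability regions.
It assigns adaptive weights to ensure balanced knowledge transfer and improve overall performance.
By integrating forward KL (FKL) and reverse KL (RKL) divergence, our method effectively aligns both the head and tail regions of the distribution.
We evaluate our method on the CIFAR-10, CIFAR-100, and Tiny ImageNet datasets.
Our method outperforms existing methods on most datasets with fewer timesteps.

Summary (original)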

Spiking Neural Networks (SNNs) have emerged as a promising approach for energy-efficient and biologically plausible computation. However, due to limitations in existing training methods and inherent model constraints, SNNs often exhibit a performance gap when compared to Artificial Neural Networks (ANNs). Knowledge distillation (KD) has been explored as a technique to transfer knowledge from ANN teacher models to SNN student models to mitigate this gap. Traditional KD methods typically use Kullback-Leibler (KL) divergence to align output distributions. However, conventional KL-based approaches fail to fully exploit the unique characteristics of SNNs, as they tend to overemphasize high-probability predictions while neglecting low-probability ones, leading to suboptimal generalization. To address this, we propose Head-Tail Aware Kullback-Leibler (HTA-KL) divergence, a novel KD method for SNNs. HTA-KL introduces a cumulative probability-based mask to dynamically distinguish between high- and low-probability regions. It assigns adaptive weights to ensure balanced knowledge transfer, enhancing the overall performance. By integrating forward KL (FKL) and reverse KL (RKL) divergence, our method effectively aligns both head and tail regions of the distribution. We evaluate our method on CIFAR-10, CIFAR-100 and Tiny ImageNet datasets. Our method outperforms existing methods on most datasets with fewer timesteps.
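
The idea of a cumulative-probability mask that splits classes into a head and a tail, with forward KL terms on one part and reverse KL terms on the other, can be sketched as follows. The abstract gives no formulas, so everything here is an assumption: the 0.8 threshold, the fixed weights `w_head` and `w_tail` (the paper describes adaptive weights), and the choice to apply forward-KL terms to head classes and reverse-KL terms to tail classes. The function names are hypothetical.

```python
import math


def head_mask(teacher, threshold=0.8):
    """Mark classes as 'head' in descending teacher probability until the
    cumulative probability accumulated so far reaches the threshold."""
    order = sorted(range(len(teacher)), key=lambda i: -teacher[i])
    mask = [False] * len(teacher)
    cumulative = 0.0
    for i in order:
        if cumulative < threshold:
            mask[i] = True
        cumulative += teacher[i]
    return mask


def hta_kl_sketch(teacher, student, threshold=0.8, w_head=1.0, w_tail=1.0, eps=1e-12):
    """Forward-KL terms p*log(p/q) on head classes plus reverse-KL terms
    q*log(q/p) on tail classes, with fixed (not adaptive) weights."""
    mask = head_mask(teacher, threshold)
    loss = 0.0
    for p, q, is_head in zip(teacher, student, mask):
        p, q = max(p, eps), max(q, eps)  # clamp to avoid log(0)
        if is_head:
            loss += w_head * p * math.log(p / q)
        else:
            loss += w_tail * q * math.log(q / p)
    return loss


teacher = [0.05, 0.7, 0.2, 0.05]
uniform_student = [0.25, 0.25, 0.25, 0.25]
print(head_mask(teacher))
print(hta_kl_sketch(teacher, uniform_student))
```

Because the mask is computed per example from the teacher distribution, the head/tail split changes from one input to the next, which is one reading of "dynamically distinguish" in the abstract.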

arXiv information

Authors	Tianqing Zhang, Zixin Zhu, Kairong Yu, Hongwei Wang
Published	2025-05-16 15:19:42+00:00
arXiv site	arxiv_id(pdf)

Categories: cs.AI

Wavelet Analysis of Noninvasive EEG Signals Discriminates Complex and Natural Grasp Types

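Posted on May 19, 2025 by jarxiv

Summary

This research aims to decode hand grasps from electroencephalograms (EEGs) for dexterous neuroprosthetic development and Brain-Computer Interface (BCI) applications, especially for patients with motor disorders.
In particular, it focuses on distinguishing two complex natural grasps, power and precision, together with a neutral no-movement condition, using a new EEG-based BCI platform and wavelet signal processing.
The wavelet analysis involved generating time-frequency and topographic maps from wavelet power coefficients.
Then, using machine learning techniques with novel wavelet features, we achieved high average accuracies: 85.16% for multiclass, 95.37% for No-Movement vs Power, 95.40% for No-Movement vs Precision, and 88.07% for Power vs Precision, demonstrating the effectiveness of these features in EEG-based grasp differentiation.
In contrast to previous studies, a critical part of our study was a permutation feature importance analysis, which highlighted the key features for grasp classification.
It revealed that the most crucial brain activity during grasping occurs in the motor cortex, within the alpha and beta frequency bands.
These insights demonstrate the potential of wavelet features for real-time neuroprosthetic technology and BCI applications.

Summary (original)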

This research aims to decode hand grasps from Electroencephalograms (EEGs) for dexterous neuroprosthetic development and Brain-Computer Interface (BCI) applications, especially for patients with motor disorders. Particularly, it focuses on distinguishing two complex natural power and precision grasps in addition to a neutral condition as a no-movement condition using a new EEG-based BCI platform and wavelet signal processing. Wavelet analysis involved generating time-frequency and topographic maps from wavelet power coefficients. Then, by using machine learning techniques with novel wavelet features, we achieved high average accuracies: 85.16% for multiclass, 95.37% for No-Movement vs Power, 95.40% for No-Movement vs Precision, and 88.07% for Power vs Precision, demonstrating the effectiveness of these features in EEG-based grasp differentiation. In contrast to previous studies, a critical part of our study was permutation feature importance analysis, which highlighted key features for grasp classification. It revealed that the most crucial brain activities during grasping occur in the motor cortex, within the alpha and beta frequency bands. These insights demonstrate the potential of wavelet features in real-time neuroprosthetic technology and BCI applications.
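
Permutation feature importance, the analysis the abstract credits with identifying the key features, measures how much a trained model's accuracy drops when one feature column is shuffled. The sketch below shows the general technique on toy data; the threshold classifier, the two features, and the function names are hypothetical and unrelated to the study's EEG features or models.

```python
import random


def accuracy(predict, rows, labels):
    """Fraction of rows for which the model's prediction matches the label."""
    return sum(predict(r) == y for r, y in zip(rows, labels)) / len(labels)


def permutation_importance(predict, rows, labels, n_repeats=5, seed=0):
    """Mean drop in accuracy after shuffling each feature column in turn."""
    rng = random.Random(seed)
    baseline = accuracy(predict, rows, labels)
    importances = []
    for j in range(len(rows[0])):
        drops = []
        for _ in range(n_repeats):
            column = [r[j] for r in rows]
            rng.shuffle(column)  # break the link between feature j and the label
            shuffled = [r[:j] + [v] + r[j + 1:] for r, v in zip(rows, column)]
            drops.append(baseline - accuracy(predict, shuffled, labels))
        importances.append(sum(drops) / n_repeats)
    return importances


def predict(row):
    """Toy classifier: uses only feature 0 and ignores feature 1."""
    return 1 if row[0] > 0.5 else 0


# Feature 0 determines the label; feature 1 is unrelated to it.
rows = [[1.0, float(i % 3)] for i in range(10)] + [[0.0, float(i % 3)] for i in range(10)]
labels = [1] * 10 + [0] * 10
importances = permutation_importance(predict, rows, labels)
print(importances)
```

A feature the model never uses gets an importance of exactly zero, because shuffling it cannot change any prediction.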

arXiv information

Authors	Ali Rabiee, Sima Ghafoori, Anna Cetera, Reza Abiri
Published	2025-05-16 15:20:57+00:00
arXiv site	arxiv_id(pdf)

Categories: cs.AI, cs.LG, eess.SP, q-bio.NC

Focus on the Likely: Test-time Instance-based Uncertainty Removal

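Posted on May 19, 2025 by jarxiv

Summary

We ask: does focusing on the classes predicted as likely improve model predictions?
We aim for an affirmative answer by proposing two novel test-time fine-tuning methods to improve uncertain model predictions.
Instead of greedily selecting the most likely class, we introduce an additional step, focusing on the likely classes, to refine predictions.
By applying a theoretically motivated single gradient descent step with a large learning rate, we refine predictions when the initial forward pass indicates high uncertainty.
This aligns predictions more closely with the ideal of assigning zero probability to less plausible outcomes.
The experimental evaluation demonstrates accuracy gains for one of our methods, which emphasizes shared features among likely classes, across diverse text and image domain models.
Our theoretical discussion provides a deeper understanding, highlighting the varying impact of shared and non-shared features among the (focus) classes.
Our discussion also suggests an interesting view on standard offline training versus test-time training: opposing optimization rationales regarding breadth of feature dependence are preferable in each training phase.

Summary (original)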

We ask: Does focusing on classes predicted as likely improve model predictions? We aim for an affirmative answer by proposing two novel test-time fine-tuning methods to improve uncertain model predictions. Instead of greedily selecting the most likely class, we introduce an additional step, focus on the likely classes, to refine predictions. By applying a theoretically motivated single gradient descent step with a large learning rate, we refine predictions when an initial forward pass indicates high uncertainty. This aligns predictions more closely with the ideal of assigning zero probability to less plausible outcomes. The experimental evaluation demonstrates accuracy gains for one of our methods, which emphasizes shared features among likely classes, across diverse text and image domain models. Our theoretical discussion provides a deeper understanding, highlighting the varying impact of shared and non-shared features among (focus) classes. Our discussion also suggests an interesting view on standard, offline training vs. test-time training: Opposing optimization rationales regarding breadth of feature dependence are preferable during each training phase.

arXiv information

Authors	Johannes Schneider
Published	2025-05-16 15:21:29+00:00
arXiv site	arxiv_id(pdf)

Categories: cs.AI, cs.LG
