jarxiv | Japanese arxiv

A Hybrid Artificial Intelligence Method for Estimating Flicker in Power Systems

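Posted: June 17, 2025 | Author: jarxiv

Summary

This paper introduces a new hybrid AI method that combines H filtering with an adaptive linear neuron (ADALINE) network to estimate flicker components in power distribution systems.
The proposed method uses the robustness of the H filter to extract the voltage envelope under uncertain, noisy conditions, then uses ADALINE to accurately identify the flicker frequencies embedded in that envelope.
This combination enables efficient time-domain estimation with rapid convergence and noise resilience, addressing key limitations of existing frequency-domain approaches.
Unlike conventional techniques, the hybrid AI model handles complex power disturbances without prior knowledge of noise characteristics or extensive training.
The method is validated with simulation studies based on IEC Standard 61000-4-15, supported by statistical analysis, Monte Carlo simulations, and real-world data.
The results show better accuracy, greater robustness, and lower computational load than Fast Fourier Transform and Discrete Wavelet Transform based estimators.

Summary (original)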

This paper introduces a novel hybrid AI method combining H filtering and an adaptive linear neuron (ADALINE) network for flicker component estimation in power distribution systems. The proposed method leverages the robustness of the H filter to extract the voltage envelope under uncertain and noisy conditions, followed by the use of ADALINE to accurately identify flicker frequencies embedded in the envelope. This synergy enables efficient time-domain estimation with rapid convergence and noise resilience, addressing key limitations of existing frequency-domain approaches. Unlike conventional techniques, this hybrid AI model handles complex power disturbances without prior knowledge of noise characteristics or extensive training. To validate the method's performance, we conduct simulation studies based on IEC Standard 61000-4-15, supported by statistical analysis, Monte Carlo simulations, and real-world data. Results demonstrate superior accuracy, robustness, and reduced computational load compared to Fast Fourier Transform and Discrete Wavelet Transform based estimators.
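
The ADALINE stage mentioned in the abstract can be illustrated with a minimal sketch. It assumes the voltage envelope has already been extracted (the H filter stage is not shown) and that a candidate flicker frequency is supplied by the caller. The function name, learning rate, and synthetic test signal are hypothetical choices for illustration and are not taken from the paper. The neuron's inputs are a constant plus a sine and a cosine at the candidate frequency, and the weights follow the standard LMS (Widrow-Hoff) update.

```python
import math

def adaline_flicker_estimate(envelope, fs, flicker_hz, lr=0.05):
    """Fit envelope[n] ~ w0 + w1*sin(wt) + w2*cos(wt) with LMS updates.

    Returns (dc_level, flicker_amplitude). Names and defaults are illustrative.
    """
    w = [0.0, 0.0, 0.0]                       # weights for inputs [1, sin, cos]
    omega = 2.0 * math.pi * flicker_hz
    for n, target in enumerate(envelope):
        t = n / fs
        x = (1.0, math.sin(omega * t), math.cos(omega * t))
        y = sum(wi * xi for wi, xi in zip(w, x))           # ADALINE output
        err = target - y                                   # instantaneous error
        w = [wi + lr * err * xi for wi, xi in zip(w, x)]   # Widrow-Hoff step
    # Amplitude of the sinusoidal component is the norm of the sin/cos weights.
    return w[0], math.hypot(w[1], w[2])

# Synthetic, noise-free envelope (assumed demo values): level 1.0 with an
# 8.8 Hz flicker component of amplitude 0.05, sampled at 1 kHz for 2 s.
fs = 1000.0
envelope = [1.0 + 0.05 * math.sin(2 * math.pi * 8.8 * n / fs + 0.3)
            for n in range(2000)]
dc, amp = adaline_flicker_estimate(envelope, fs, flicker_hz=8.8)
print(round(dc, 3), round(amp, 3))
```

Because the weights are updated sample by sample, no separate training phase is needed, which matches the time-domain character described in the abstract. Handling several flicker frequencies at once would mean adding one sine/cosine input pair per candidate frequency.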

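arXiv information

Authors	Javad Enayati, Pedram Asef, Alexandre Benoit
Published	2025-06-16 15:38:39+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google

Categories: cs.AI, cs.SY, eess.SY, stat.AP

EBS-CFL: Efficient and Byzantine-robust Secure Clustered Federated Learning

Posted: June 17, 2025 | Author: jarxiv

Summary

Despite the potential of federated learning (FL) for collaborative learning, its performance degrades because of data heterogeneity across distributed users.
Clustered federated learning (CFL) recently emerged to address this challenge by partitioning users into clusters according to their similarity.
However, CFL runs into training difficulties when users are unwilling to share their cluster identities because of privacy concerns.
To address these issues, we present an innovative, efficient, and robust secure aggregation scheme for CFL, called EBS-CFL.
The proposed EBS-CFL supports effective CFL training while keeping users' cluster identities confidential.
In addition, it detects potential poisoning attacks without compromising individual client gradients, by discarding negatively correlated gradients and aggregating positively correlated ones with a weighted approach.
The server also verifies that clients encode their gradients correctly.
EBS-CFL is highly efficient, with client-side overhead of O(ml + m^2) for communication and O(m^2 l) for computation, where m is the number of cluster identities and l is the gradient size.
When m = 1, the client-side computational efficiency of EBS-CFL is at least O(log n) times better than that of the comparison schemes, where n is the number of clients. We also validate the scheme through extensive experiments.
Finally, we prove the security of the scheme theoretically.

Summary (original)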

Despite the potential of federated learning (FL) in collaborative learning, its performance has deteriorated due to the data heterogeneity of distributed users. Recently, clustered federated learning (CFL) has emerged to address this challenge by partitioning users into clusters according to their similarity. However, CFL faces difficulties in training when users are unwilling to share their cluster identities due to privacy concerns. To address these issues, we present an innovative Efficient and Robust Secure Aggregation scheme for CFL, dubbed EBS-CFL. The proposed EBS-CFL supports effective CFL training while keeping users' cluster identities confidential. Moreover, it detects potential poisoning attacks without compromising individual client gradients by discarding negatively correlated gradients and aggregating positively correlated ones using a weighted approach. The server also authenticates correct gradient encoding by clients. EBS-CFL has high efficiency with client-side overhead O(ml + m^2) for communication and O(m^2 l) for computation, where m is the number of cluster identities and l is the gradient size. When m = 1, the client-side computational efficiency of EBS-CFL is at least O(log n) times better than that of comparison schemes, where n is the number of clients. In addition, we validate the scheme through extensive experiments. Finally, we theoretically prove the scheme's security.
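
The filtering rule in the abstract (discard negatively correlated gradients, combine positively correlated ones with weights) can be sketched in plaintext as follows. This is only an illustration under stated assumptions: correlation is measured as cosine similarity against a reference gradient, the weights are the similarities themselves, and nothing here is encrypted or encoded. The abstract does not specify these details, so `reference`, `cosine`, and `robust_aggregate` are hypothetical names, and the secure aggregation layer of EBS-CFL is not represented at all.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    if nu == 0.0 or nv == 0.0:
        return 0.0
    return dot / (nu * nv)

def robust_aggregate(client_grads, reference):
    """Weighted mean of gradients that correlate positively with `reference`.

    Gradients with non-positive similarity get weight 0, i.e. are discarded.
    """
    weights = [max(0.0, cosine(g, reference)) for g in client_grads]
    total = sum(weights)
    if total == 0.0:
        return [0.0] * len(reference)        # nothing trustworthy to aggregate
    return [
        sum(w * g[i] for w, g in zip(weights, client_grads)) / total
        for i in range(len(reference))
    ]

# Two honest clients and one client pushing in the opposite direction.
grads = [[1.0, 0.0], [1.0, 0.0], [-5.0, 0.0]]
aggregated = robust_aggregate(grads, reference=[1.0, 0.0])
print(aggregated)
```

In this toy run the third gradient has similarity -1 with the reference and therefore contributes nothing to the aggregate.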

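arXiv information

Authors	Zhiqiang Li, Haiyong Bao, Menghong Guan, Hao Pan, Cheng Huang, Hong-Ning Dai
Published	2025-06-16 15:39:10+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google

Categories: cs.AI, cs.CR, cs.DC

JAEGER: Dual-Level Humanoid Whole-Body Controller

Posted: June 17, 2025 | Author: jarxiv

Summary

This paper presents JAEGER, a dual-level whole-body controller for humanoid robots that addresses the challenge of training more robust and versatile policies.
Unlike traditional single-controller approaches, JAEGER splits control of the upper and lower body between two independent controllers so that each can focus on its own task.
This separation mitigates the curse of dimensionality and improves fault tolerance.
JAEGER supports both root velocity tracking (coarse-grained control) and local joint angle tracking (fine-grained control), enabling versatile and stable movements.
To train the controllers, we use a human motion dataset (AMASS), retarget human poses to humanoid poses with an efficient retargeting network, and adopt a curriculum learning approach.
The method performs supervised learning for initialization, followed by reinforcement learning for further exploration.
We run experiments on two humanoid platforms and demonstrate the superiority of our approach over state-of-the-art methods in both simulation and real environments.

Summary (original)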

This paper presents JAEGER, a dual-level whole-body controller for humanoid robots that addresses the challenges of training a more robust and versatile policy. Unlike traditional single-controller approaches, JAEGER separates the control of the upper and lower bodies into two independent controllers, so that they can better focus on their distinct tasks. This separation alleviates the curse of dimensionality and improves fault tolerance. JAEGER supports both root velocity tracking (coarse-grained control) and local joint angle tracking (fine-grained control), enabling versatile and stable movements. To train the controller, we utilize a human motion dataset (AMASS), retargeting human poses to humanoid poses through an efficient retargeting network, and employ a curriculum learning approach. This method performs supervised learning for initialization, followed by reinforcement learning for further exploration. We conduct our experiments on two humanoid platforms and demonstrate the superiority of our approach over state-of-the-art methods in both simulation and real environments.

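arXiv information

Authors	Ziluo Ding, Haobin Jiang, Yuxuan Wang, Zhenguo Sun, Yu Zhang, Xiaojie Niu, Ming Yang, Weishuai Zeng, Xinrun Xu, Zongqing Lu
Published	2025-06-16 15:42:47+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google

Categories: cs.AI, cs.RO

Unreal Patterns

Posted: June 17, 2025 | Author: jarxiv

Summary

This paper introduces a framework for representing information about entities that do not exist or may never exist, such as fictional entities, blueprints, simulations, and future scenarios.
It criticizes traditional approaches that introduce "dummy instances" or rely on modal logic, and defends a proposal that models such cases using intersections of actual types rather than specific non-existent tokens.
The paper positions itself within Basic Formal Ontology and its realist commitments, and emphasizes practical, implementable solutions over purely metaphysical or philosophical proposals. It argues that existing approaches to non-existent entities either overcommit to metaphysical assumptions or introduce computational inefficiencies that hinder applications.
By developing a structured, ontology-driven approach to unreal patterns, the paper aims to provide a useful and computationally viable way of handling references to hypothetical or non-existent entities.

Summary (original)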

This paper introduces a framework for representing information about entities that do not exist or may never exist, such as those involving fictional entities, blueprints, simulations, and future scenarios. Traditional approaches that introduce ‘dummy instances’ or rely on modal logic are criticized, and a proposal is defended in which such cases are modeled using the intersections of actual types rather than specific non-existent tokens. The paper positions itself within the Basic Formal Ontology and its realist commitments, emphasizing the importance of practical, implementable solutions over purely metaphysical or philosophical proposals, arguing that existing approaches to non-existent entities either overcommit to metaphysical assumptions or introduce computational inefficiencies that hinder applications. By developing a structured, ontology-driven approach to unreal patterns, the paper aims to provide a useful and computationally viable means of handling references to hypothetical or non-existent entities.

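arXiv information

Authors	John Beverley, Jim Logan, Barry Smith
Published	2025-06-16 15:43:44+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google

Categories: cs.AI

A Self-Refining Framework for Enhancing ASR Using TTS-Synthesized Data

Posted: June 17, 2025 | Author: jarxiv

Summary

We propose a self-refining framework that improves ASR performance using only unlabeled datasets.
The process starts with an existing ASR model that generates pseudo-labels for unannotated speech; these are then used to train a high-fidelity text-to-speech (TTS) system.
The synthesized speech-text pairs are then bootstrapped into the original ASR system, closing the self-improvement loop.
We demonstrate the effectiveness of the framework on Taiwanese Mandarin speech.
Using 6,000 hours of unlabeled speech, a moderate amount of text data, and synthetic content from AI models, we adapt Whisper-large-v2 into a specialized model, Twister.
Compared with Whisper, Twister reduces error rates by up to 20% on Mandarin and 50% on Mandarin-English code-switching benchmarks.
The results highlight the framework as a compelling alternative to pseudo-labeling self-distillation approaches and offer a practical path to better ASR performance in low-resource or domain-specific settings.

Summary (original)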

We propose a self-refining framework that enhances ASR performance with only unlabeled datasets. The process starts with an existing ASR model generating pseudo-labels on unannotated speech, which are then used to train a high-fidelity text-to-speech (TTS) system. Then, synthesized speech-text pairs are bootstrapped into the original ASR system, completing the closed-loop self-improvement cycle. We demonstrate the effectiveness of the framework on Taiwanese Mandarin speech. Leveraging 6,000 hours of unlabeled speech, a moderate amount of text data, and synthetic content from the AI models, we adapt Whisper-large-v2 into a specialized model, Twister. Twister reduces error rates by up to 20% on Mandarin and 50% on Mandarin-English code-switching benchmarks compared to Whisper. Results highlight the framework as a compelling alternative to pseudo-labeling self-distillation approaches and provide a practical pathway for improving ASR performance in low-resource or domain-specific settings.

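arXiv information

Authors	Cheng-Kang Chou, Chan-Jan Hsu, Ho-Lam Chung, Liang-Hsuan Tseng, Hsi-Chun Cheng, Yu-Kuan Fu, Kuan Po Huang, Hung-Yi Lee
Published	2025-06-16 15:47:41+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google

Categories: cs.AI, cs.CL, cs.SD, eess.AS

Graph-Convolution-Beta-VAE for Synthetic Abdominal Aorta Aneurysm Generation

Posted: June 17, 2025 | Author: jarxiv

Summary

Synthetic data generation plays an important role in medical research by mitigating privacy concerns and enabling large-scale analysis of patient data.
This study presents a beta-Variational Autoencoder Graph Convolutional Neural Network framework for generating synthetic abdominal aortic aneurysms (AAA).
Using a small real-world dataset, our approach extracts key anatomical features and captures complex statistical relationships within a compact, disentangled latent space.
To address data limitations, low-impact data augmentation based on Procrustes analysis was applied, preserving anatomical integrity.
The generation strategies, both deterministic and stochastic, increase data diversity while maintaining realism.
Compared with PCA-based approaches, our model performs more robustly on unseen data because it captures complex, nonlinear anatomical variations.
This enables more comprehensive clinical and statistical analyses than the original dataset alone.
The resulting synthetic AAA dataset preserves patient privacy while providing a scalable foundation for medical research, device testing, and computational modeling.

Summary (original)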

Synthetic data generation plays a crucial role in medical research by mitigating privacy concerns and enabling large-scale patient data analysis. This study presents a beta-Variational Autoencoder Graph Convolutional Neural Network framework for generating synthetic Abdominal Aorta Aneurysms (AAA). Using a small real-world dataset, our approach extracts key anatomical features and captures complex statistical relationships within a compact disentangled latent space. To address data limitations, low-impact data augmentation based on Procrustes analysis was employed, preserving anatomical integrity. The generation strategies, both deterministic and stochastic, manage to enhance data diversity while ensuring realism. Compared to PCA-based approaches, our model performs more robustly on unseen data by capturing complex, nonlinear anatomical variations. This enables more comprehensive clinical and statistical analyses than the original dataset alone. The resulting synthetic AAA dataset preserves patient privacy while providing a scalable foundation for medical research, device testing, and computational modeling.
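
For readers unfamiliar with the beta-VAE objective that the framework's name refers to, the sketch below shows the usual form of that loss: a reconstruction term plus a KL divergence term scaled by beta, for a diagonal Gaussian posterior and a standard normal prior. It is a generic textbook formulation, not the paper's implementation. The squared-error reconstruction term, the value `beta=4.0`, and the function name are assumptions, and the graph-convolutional encoder and decoder are not shown.

```python
import math

def beta_vae_loss(x, x_recon, mu, log_var, beta=4.0):
    """Return (total, reconstruction, kl) for one sample.

    x, x_recon : original and reconstructed feature vectors
    mu, log_var: mean and log-variance of the latent Gaussian q(z|x)
    beta       : weight on the KL term (assumed value, not from the paper)
    """
    # Reconstruction error: sum of squared differences (an assumed choice).
    recon = sum((a - b) ** 2 for a, b in zip(x, x_recon))
    # KL(q(z|x) || N(0, I)) in closed form for a diagonal Gaussian.
    kl = 0.5 * sum(m * m + math.exp(lv) - lv - 1.0
                   for m, lv in zip(mu, log_var))
    return recon + beta * kl, recon, kl

total, recon, kl = beta_vae_loss(
    x=[1.0, 2.0], x_recon=[1.0, 2.5], mu=[1.0, 0.0], log_var=[0.0, 0.0]
)
print(total, recon, kl)
```

Values of beta above 1 penalize the KL term more heavily, which is the usual mechanism for encouraging a disentangled latent space of the kind the abstract describes.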

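arXiv information

Authors	Francesco Fabbri, Martino Andrea Scarpolini, Angelo Iollo, Francesco Viola, Francesco Tudisco
Published	2025-06-16 15:55:56+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google

Categories: cs.AI, cs.LG, q-bio.TO

On the Feasibility of Fully AI-automated Vishing Attacks

Posted: June 17, 2025 | Author: jarxiv

Summary

A vishing attack is a form of social engineering in which attackers use phone calls to deceive individuals into disclosing sensitive information such as personal data, financial information, or security credentials.
Attackers exploit the perceived urgency and authenticity of voice communication to manipulate victims, often posing as legitimate entities such as banks or tech support.
Vishing is a particularly serious threat because it bypasses the security controls designed to protect information.
In this work, we study the potential for vishing attacks to escalate with the advent of AI.
In theory, AI-powered software bots may be able to automate these attacks by phoning potential victims, starting conversations with them, and deceiving them into disclosing sensitive information.
To validate this thesis, we introduce ViKing, an AI-powered vishing system built with publicly available AI technology.
It relies on a large language model (LLM) as its core cognitive processor to steer conversations with victims, complemented by a pipeline of speech-to-text and text-to-speech modules that handle audio-text conversion during phone calls.
In a controlled social experiment with 240 participants, we found that ViKing persuaded many participants to reveal sensitive information, including participants who had been explicitly warned about the risk of vishing campaigns.
Interactions with ViKing's bots were generally considered realistic.
From these findings, we conclude that tools like ViKing may already be accessible to potential malicious actors, while also serving as a valuable resource for cyber awareness programs.

Summary (original)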

A vishing attack is a form of social engineering where attackers use phone calls to deceive individuals into disclosing sensitive information, such as personal data, financial information, or security credentials. Attackers exploit the perceived urgency and authenticity of voice communication to manipulate victims, often posing as legitimate entities like banks or tech support. Vishing is a particularly serious threat as it bypasses security controls designed to protect information. In this work, we study the potential for vishing attacks to escalate with the advent of AI. In theory, AI-powered software bots may have the ability to automate these attacks by initiating conversations with potential victims via phone calls and deceiving them into disclosing sensitive information. To validate this thesis, we introduce ViKing, an AI-powered vishing system developed using publicly available AI technology. It relies on a Large Language Model (LLM) as its core cognitive processor to steer conversations with victims, complemented by a pipeline of speech-to-text and text-to-speech modules that facilitate audio-text conversion in phone calls. Through a controlled social experiment involving 240 participants, we discovered that ViKing has successfully persuaded many participants to reveal sensitive information, even those who had been explicitly warned about the risk of vishing campaigns. Interactions with ViKing’s bots were generally considered realistic. From these findings, we conclude that tools like ViKing may already be accessible to potential malicious actors, while also serving as an invaluable resource for cyber awareness programs.

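arXiv information

Authors	João Figueiredo, Afonso Carvalho, Daniel Castro, Daniel Gonçalves, Nuno Santos
Published	2025-06-16 15:59:36+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google

Categories: cs.AI, cs.CR, eess.AS

On Synthesizing Data for Context Attribution in Question Answering

Posted: June 17, 2025 | Author: jarxiv

Summary

Question answering (QA) accounts for a significant share of LLM usage "in the wild".
However, LLMs sometimes produce false or misleading responses, also known as "hallucinations".
Grounding generated answers in the information provided in the context, that is, providing evidence for the generated text, is therefore paramount for the trustworthiness of LLMs.
Providing this information is the task of context attribution.
In this paper, we systematically study LLM-based approaches to this task: (i) zero-shot inference, (ii) LLM ensembling, and (iii) fine-tuning small LMs on synthetic data generated by larger LLMs.
Our key contribution is SynQA, a novel generative strategy for synthesizing context attribution data.
Given selected context sentences, an LLM generates QA pairs that are supported by those sentences.
This leverages the natural strength of LLMs in text generation while ensuring clear attribution paths in the synthetic training data.
We show that attribution data synthesized with SynQA is highly effective for fine-tuning small LMs for context attribution across different QA tasks and domains.
Finally, a user study validates the usefulness of small LMs (fine-tuned on synthetic data from SynQA) for context attribution in QA.

Summary (original)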

Question Answering (QA) accounts for a significant portion of LLM usage ‘in the wild’. However, LLMs sometimes produce false or misleading responses, also known as ‘hallucinations’. Therefore, grounding the generated answers in contextually provided information — i.e., providing evidence for the generated text — is paramount for LLMs’ trustworthiness. Providing this information is the task of context attribution. In this paper, we systematically study LLM-based approaches for this task, namely we investigate (i) zero-shot inference, (ii) LLM ensembling, and (iii) fine-tuning of small LMs on synthetic data generated by larger LLMs. Our key contribution is SynQA: a novel generative strategy for synthesizing context attribution data. Given selected context sentences, an LLM generates QA pairs that are supported by these sentences. This leverages LLMs’ natural strengths in text generation while ensuring clear attribution paths in the synthetic training data. We show that the attribution data synthesized via SynQA is highly effective for fine-tuning small LMs for context attribution in different QA tasks and domains. Finally, with a user study, we validate the usefulness of small LMs (fine-tuned on synthetic data from SynQA) in context attribution for QA.

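arXiv information

Authors	Gorjan Radevski, Kiril Gashteovski, Shahbaz Syed, Christopher Malon, Sebastien Nicolas, Chia-Chien Hung, Timo Sztyler, Verena Heußer, Wiem Ben Rim, Masafumi Enomoto, Kunihiro Takeoka, Masafumi Oyamada, Goran Glavaš, Carolin Lawrence
Published	2025-06-16 16:22:39+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google

Categories: cs.AI, cs.CL, cs.IR, cs.LG

We Should Identify and Mitigate Third-Party Safety Risks in MCP-Powered Agent Systems

Posted: June 17, 2025 | Author: jarxiv

Summary

The development of large language models (LLMs) has entered an experience-driven era, marked by the emergence of environment-feedback-driven learning through reinforcement learning and tool-using agents.
This has encouraged the emergence of the Model Context Protocol (MCP), which defines a standard for how an LLM should interact with external services such as APIs and data.
However, as MCP becomes the de facto standard for LLM agent systems, it also introduces new safety risks.
In particular, MCP brings third-party services, which are not controlled by the LLM developers, into agent systems.
These third-party MCP service providers are potentially malicious and have economic incentives to exploit vulnerabilities and sabotage user-agent interactions.
In this position paper, we urge the LLM safety research community to pay close attention to the new safety risks introduced by MCP and to develop new techniques for building safe MCP-powered agent systems.
To establish our position, we argue in three parts.
(1) We first construct a controlled framework for examining safety issues in MCP-powered agent systems.
(2) We then conduct a series of pilot experiments showing that the safety risks in MCP-powered agent systems are a real threat and that defending against them is not trivial.
(3) Finally, we give our outlook by presenting a roadmap for building safe MCP-powered agent systems.
In particular, we call on researchers to pursue the following research directions: red teaming, MCP-safe LLM development, MCP safety evaluation, MCP safety data accumulation, MCP service safeguards, and MCP-safe ecosystem construction.
We hope this position paper raises awareness of MCP safety in the research community and encourages more researchers to join this important research direction.
Our code is available at https://github.com/littlelittlenine/SafeMCP.git.

Summary (original)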

The development of large language models (LLMs) has entered an experience-driven era, flagged by the emergence of environment feedback-driven learning via reinforcement learning and tool-using agents. This encourages the emergence of the model context protocol (MCP), which defines the standard on how an LLM should interact with external services, such as APIs and data. However, as MCP becomes the de facto standard for LLM agent systems, it also introduces new safety risks. In particular, MCP introduces third-party services, which are not controlled by the LLM developers, into the agent systems. These third-party MCP service providers are potentially malicious and have economic incentives to exploit vulnerabilities and sabotage user-agent interactions. In this position paper, we urge the research community in LLM safety to pay close attention to the new safety risks introduced by MCP, and to develop new techniques to build safe MCP-powered agent systems. To establish our position, we argue in three key parts. (1) We first construct a controlled framework to examine safety issues in MCP-powered agent systems. (2) We then conduct a series of pilot experiments to demonstrate that the safety risks in MCP-powered agent systems are a real threat and that defending against them is not trivial. (3) Finally, we give our outlook by showing a roadmap to build safe MCP-powered agent systems. In particular, we call for researchers to pursue the following research directions: red teaming, MCP safe LLM development, MCP safety evaluation, MCP safety data accumulation, MCP service safeguard, and MCP safe ecosystem construction. We hope this position paper can raise the awareness of the research community in MCP safety and encourage more researchers to join this important research direction. Our code is available at https://github.com/littlelittlenine/SafeMCP.git.

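arXiv information

Authors	Junfeng Fang, Zijun Yao, Ruipeng Wang, Haokai Ma, Xiang Wang, Tat-Seng Chua
Published	2025-06-16 16:24:31+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google

Categories: cs.AI, cs.LG

Improving Clinical Note Generation from Complex Doctor-Patient Conversation

Posted: June 17, 2025 | Author: jarxiv

Summary

Writing clinical notes and documenting medical exams is a critical task for healthcare professionals and a vital component of patient care documentation.
However, writing these notes by hand is time-consuming and can reduce the time clinicians have for direct patient interaction and other tasks.
As a result, the development of automated clinical note generation systems has emerged as a clinically meaningful research area within AI for health.
In this paper, we make three key contributions to clinical note generation with large language models (LLMs).
First, we introduce CliniKnote, a comprehensive dataset of 1,200 complex doctor-patient conversations paired with their full clinical notes.
Created and curated by medical experts with the help of modern neural networks, this dataset is a valuable resource for training and evaluating models on clinical note generation tasks.
Second, we propose the K-SOAP (Keyword, Subjective, Objective, Assessment, and Plan) note format, which extends traditional SOAP (Subjective, Objective, Assessment, and Plan) notes with a keyword section at the top so that essential information can be identified quickly.
Third, we develop an automatic pipeline that generates K-SOAP notes from doctor-patient conversations, and we benchmark a range of modern LLMs using several metrics.
Our results show significant improvements in efficiency and performance compared with standard LLM fine-tuning methods.

Summary (original)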

Writing clinical notes and documenting medical exams is a critical task for healthcare professionals, serving as a vital component of patient care documentation. However, manually writing these notes is time-consuming and can impact the amount of time clinicians can spend on direct patient interaction and other tasks. Consequently, the development of automated clinical note generation systems has emerged as a clinically meaningful area of research within AI for health. In this paper, we present three key contributions to the field of clinical note generation using large language models (LLMs). First, we introduce CliniKnote, a comprehensive dataset consisting of 1,200 complex doctor-patient conversations paired with their full clinical notes. This dataset, created and curated by medical experts with the help of modern neural networks, provides a valuable resource for training and evaluating models in clinical note generation tasks. Second, we propose the K-SOAP (Keyword, Subjective, Objective, Assessment, and Plan) note format, which enhances traditional SOAP (Subjective, Objective, Assessment, and Plan) notes by adding a keyword section at the top, allowing for quick identification of essential information. Third, we develop an automatic pipeline to generate K-SOAP notes from doctor-patient conversations and benchmark several modern LLMs using a range of metrics. Our results demonstrate significant improvements in efficiency and performance compared to standard LLM fine-tuning methods.
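
The K-SOAP layout described in the abstract (a keyword section placed ahead of the four SOAP sections) maps naturally onto a small record type. The sketch below is one possible representation: the class name, field names, and the `Title: text` rendering are hypothetical, and the example note content is invented purely for illustration, not drawn from the CliniKnote dataset.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class KSoapNote:
    """A clinical note with a keyword section followed by SOAP sections."""
    keywords: List[str]
    subjective: str
    objective: str
    assessment: str
    plan: str

    def render(self) -> str:
        # Keywords come first so the essentials can be read at a glance.
        sections = [
            ("Keywords", ", ".join(self.keywords)),
            ("Subjective", self.subjective),
            ("Objective", self.objective),
            ("Assessment", self.assessment),
            ("Plan", self.plan),
        ]
        return "\n".join(f"{title}: {body}" for title, body in sections)

# Invented example content, for illustration only.
note = KSoapNote(
    keywords=["cough", "fever"],
    subjective="Patient reports three days of dry cough and mild fever.",
    objective="Temperature 38.1 C; lungs clear on auscultation.",
    assessment="Likely viral upper respiratory infection.",
    plan="Rest, fluids, and follow-up if symptoms persist beyond one week.",
)
print(note.render())
```

Keeping the sections as separate fields, rather than one free-text string, would also make it straightforward to score generated notes section by section.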

arXiv information

Authors	Yizhan Li, Sifan Wu, Christopher Smith, Thomas Lo, Bang Liu
Published	2025-06-16 16:24:42+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google

Categories: cs.AI, cs.CL
