jarxiv | Japanese arxiv | Page 141

Neural Tangent Kernel Analysis to Probe Convergence in Physics-informed Neural Solvers: PIKANs vs. PINNs

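Posted: June 10, 2025 | Author: jarxiv

Summary

Physics-informed Kolmogorov-Arnold Networks (PIKANs), and in particular their Chebyshev-based variants (cPIKANs), have recently emerged as promising models for solving partial differential equations (PDEs).
However, their training dynamics and convergence behavior remain largely unexplored, both theoretically and numerically.
In this work, we aim to advance the theoretical understanding of cPIKANs by analyzing them with Neural Tangent Kernel (NTK) theory.
Our objective is to discern how the kernel structure evolves throughout gradient-based training and how that evolution affects learning efficiency.
We first derive the NTK of standard cKANs in a supervised setting, and then extend the analysis to the physics-informed context.
We analyze the spectral properties of the NTK matrices, specifically their eigenvalue distributions and spectral bias, for four representative PDEs.
We also investigate how various optimization strategies (first-order, second-order, and hybrid approaches) affect the evolution of the NTK and the resulting learning dynamics.
The results indicate tractable NTK behavior for cPIKANs, which exposes learning dynamics that standard physics-informed neural networks (PINNs) cannot capture.
Spectral trends also reveal when domain decomposition improves training, directly linking kernel behavior to convergence rates under different setups.
To the best of our knowledge, this is the first systematic NTK study of cPIKANs, providing theoretical insight that clarifies and predicts their empirical performance.

Summary (original)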

Physics-informed Kolmogorov-Arnold Networks (PIKANs), and in particular their Chebyshev-based variants (cPIKANs), have recently emerged as promising models for solving partial differential equations (PDEs). However, their training dynamics and convergence behavior remain largely unexplored both theoretically and numerically. In this work, we aim to advance the theoretical understanding of cPIKANs by analyzing them using Neural Tangent Kernel (NTK) theory. Our objective is to discern the evolution of kernel structure throughout gradient-based training and its subsequent impact on learning efficiency. We first derive the NTK of standard cKANs in a supervised setting, and then extend the analysis to the physics-informed context. We analyze the spectral properties of NTK matrices, specifically their eigenvalue distributions and spectral bias, for four representative PDEs: the steady-state Helmholtz equation, transient diffusion and Allen-Cahn equations, and forced vibrations governed by the Euler-Bernoulli beam equation. We also conduct an investigation into the impact of various optimization strategies, e.g., first-order, second-order, and hybrid approaches, on the evolution of the NTK and the resulting learning dynamics. Results indicate a tractable behavior for NTK in the context of cPIKANs, which exposes learning dynamics that standard physics-informed neural networks (PINNs) cannot capture. Spectral trends also reveal when domain decomposition improves training, directly linking kernel behavior to convergence rates under different setups. To the best of our knowledge, this is the first systematic NTK study of cPIKANs, providing theoretical insight that clarifies and predicts their empirical performance.
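
As a rough illustration of the kind of spectral analysis the abstract describes, the sketch below builds the NTK Gram matrix of a toy single-layer Chebyshev model and inspects its eigenvalues. The model f(x) = sum_k c_k T_k(x), the function names, and the choices of degree and sample points are assumptions made for illustration only; this is not the authors' derivation for cKANs or cPIKANs.

```python
import numpy as np

def chebyshev_features(x, degree):
    """Return the matrix [T_0(x), ..., T_degree(x)] for points x in [-1, 1]."""
    feats = [np.ones_like(x), x]
    for _ in range(2, degree + 1):
        # Three-term recurrence: T_k = 2 x T_{k-1} - T_{k-2}
        feats.append(2.0 * x * feats[-1] - feats[-2])
    return np.stack(feats[: degree + 1], axis=1)

def empirical_ntk(x, degree):
    """NTK Gram matrix K = J J^T for the toy model f(x) = sum_k c_k T_k(x)."""
    # For this toy model, df/dc_k = T_k(x), so the Jacobian is the feature matrix.
    jac = chebyshev_features(x, degree)
    return jac @ jac.T

x = np.linspace(-1.0, 1.0, 32)          # hypothetical sample points
K = empirical_ntk(x, degree=5)
eigvals = np.linalg.eigvalsh(K)[::-1]   # sorted from largest to smallest
print(eigvals[:8])
```

Because this toy model is linear in its coefficients, its kernel stays fixed during training and has at most degree + 1 nonzero eigenvalues. In a deeper network the Jacobian depends on the parameters, so the spectrum would have to be recomputed as training proceeds.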

arXiv information

Authors	Salah A. Faroughi, Farinaz Mostajeran
Published	2025-06-09 17:30:13+00:00
arXiv site	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google

Categories: cs.LG, math-ph, math.AP, math.MP, math.SP

A Two-Phase Deep Learning Framework for Adaptive Time-Stepping in High-Speed Flow Modeling

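Posted: June 10, 2025 | Author: jarxiv

Summary

We consider the problem of modeling high-speed flows using machine learning methods.
Most prior studies focus on low-speed fluid flows, for which uniform time-stepping is practical, whereas flows approaching and exceeding the speed of sound exhibit sudden changes such as shock waves.
In such cases, it is essential to use adaptive time-stepping methods that provide enough temporal resolution to resolve these phenomena while keeping computational cost in check.
Here, we propose ShockCast, a two-phase machine learning method for modeling high-speed flows with adaptive time-stepping.
In the first phase, a machine learning model predicts the timestep size.
In the second phase, the predicted timestep is used as an input, along with the current fluid fields, to advance the system state by that timestep.
We explore several physically motivated components for timestep prediction and introduce timestep conditioning strategies inspired by neural ODEs and Mixture of Experts.
As ShockCast is the first framework for learning high-speed flows, we evaluate our methods by generating two supersonic flow datasets, available at https://huggingface.co/datasets/divelab.
Our code is publicly available as part of the AIRS library (https://github.com/divelab/AIRS).

Summary (original)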

We consider the problem of modeling high-speed flows using machine learning methods. While most prior studies focus on low-speed fluid flows in which uniform time-stepping is practical, flows approaching and exceeding the speed of sound exhibit sudden changes such as shock waves. In such cases, it is essential to use adaptive time-stepping methods to allow a temporal resolution sufficient to resolve these phenomena while simultaneously balancing computational costs. Here, we propose a two-phase machine learning method, known as ShockCast, to model high-speed flows with adaptive time-stepping. In the first phase, we propose to employ a machine learning model to predict the timestep size. In the second phase, the predicted timestep is used as an input along with the current fluid fields to advance the system state by the predicted timestep. We explore several physically-motivated components for timestep prediction and introduce timestep conditioning strategies inspired by neural ODE and Mixture of Experts. As ShockCast is the first framework for learning high-speed flows, we evaluate our methods by generating two supersonic flow datasets, available at https://huggingface.co/datasets/divelab. Our code is publicly available as part of the AIRS library (https://github.com/divelab/AIRS).
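
The two-phase loop described in the abstract can be sketched as follows. Both phases are replaced with hand-written stand-ins: a CFL-style rule instead of a learned timestep predictor, and an upwind step of the inviscid Burgers equation instead of a learned solver. All function names, the grid, and the initial condition are assumptions made for illustration; this is not the ShockCast implementation.

```python
import math

def predict_timestep(state, dx, cfl=0.5):
    """Phase 1 stand-in: pick dt from a CFL-like rule instead of a learned model."""
    max_speed = max(abs(u) for u in state) or 1e-12  # guard against a zero field
    return cfl * dx / max_speed

def advance(state, dt, dx):
    """Phase 2 stand-in: one periodic upwind step of u_t + u u_x = 0 (assumes u >= 0)."""
    n = len(state)
    # state[i - 1] with i = 0 wraps around to the last cell (periodic boundary).
    return [state[i] - dt / dx * state[i] * (state[i] - state[i - 1]) for i in range(n)]

def rollout(state, dx, t_end):
    """Alternate the two phases until the final time is reached."""
    t, steps = 0.0, []
    while t < t_end:
        dt = min(predict_timestep(state, dx), t_end - t)  # do not overshoot t_end
        state = advance(state, dt, dx)
        t += dt
        steps.append(dt)
    return state, steps

n = 50
dx = 1.0 / n
initial = [1.0 + 0.5 * math.sin(2.0 * math.pi * i * dx) for i in range(n)]
final, steps = rollout(initial, dx, t_end=0.2)
print(len(steps), min(steps), max(steps))
```

The step sizes grow or shrink with the largest wave speed in the current state. That is the behavior a learned timestep predictor would be expected to reproduce from data.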

arXiv information

Authors	Jacob Helwig, Sai Sreeharsha Adavi, Xuan Zhang, Yuchao Lin, Felix S. Chim, Luke Takeshi Vizzini, Haiyang Yu, Muhammad Hasnain, Saykat Kumar Biswas, John J. Holloway, Narendra Singh, N. K. Anand, Swagnik Guhathakurta, Shuiwang Ji
Published	2025-06-09 17:44:20+00:00
arXiv site	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google

Categories: cs.LG, physics.flu-dyn

Hyperpruning: Efficient Search through Pruned Variants of Recurrent Neural Networks Leveraging Lyapunov Spectrum

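Posted: June 10, 2025 | Author: jarxiv

Summary

A variety of pruning methods have been introduced for over-parameterized recurrent neural networks to improve efficiency in terms of power consumption and storage utilization.
These advances motivate a new paradigm, termed "hyperpruning", which seeks to identify the most suitable pruning strategy for a given network architecture and application.
Unlike conventional hyperparameter search, where the accuracy of the optimal configuration remains uncertain, in network pruning the accuracy of the dense model sets the target for the accuracy of the pruned one.
The goal, therefore, is to discover pruned variants that match or even surpass this established accuracy.
However, exhaustive search over pruning configurations is computationally expensive and lacks early performance guarantees.
To address this challenge, we propose a novel Lyapunov Spectrum (LS)-based distance metric that enables early comparison between pruned and dense networks, allowing accurate prediction of post-training performance.
By integrating this LS-based distance with standard hyperparameter optimization algorithms, we introduce an efficient hyperpruning framework, termed LS-based Hyperpruning (LSH).
LSH reduces search time by an order of magnitude compared to conventional approaches that rely on full training.
Experiments on stacked LSTM and RHN architectures using the Penn Treebank dataset, and on AWD-LSTM-MoS using WikiText-2, show that under fixed training budgets and target pruning ratios, LSH consistently identifies superior pruned models.
Remarkably, these pruned variants not only outperform those selected by a loss-based baseline but also exceed the performance of their dense counterpart.

Summary (original)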

A variety of pruning methods have been introduced for over-parameterized Recurrent Neural Networks to improve efficiency in terms of power consumption and storage utilization. These advances motivate a new paradigm, termed 'hyperpruning', which seeks to identify the most suitable pruning strategy for a given network architecture and application. Unlike conventional hyperparameter search, where the optimal configuration's accuracy remains uncertain, in the context of network pruning, the accuracy of the dense model sets the target for the accuracy of the pruned one. The goal, therefore, is to discover pruned variants that match or even surpass this established accuracy. However, exhaustive search over pruning configurations is computationally expensive and lacks early performance guarantees. To address this challenge, we propose a novel Lyapunov Spectrum (LS)-based distance metric that enables early comparison between pruned and dense networks, allowing accurate prediction of post-training performance. By integrating this LS-based distance with standard hyperparameter optimization algorithms, we introduce an efficient hyperpruning framework, termed LS-based Hyperpruning (LSH). LSH reduces search time by an order of magnitude compared to conventional approaches relying on full training. Experiments on stacked LSTM and RHN architectures using the Penn Treebank dataset, and on AWD-LSTM-MoS using WikiText-2, demonstrate that under fixed training budgets and target pruning ratios, LSH consistently identifies superior pruned models. Remarkably, these pruned variants not only outperform those selected by a loss-based baseline but also exceed the performance of their dense counterpart.
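
To make the idea of comparing networks by their Lyapunov spectra more concrete, here is a minimal sketch that estimates the exponents of a linear recurrence with the standard repeated-QR procedure and then compares two spectra. The abstract does not specify the distance metric, so the Euclidean distance between sorted spectra used here is a hypothetical stand-in, and the constant diagonal Jacobians are toy inputs rather than Jacobians of a trained RNN.

```python
import numpy as np

def lyapunov_spectrum(jacobians):
    """Estimate Lyapunov exponents from a sequence of Jacobians via repeated QR."""
    dim = jacobians[0].shape[0]
    q = np.eye(dim)
    log_sums = np.zeros(dim)
    for jac in jacobians:
        # Propagate the orthonormal frame, then re-orthonormalize it.
        q, r = np.linalg.qr(jac @ q)
        log_sums += np.log(np.abs(np.diag(r)))
    return log_sums / len(jacobians)

def ls_distance(spectrum_a, spectrum_b):
    """Hypothetical distance: Euclidean norm between two sorted spectra."""
    a = np.sort(spectrum_a)[::-1]
    b = np.sort(spectrum_b)[::-1]
    return float(np.linalg.norm(a - b))

# Toy "dense" and "pruned" systems with constant Jacobians (illustrative only).
dense_jacs = [np.diag([2.0, 0.5])] * 100
pruned_jacs = [np.diag([2.0, 0.25])] * 100

dense_ls = lyapunov_spectrum(dense_jacs)
pruned_ls = lyapunov_spectrum(pruned_jacs)
print(dense_ls, pruned_ls, ls_distance(dense_ls, pruned_ls))
```

For a constant Jacobian the exponents reduce to the logarithms of the absolute eigenvalues, which makes the toy case easy to check by hand. For a recurrent network the Jacobians would instead be evaluated along a hidden-state trajectory.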

arXiv information

Authors	Caleb Zheng, Eli Shlizerman
Published	2025-06-09 17:49:29+00:00
arXiv site	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google

Categories: cs.LG

Realistic Urban Traffic Generator using Decentralized Federated Learning for the SUMO simulator

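Posted: June 10, 2025 | Author: jarxiv

Summary

Realistic urban traffic simulation is essential for sustainable urban planning and the development of intelligent transportation systems.
However, generating high-fidelity, time-varying traffic profiles that accurately reflect real-world conditions, especially in large-scale scenarios, remains a major challenge.
Existing methods often suffer from limited accuracy or scalability, or raise privacy concerns because of centralized data processing.
This work introduces DesRUTGe (Decentralized Realistic Urban Traffic Generator), a novel framework that integrates Deep Reinforcement Learning (DRL) agents with the SUMO simulator to generate realistic 24-hour traffic patterns.
A key innovation of DesRUTGe is its use of Decentralized Federated Learning (DFL), in which each traffic detector and its corresponding urban zone function as an independent learning node.
These nodes train local DRL models using minimal historical data and collaboratively refine their performance by exchanging model parameters with selected peers (e.g., geographically adjacent zones), without requiring a central coordinator.
Evaluated using real-world data from the city of Barcelona, DesRUTGe outperforms standard SUMO-based tools such as RouteSampler, as well as other centralized learning approaches, by delivering more accurate and privacy-preserving traffic pattern generation.

Summary (original)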

Realistic urban traffic simulation is essential for sustainable urban planning and the development of intelligent transportation systems. However, generating high-fidelity, time-varying traffic profiles that accurately reflect real-world conditions, especially in large-scale scenarios, remains a major challenge. Existing methods often suffer from limitations in accuracy, scalability, or raise privacy concerns due to centralized data processing. This work introduces DesRUTGe (Decentralized Realistic Urban Traffic Generator), a novel framework that integrates Deep Reinforcement Learning (DRL) agents with the SUMO simulator to generate realistic 24-hour traffic patterns. A key innovation of DesRUTGe is its use of Decentralized Federated Learning (DFL), wherein each traffic detector and its corresponding urban zone function as an independent learning node. These nodes train local DRL models using minimal historical data and collaboratively refine their performance by exchanging model parameters with selected peers (e.g., geographically adjacent zones), without requiring a central coordinator. Evaluated using real-world data from the city of Barcelona, DesRUTGe outperforms standard SUMO-based tools such as RouteSampler, as well as other centralized learning approaches, by delivering more accurate and privacy-preserving traffic pattern generation.
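
The peer-to-peer parameter exchange mentioned in the abstract can be illustrated with a single averaging round in which each node averages its parameters with those of its selected neighbors and no central server is involved. The zone names, the neighbor graph, the plain unweighted average, and the omission of local DRL training are all assumptions of this sketch, not details taken from DesRUTGe.

```python
def gossip_round(params, neighbors):
    """One decentralized round: each node averages its weights with its chosen peers."""
    new_params = {}
    for node, weights in params.items():
        # Collect this node's weights together with those of its selected peers.
        group = [weights] + [params[peer] for peer in neighbors[node]]
        new_params[node] = [sum(vals) / len(group) for vals in zip(*group)]
    return new_params

# Three hypothetical zones arranged in a line: A - B - C.
params = {"A": [0.0, 3.0], "B": [3.0, 3.0], "C": [6.0, 3.0]}
neighbors = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}

updated = gossip_round(params, neighbors)
print(updated)
```

In a full system each node would run local training between rounds. The averaging step shown here only pulls neighboring models toward each other.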

arXiv information

Authors	Alberto Bazán-Guillén, Carlos Beis-Penedo, Diego Cajaraville-Aboy, Pablo Barbecho-Bautista, Rebeca P. Díaz-Redondo, Luis J. de la Cruz Llopis, Ana Fernández-Vilas, Mónica Aguilar Igartua, Manuel Fernández-Veiga
Published	2025-06-09 17:51:45+00:00
arXiv site	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google

Categories: cs.LG

Training Superior Sparse Autoencoders for Instruct Models

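Posted: June 10, 2025 | Author: jarxiv

Summary

As large language models (LLMs) grow in scale and capability, understanding their internal mechanisms becomes increasingly critical.
Sparse autoencoders (SAEs) have emerged as a key tool in mechanistic interpretability, enabling the extraction of human-interpretable features from LLMs.
However, existing SAE training methods are primarily designed for base models, resulting in reduced reconstruction quality and interpretability when applied to instruct models.
To bridge this gap, we propose $\underline{\textbf{F}}$inetuning-$\underline{\textbf{a}}$ligned $\underline{\textbf{S}}$equential $\underline{\textbf{T}}$raining ($\textit{FAST}$), a novel training method specifically tailored for instruct models.
$\textit{FAST}$ aligns the training process with the data distribution and activation patterns characteristic of instruct models, resulting in substantial improvements in both reconstruction and feature interpretability.
On Qwen2.5-7B-Instruct, $\textit{FAST}$ achieves a mean squared error of 0.6468 in token reconstruction, significantly outperforming baseline methods with errors of 5.1985 and 1.5096.
In feature interpretability, $\textit{FAST}$ yields a higher proportion of high-quality features: for Llama3.2-3B-Instruct, $21.1\%$ scored in the top range, compared to $7.0\%$ and $10.2\%$ for $\textit{BT(P)}$ and $\textit{BT(F)}$.
Surprisingly, we discover that intervening on the activations of special tokens via the SAEs leads to improvements in output quality, suggesting new opportunities for fine-grained control of model behavior.
Code, data, and 240 trained SAEs are available at https://github.com/Geaming2002/FAST.

Summary (original)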

As large language models (LLMs) grow in scale and capability, understanding their internal mechanisms becomes increasingly critical. Sparse autoencoders (SAEs) have emerged as a key tool in mechanistic interpretability, enabling the extraction of human-interpretable features from LLMs. However, existing SAE training methods are primarily designed for base models, resulting in reduced reconstruction quality and interpretability when applied to instruct models. To bridge this gap, we propose $\underline{\textbf{F}}$inetuning-$\underline{\textbf{a}}$ligned $\underline{\textbf{S}}$equential $\underline{\textbf{T}}$raining ($\textit{FAST}$), a novel training method specifically tailored for instruct models. $\textit{FAST}$ aligns the training process with the data distribution and activation patterns characteristic of instruct models, resulting in substantial improvements in both reconstruction and feature interpretability. On Qwen2.5-7B-Instruct, $\textit{FAST}$ achieves a mean squared error of 0.6468 in token reconstruction, significantly outperforming baseline methods with errors of 5.1985 and 1.5096. In feature interpretability, $\textit{FAST}$ yields a higher proportion of high-quality features, for Llama3.2-3B-Instruct, $21.1\%$ scored in the top range, compared to $7.0\%$ and $10.2\%$ for $\textit{BT(P)}$ and $\textit{BT(F)}$. Surprisingly, we discover that intervening on the activations of special tokens via the SAEs leads to improvements in output quality, suggesting new opportunities for fine-grained control of model behavior. Code, data, and 240 trained SAEs are available at https://github.com/Geaming2002/FAST.

arXiv information

Authors	Jiaming Li, Haoran Ye, Yukun Chen, Xinyue Li, Lei Zhang, Hamid Alinejad-Rokny, Jimmy Chih-Hsien Peng, Min Yang
Published	2025-06-09 12:23:34+00:00
arXiv site	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google

Categories: cs.CL, cs.LG

EVADE: Multimodal Benchmark for Evasive Content Detection in E-Commerce Applications

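Posted: June 10, 2025 | Author: jarxiv

Summary

E-commerce platforms increasingly rely on Large Language Models (LLMs) and Vision-Language Models (VLMs) to detect illicit or misleading product content.
However, these models remain vulnerable to evasive content: inputs (text or images) that superficially comply with platform policies while covertly conveying prohibited claims.
Unlike traditional adversarial attacks that induce overt failures, evasive content exploits ambiguity and context, making it far harder to detect.
Existing robustness benchmarks provide little guidance for this demanding, real-world challenge.
We introduce EVADE, the first expert-curated, Chinese, multimodal benchmark specifically designed to evaluate foundation models on evasive content detection in e-commerce.
The dataset contains 2,833 annotated text samples and 13,961 images spanning six demanding product categories, including body shaping, height growth, and health supplements.
Two complementary tasks assess distinct capabilities: Single-Violation probes fine-grained reasoning under short prompts, and All-in-One tests long-context reasoning by merging overlapping policy rules into unified instructions.
Notably, the All-in-One setting significantly narrows the performance gap between partial and full-match accuracy, suggesting that clearer rule definitions improve alignment between human and model judgment.
We benchmark 26 mainstream LLMs and VLMs and observe substantial performance gaps: even state-of-the-art models frequently misclassify evasive samples.
By releasing EVADE and strong baselines, we provide the first rigorous standard for evaluating evasive-content detection, expose fundamental limitations in current multimodal reasoning, and lay the groundwork for safer and more transparent content moderation systems in e-commerce.
The dataset is publicly available at https://huggingface.co/datasets/koenshen/EVADE-Bench.

Summary (original)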

E-commerce platforms increasingly rely on Large Language Models (LLMs) and Vision-Language Models (VLMs) to detect illicit or misleading product content. However, these models remain vulnerable to evasive content: inputs (text or images) that superficially comply with platform policies while covertly conveying prohibited claims. Unlike traditional adversarial attacks that induce overt failures, evasive content exploits ambiguity and context, making it far harder to detect. Existing robustness benchmarks provide little guidance for this demanding, real-world challenge. We introduce EVADE, the first expert-curated, Chinese, multimodal benchmark specifically designed to evaluate foundation models on evasive content detection in e-commerce. The dataset contains 2,833 annotated text samples and 13,961 images spanning six demanding product categories, including body shaping, height growth, and health supplements. Two complementary tasks assess distinct capabilities: Single-Violation, which probes fine-grained reasoning under short prompts, and All-in-One, which tests long-context reasoning by merging overlapping policy rules into unified instructions. Notably, the All-in-One setting significantly narrows the performance gap between partial and full-match accuracy, suggesting that clearer rule definitions improve alignment between human and model judgment. We benchmark 26 mainstream LLMs and VLMs and observe substantial performance gaps: even state-of-the-art models frequently misclassify evasive samples. By releasing EVADE and strong baselines, we provide the first rigorous standard for evaluating evasive-content detection, expose fundamental limitations in current multimodal reasoning, and lay the groundwork for safer and more transparent content moderation systems in e-commerce. The dataset is publicly available at https://huggingface.co/datasets/koenshen/EVADE-Bench.

arXiv information

Authors	Ancheng Xu, Zhihao Yang, Jingpeng Li, Guanghu Yuan, Longze Chen, Liang Yan, Jiehui Zhou, Zhen Qin, Hengyun Chang, Hamid Alinejad-Rokny, Bo Zheng, Min Yang
Published	2025-06-09 12:54:55+00:00
arXiv site	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google

Categories: cs.AI, cs.CL

Through the Valley: Path to Effective Long CoT Training for Small Language Models

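Posted: June 10, 2025 | Author: jarxiv

Summary

Long chain-of-thought (CoT) supervision has become a common strategy to enhance reasoning in language models.
While it is effective for large models, we identify a phenomenon we call Long CoT Degradation, in which small language models (SLMs; <=3B parameters) trained on limited long CoT data experience significant performance deterioration.
Through extensive experiments on the Qwen2.5, LLaMA3, and Gemma3 families, we demonstrate that this degradation is widespread across SLMs.
In some settings, models trained on only 8k long CoT examples lose up to 75% of the performance they had before fine-tuning.
Strikingly, we further observe that for some particularly small models, even training on 220k long CoT examples fails to recover or surpass their performance prior to fine-tuning.
Our analysis attributes this effect to error accumulation: while longer responses increase the capacity for multi-step reasoning, they also amplify the risk of compounding mistakes.
Furthermore, we find that Long CoT Degradation may negatively impact downstream reinforcement learning (RL), although this can be alleviated by sufficiently scaled supervised fine-tuning (SFT).
Our findings challenge common assumptions about the benefits of long CoT training for SLMs and offer practical guidance for building more effective small-scale reasoning models.

Summary (original)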

Long chain-of-thought (CoT) supervision has become a common strategy to enhance reasoning in language models. While effective for large models, we identify a phenomenon we call Long CoT Degradation, in which small language models (SLMs; <=3B parameters) trained on limited long CoT data experience significant performance deterioration. Through extensive experiments on the Qwen2.5, LLaMA3 and Gemma3 families, we demonstrate that this degradation is widespread across SLMs. In some settings, models trained on only 8k long CoT examples lose up to 75% of their original performance before fine-tuning. Strikingly, we further observe that for some particularly small models, even training on 220k long CoT examples fails to recover or surpass their original performance prior to fine-tuning. Our analysis attributes this effect to error accumulation: while longer responses increase the capacity for multi-step reasoning, they also amplify the risk of compounding mistakes. Furthermore, we find that Long CoT Degradation may negatively impact downstream reinforcement learning (RL), although this can be alleviated by sufficiently scaled supervised fine-tuning (SFT). Our findings challenge common assumptions about the benefits of long CoT training for SLMs and offer practical guidance for building more effective small-scale reasoning models.

arXiv information

Authors	Renjie Luo, Jiaxi Li, Chen Huang, Wei Lu
Published	2025-06-09 12:56:41+00:00
arXiv site	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google

Categories: cs.CL

Representation Bending for Large Language Model Safety

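Posted: June 10, 2025 | Author: jarxiv

Summary

Large Language Models (LLMs) have emerged as powerful tools, but their inherent safety risks, ranging from harmful content generation to broader societal harms, pose significant challenges.
These risks can be amplified by recent adversarial attacks, fine-tuning vulnerabilities, and the increasing deployment of LLMs in high-stakes environments.
Existing safety-enhancing techniques, such as fine-tuning with human feedback or adversarial training, remain vulnerable because they address specific threats and often fail to generalize to unseen attacks, or require manual system-level defenses.
This paper introduces RepBend, a novel approach that fundamentally disrupts the representations underlying harmful behaviors in LLMs, offering a scalable solution to enhance (potentially inherent) safety.
RepBend brings the idea of activation steering (simple vector arithmetic for steering a model's behavior during inference) to loss-based fine-tuning.
Through extensive evaluation, RepBend achieves state-of-the-art performance, outperforming prior methods such as Circuit Breaker, RMU, and NPO, with up to a 95% reduction in attack success rates across diverse jailbreak benchmarks, all with negligible reduction in model usability and general capabilities.

Summary (original)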

Large Language Models (LLMs) have emerged as powerful tools, but their inherent safety risks – ranging from harmful content generation to broader societal harms – pose significant challenges. These risks can be amplified by the recent adversarial attacks, fine-tuning vulnerabilities, and the increasing deployment of LLMs in high-stakes environments. Existing safety-enhancing techniques, such as fine-tuning with human feedback or adversarial training, are still vulnerable as they address specific threats and often fail to generalize across unseen attacks, or require manual system-level defenses. This paper introduces RepBend, a novel approach that fundamentally disrupts the representations underlying harmful behaviors in LLMs, offering a scalable solution to enhance (potentially inherent) safety. RepBend brings the idea of activation steering – simple vector arithmetic for steering model’s behavior during inference – to loss-based fine-tuning. Through extensive evaluation, RepBend achieves state-of-the-art performance, outperforming prior methods such as Circuit Breaker, RMU, and NPO, with up to 95% reduction in attack success rates across diverse jailbreak benchmarks, all with negligible reduction in model usability and general capabilities.
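
The "simple vector arithmetic" behind inference-time activation steering, which the abstract names as the idea RepBend carries over into fine-tuning, can be sketched as below. The difference-of-means direction, the two-dimensional toy activations, and the function names are assumptions for illustration; the sketch shows only the inference-time arithmetic, not RepBend's loss or training procedure.

```python
def mean_vector(vectors):
    """Component-wise mean of a list of equal-length activation vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def steering_vector(safe_acts, unsafe_acts):
    """Difference of mean activations, pointing from 'unsafe' toward 'safe'."""
    safe_mean = mean_vector(safe_acts)
    unsafe_mean = mean_vector(unsafe_acts)
    return [s - u for s, u in zip(safe_mean, unsafe_mean)]

def steer(hidden, direction, alpha=1.0):
    """Inference-time steering: h' = h + alpha * v."""
    return [h + alpha * d for h, d in zip(hidden, direction)]

# Toy 2-D activations (hypothetical values, not taken from any model).
safe_acts = [[1.0, 0.0], [3.0, 0.0]]
unsafe_acts = [[0.0, 1.0], [0.0, 3.0]]

v = steering_vector(safe_acts, unsafe_acts)
steered = steer([1.0, 1.0], v, alpha=0.5)
print(v, steered)
```

Here alpha sets how far the hidden state is pushed along the direction. A loss-based variant would train the model so that its representations shift in this way without any edit at inference time.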

arXiv information

Authors	Ashkan Yousefpour, Taeheon Kim, Ryan S. Kwon, Seungbeen Lee, Wonje Jeung, Seungju Han, Alvin Wan, Harrison Ngan, Youngjae Yu, Jonghyun Choi
Published	2025-06-09 12:56:47+00:00
arXiv site	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google

Categories: cs.CL, cs.CR, cs.LG

Multilingual Grammatical Error Annotation: Combining Language-Agnostic Framework with Language-Specific Flexibility

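Posted: June 10, 2025 | Author: jarxiv

Summary

Grammatical Error Correction (GEC) relies on accurate error annotation and evaluation, yet existing frameworks, such as $\texttt{errant}$, face limitations when extended to typologically diverse languages.
In this paper, we introduce a standardized, modular framework for multilingual grammatical error annotation.
Our approach combines a language-agnostic foundation with structured language-specific extensions, enabling both consistency and flexibility across languages.
We reimplement $\texttt{errant}$ using $\texttt{stanza}$ to support broader multilingual coverage, and demonstrate the framework's adaptability through applications to English, German, Czech, Korean, and Chinese, ranging from general-purpose annotation to more customized linguistic refinements.
This work supports scalable and interpretable GEC annotation across languages and promotes more consistent evaluation in multilingual settings.
The complete codebase and annotation tools can be accessed at https://github.com/open-writing-evaluation/jp_errant_bea.

Summary (original)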

Grammatical Error Correction (GEC) relies on accurate error annotation and evaluation, yet existing frameworks, such as $\texttt{errant}$, face limitations when extended to typologically diverse languages. In this paper, we introduce a standardized, modular framework for multilingual grammatical error annotation. Our approach combines a language-agnostic foundation with structured language-specific extensions, enabling both consistency and flexibility across languages. We reimplement $\texttt{errant}$ using $\texttt{stanza}$ to support broader multilingual coverage, and demonstrate the framework’s adaptability through applications to English, German, Czech, Korean, and Chinese, ranging from general-purpose annotation to more customized linguistic refinements. This work supports scalable and interpretable GEC annotation across languages and promotes more consistent evaluation in multilingual settings. The complete codebase and annotation tools can be accessed at https://github.com/open-writing-evaluation/jp_errant_bea.

arXiv information

Authors	Mengyang Qiu, Tran Minh Nguyen, Zihao Huang, Zelong Li, Yang Gu, Qingyu Gao, Siliang Liu, Jungyeul Park
Published	2025-06-09 13:01:19+00:00
arXiv site	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google

Categories: cs.CL

Evaluating Zero-Shot Multilingual Aspect-Based Sentiment Analysis with Large Language Models

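Posted: June 10, 2025 | Author: jarxiv

Summary

Aspect-based sentiment analysis (ABSA), a sequence labeling task, has attracted increasing attention in multilingual contexts.
While previous research has focused largely on fine-tuning or training models specifically for ABSA, we evaluate large language models (LLMs) under zero-shot conditions to explore their potential to tackle this challenge with minimal task-specific adaptation.
We conduct a comprehensive empirical evaluation of a series of LLMs on multilingual ABSA tasks, investigating various prompting strategies, including vanilla zero-shot, chain-of-thought (CoT), self-improvement, self-debate, and self-consistency, across nine different models.
Results indicate that while LLMs show promise in handling multilingual ABSA, they generally fall short of fine-tuned, task-specific models.
Notably, simpler zero-shot prompts often outperform more complex strategies, especially in high-resource languages like English.
These findings underscore the need for further refinement of LLM-based approaches to effectively address ABSA tasks across diverse languages.

Summary (original)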

Aspect-based sentiment analysis (ABSA), a sequence labeling task, has attracted increasing attention in multilingual contexts. While previous research has focused largely on fine-tuning or training models specifically for ABSA, we evaluate large language models (LLMs) under zero-shot conditions to explore their potential to tackle this challenge with minimal task-specific adaptation. We conduct a comprehensive empirical evaluation of a series of LLMs on multilingual ABSA tasks, investigating various prompting strategies, including vanilla zero-shot, chain-of-thought (CoT), self-improvement, self-debate, and self-consistency, across nine different models. Results indicate that while LLMs show promise in handling multilingual ABSA, they generally fall short of fine-tuned, task-specific models. Notably, simpler zero-shot prompts often outperform more complex strategies, especially in high-resource languages like English. These findings underscore the need for further refinement of LLM-based approaches to effectively address ABSA tasks across diverse languages.

arXiv information

Authors	Chengyan Wu, Bolei Ma, Zheyu Zhang, Ningyuan Deng, Yanqing He, Yun Xue
Published	2025-06-09 13:09:25+00:00
arXiv site	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google

Categories: cs.CL
