jarxiv | Japanese arxiv | Page 830

Barren plateaus are amplified by the dimension of qudits

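Posted: April 22, 2025 | Author: jarxiv

Summary

Variational quantum algorithms (VQAs) have emerged as a pivotal strategy for achieving quantum advantage across diverse scientific and technological domains, particularly within quantum neural networks.
Despite their potential, however, VQAs face significant obstacles, chief among them the vanishing-gradient problem commonly known as barren plateaus.
In this article, we demonstrate through careful analysis that the existing literature implicitly suggests an intrinsic influence of qudit dimensionality on barren plateaus.
To substantiate these findings, we present numerical results that illustrate the effect of qudit dimensionality on barren plateaus.
Consequently, although various error mitigation techniques have been proposed, our results call for further scrutiny of their efficacy in the context of VQAs with qudits.

Summary (original)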

Variational Quantum Algorithms (VQAs) have emerged as pivotal strategies for attaining quantum advantage in diverse scientific and technological domains, notably within Quantum Neural Networks. However, despite their potential, VQAs encounter significant obstacles, chief among them being the vanishing gradient problem, commonly referred to as barren plateaus. In this article, through meticulous analysis, we demonstrate that existing literature implicitly suggests the intrinsic influence of qudit dimensionality on barren plateaus. To instantiate these findings, we present numerical results that exemplify the impact of qudit dimensionality on barren plateaus. Therefore, despite the proposition of various error mitigation techniques, our results call for further scrutiny about their efficacy in the context of VQAs with qudits.
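
As a rough illustration of the scaling at issue (a sketch based on the usual definition of a barren plateau, not a formula taken from this paper): a cost function $C(\theta)$ on $n$ qudits of local dimension $d$ is said to exhibit a barren plateau when its gradient components have zero mean and a variance that vanishes exponentially in $n$. The exponent $\kappa$ below is a hypothetical placeholder for whatever power a given variance bound yields; it is an assumption made only for illustration.

```latex
\begin{align*}
\mathbb{E}_{\theta}\!\left[\partial_k C(\theta)\right] &= 0,
&
\operatorname{Var}_{\theta}\!\left[\partial_k C(\theta)\right] &\in O\!\left(b^{-n}\right), \quad b > 1, \\
% Hilbert-space dimension of n qudits with local dimension d
D &= d^{\,n},
&
% assumed: the bound decays as an inverse power of D, with kappa > 0
\operatorname{Var}_{\theta}\!\left[\partial_k C(\theta)\right] &\in O\!\left(D^{-\kappa}\right) = O\!\left(d^{-\kappa n}\right).
\end{align*}
```

Under that assumption the effective base is $b = d^{\kappa}$, so for a fixed number of qudits the variance bound shrinks faster as $d$ grows, which is the qualitative effect the title refers to.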

arXiv information

Authors	Lucas Friedrich, Tiago de Souza Farias, Jonas Maziero
Published	2025-04-21 12:07:03+00:00
arXiv site	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google

Categories: cs.LG, quant-ph | Comments are closed

Exploring Commonalities in Explanation Frameworks: A Multi-Domain Survey Analysis

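Posted: April 22, 2025 | Author: jarxiv

Summary

This study presents insights gathered from surveys and discussions with experts in three domains, aiming to identify the essential elements of a universal explanation framework applicable to these and similar use cases.
The insights are incorporated into a software tool that uses GP algorithms, which are known for their interpretability.
The applications analyzed comprise a medical scenario (involving predictive ML), a retail use case (involving prescriptive ML), and an energy use case (also involving predictive ML).
We interviewed professionals from each sector and transcribed the conversations for further analysis.
In addition, experts and non-experts in these fields completed questionnaires designed to probe various aspects of explanation methods.
The findings indicate a universal preference for sacrificing some accuracy in favor of greater explainability.
We also highlight the importance of feature importance and counterfactual explanations as key components of such a framework.
Our questionnaires are publicly available to facilitate the dissemination of knowledge in the field of XAI.

Summary (original)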

This study presents insights gathered from surveys and discussions with specialists in three domains, aiming to find essential elements for a universal explanation framework that could be applied to these and other similar use cases. The insights are incorporated into a software tool that utilizes GP algorithms, known for their interpretability. The applications analyzed include a medical scenario (involving predictive ML), a retail use case (involving prescriptive ML), and an energy use case (also involving predictive ML). We interviewed professionals from each sector, transcribing their conversations for further analysis. Additionally, experts and non-experts in these fields filled out questionnaires designed to probe various dimensions of explanatory methods. The findings indicate a universal preference for sacrificing a degree of accuracy in favor of greater explainability. Additionally, we highlight the significance of feature importance and counterfactual explanations as critical components of such a framework. Our questionnaires are publicly available to facilitate the dissemination of knowledge in the field of XAI.

arXiv information

Authors	Eduard Barbu, Marharyta Domnich, Raul Vicente, Nikos Sakkas, André Morim
Published	2025-04-21 12:22:55+00:00
arXiv site	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google

Categories: cs.HC, cs.LG | Comments are closed

A direct proof of a unified law of robustness for Bregman divergence losses

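Posted: April 22, 2025 | Author: jarxiv

Summary

In contemporary deep learning practice, models are often trained to near-zero loss, that is, to nearly interpolate the training data.
However, the number of parameters in the model is usually far larger than the number of data points n, the theoretical minimum needed for interpolation: a phenomenon referred to as overparameterization.
In an interesting piece of work, Bubeck and Sellke considered a natural notion of interpolation: a model is said to interpolate when its training loss falls below the loss of the conditional expectation of the response given the covariate.
For this notion of interpolation and for a broad class of covariate distributions (specifically, those satisfying a natural notion of concentration of measure), they showed that overparameterization is necessary for robust interpolation, that is, when the interpolating function is required to be Lipschitz.
Their main proof technique applies to regression with square loss against a scalar response, but they remark that, via a connection to Rademacher complexity and tools such as the Ledoux-Talagrand contraction inequality, the result can be extended to more general losses, at least for scalar response variables.
In this work, we recast the original proof technique of Bubeck and Sellke in terms of a bias-variance type decomposition and show that this view directly unlocks a generalization to Bregman divergence losses (even for vector-valued responses), without tools such as Rademacher complexity or the Ledoux-Talagrand contraction principle.
Bregman divergences are a natural class of losses: for them, the best estimator is the conditional expectation of the response given the covariate, and they include other practical losses such as the cross-entropy loss.
Our work thus gives a more general understanding of the main proof technique of Bubeck and Sellke and demonstrates its broad utility.

Summary (original)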

In contemporary deep learning practice, models are often trained to near zero loss i.e. to nearly interpolate the training data. However, the number of parameters in the model is usually far more than the number of data points n, the theoretical minimum needed for interpolation: a phenomenon referred to as overparameterization. In an interesting piece of work, Bubeck and Sellke considered a natural notion of interpolation: the model is said to interpolate when the model’s training loss goes below the loss of the conditional expectation of the response given the covariate. For this notion of interpolation and for a broad class of covariate distributions (specifically those satisfying a natural notion of concentration of measure), they showed that overparameterization is necessary for robust interpolation i.e. if the interpolating function is required to be Lipschitz. Their main proof technique applies to regression with square loss against a scalar response, but they remark that via a connection to Rademacher complexity and using tools such as the Ledoux-Talagrand contraction inequality, their result can be extended to more general losses, at least in the case of scalar response variables. In this work, we recast the original proof technique of Bubeck and Sellke in terms of a bias-variance type decomposition, and show that this view directly unlocks a generalization to Bregman divergence losses (even for vector-valued responses), without the use of tools such as Rademacher complexity or the Ledoux-Talagrand contraction principle. Bregman divergences are a natural class of losses since for these, the best estimator is the conditional expectation of the response given the covariate, and include other practical losses such as the cross entropy loss. Our work thus gives a more general understanding of the main proof technique of Bubeck and Sellke and demonstrates its broad utility.
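
For readers unfamiliar with the loss class, the standard definition of a Bregman divergence and the textbook identity behind "the best estimator is the conditional expectation" can be written as follows. This is a sketch of well-known facts, not necessarily the exact decomposition used in the paper; the symbols $\phi$, $f$, and $\mu(X) = \mathbb{E}[Y \mid X]$ are notation assumed here.

```latex
\begin{align*}
% Bregman divergence generated by a strictly convex, differentiable \phi
D_{\phi}(y, \hat{y}) &= \phi(y) - \phi(\hat{y}) - \langle \nabla\phi(\hat{y}),\, y - \hat{y} \rangle, \\
% squared loss as a special case
\phi(y) = \lVert y \rVert^{2} \;&\Longrightarrow\; D_{\phi}(y, \hat{y}) = \lVert y - \hat{y} \rVert^{2}, \\
% negative entropy on the probability simplex gives the KL divergence,
% which differs from the cross-entropy only by a term independent of q
\phi(p) = \textstyle\sum_i p_i \log p_i \;&\Longrightarrow\; D_{\phi}(p, q) = \textstyle\sum_i p_i \log \frac{p_i}{q_i}, \\
% bias-variance type identity, with \mu(X) = E[Y | X]
\mathbb{E}\big[D_{\phi}(Y, f(X))\big] &= \mathbb{E}\big[D_{\phi}(Y, \mu(X))\big] + \mathbb{E}\big[D_{\phi}(\mu(X), f(X))\big].
\end{align*}
```

The last line shows why the conditional expectation minimizes the expected loss: the first term does not depend on $f$, and the second is non-negative and vanishes when $f = \mu$.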

arXiv information

Authors	Santanu Das, Jatin Batra, Piyush Srivastava
Published	2025-04-21 12:53:26+00:00
arXiv site	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google

Categories: cs.LG | Comments are closed

Mitigating Degree Bias in Graph Representation Learning with Learnable Structural Augmentation and Structural Self-Attention

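Posted: April 22, 2025 | Author: jarxiv

Summary

Graph neural networks (GNNs) update node representations through message passing, which is primarily based on the homophily principle, the assumption that adjacent nodes share similar features.
However, in real-world graphs with long-tailed degree distributions, high-degree nodes dominate message passing, causing a degree bias in which low-degree nodes remain under-represented because they receive too few messages.
The main challenge in addressing degree bias is how to discover non-adjacent nodes that can provide additional messages to low-degree nodes while reducing excessive messages for high-degree nodes.
Nevertheless, exploiting non-adjacent nodes to provide valuable messages is challenging, because doing so can introduce noisy information and disrupt the original graph structure.
To address this, we propose a novel Degree Fairness Graph Transformer, named DegFairGT, which mitigates degree bias by discovering structural similarities between non-adjacent nodes through learnable structural augmentation and structural self-attention.
Our key idea is to exploit non-adjacent nodes with similar roles in the same community to generate informative edges under our augmentation.
To enable DegFairGT to learn such structural similarities, we propose a structural self-attention that captures the similarities between node pairs.
To preserve global graph structure and prevent the augmentation from disrupting it, we propose a self-supervised learning task that preserves the p-step transition probability and regularizes the graph augmentation.
Extensive experiments on six datasets showed that DegFairGT outperformed state-of-the-art baselines in degree fairness analysis, node classification, and node clustering tasks.

Summary (original)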

Graph Neural Networks (GNNs) update node representations through message passing, which is primarily based on the homophily principle, assuming that adjacent nodes share similar features. However, in real-world graphs with long-tailed degree distributions, high-degree nodes dominate message passing, causing a degree bias where low-degree nodes remain under-represented due to inadequate messages. The main challenge in addressing degree bias is how to discover non-adjacent nodes to provide additional messages to low-degree nodes while reducing excessive messages for high-degree nodes. Nevertheless, exploiting non-adjacent nodes to provide valuable messages is challenging, as it could generate noisy information and disrupt the original graph structures. To solve it, we propose a novel Degree Fairness Graph Transformer, named DegFairGT, to mitigate degree bias by discovering structural similarities between non-adjacent nodes through learnable structural augmentation and structural self-attention. Our key idea is to exploit non-adjacent nodes with similar roles in the same community to generate informative edges under our augmentation, which could provide informative messages between nodes with similar roles while ensuring that the homophily principle is maintained within the community. To enable DegFairGT to learn such structural similarities, we then propose a structural self-attention to capture the similarities between node pairs. To preserve global graph structures and prevent graph augmentation from hindering graph structure, we propose a Self-Supervised Learning task to preserve p-step transition probability and regularize graph augmentation. Extensive experiments on six datasets showed that DegFairGT outperformed state-of-the-art baselines in degree fairness analysis, node classification, and node clustering tasks.
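
The abstract does not define the p-step transition probability, so the following is only a minimal sketch under the common random-walk reading: the one-step matrix is the row-normalised adjacency matrix, and the p-step matrix is its p-th power. The function names and the toy star graph are hypothetical, chosen purely for illustration.

```python
def transition_matrix(adj):
    """Row-normalise an adjacency matrix into one-step random-walk probabilities.

    Assumes every node has at least one edge (no all-zero rows).
    """
    return [[a / sum(row) for a in row] for row in adj]


def matmul(x, y):
    """Plain-Python matrix product of two list-of-lists matrices."""
    return [
        [sum(x[i][k] * y[k][j] for k in range(len(y))) for j in range(len(y[0]))]
        for i in range(len(x))
    ]


def p_step_transition(adj, p):
    """Probability of moving from node i to node j in exactly p steps (p >= 1)."""
    one_step = transition_matrix(adj)
    result = one_step
    for _ in range(p - 1):
        result = matmul(result, one_step)
    return result


# Toy star graph (assumed example): node 0 is a hub, nodes 1-3 are leaves.
star = [
    [0, 1, 1, 1],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
]
two_step = p_step_transition(star, 2)
for row in two_step:
    print([round(v, 3) for v in row])
```

In this toy graph the leaves are never adjacent to one another, yet the two-step matrix links them with non-zero probability, which gives a feel for how multi-step transition probabilities can relate low-degree nodes that play similar roles.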

arXiv information

Authors	Van Thuy Hoang, Hyeon-Ju Jeon, O-Joun Lee
Published	2025-04-21 13:03:40+00:00
arXiv site	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google

Categories: cs.AI, cs.LG | Comments are closed

Think2SQL: Reinforce LLM Reasoning Capabilities for Text2SQL

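Posted: April 22, 2025 | Author: jarxiv

Summary

Large language models (LLMs) have shown impressive capabilities in translating natural language questions about relational databases into SQL queries.
Despite recent improvements, small LLMs struggle to handle questions involving multiple tables and complex SQL patterns in a zero-shot learning (ZSL) setting.
Supervised fine-tuning (SFT) partially compensates for the knowledge deficits of pretrained models but falls short on queries that involve multi-hop reasoning.
To bridge this gap, various LLM training strategies have been proposed to reinforce reasoning capabilities, ranging from leveraging a thinking process within ZSL, to including reasoning traces in SFT, to adopting reinforcement learning (RL) strategies.
However, the influence of reasoning on Text2SQL performance remains largely unexplored.
This paper investigates to what extent LLM reasoning capabilities influence Text2SQL performance on four benchmark datasets.
To this end, it considers the following LLM settings: (1) ZSL, with or without general-purpose reasoning;
(2) SFT, with and without task-specific reasoning traces;
(3) RL, leveraging execution accuracy as the primary reward function;
(4) SFT+RL, that is, a two-stage approach that combines SFT and RL.
The results show that general-purpose reasoning under ZSL is ineffective for tackling complex Text2SQL cases.
Small LLMs benefit from SFT with reasoning much more than larger ones, bridging the gap left by their (weaker) pretraining.
RL is generally beneficial across all tested models and datasets, particularly when SQL queries involve multi-hop reasoning and multiple tables.
Small LLMs with SFT+RL excel on most complex datasets thanks to a strategic balance between the generality of the reasoning process and optimization of execution accuracy.
Thanks to RL, the 7B Qwen-Coder-2.5 model performs on par with 100+ billion-parameter models on the Bird dataset.

Summary (original)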

Large Language Models (LLMs) have shown impressive capabilities in transforming natural language questions about relational databases into SQL queries. Despite recent improvements, small LLMs struggle to handle questions involving multiple tables and complex SQL patterns under a Zero-Shot Learning (ZSL) setting. Supervised Fine-Tuning (SFT) partially compensates for the knowledge deficits in pretrained models but falls short while dealing with queries involving multi-hop reasoning. To bridge this gap, different LLM training strategies to reinforce reasoning capabilities have been proposed, ranging from leveraging a thinking process within ZSL, to including reasoning traces in SFT, to adopting Reinforcement Learning (RL) strategies. However, the influence of reasoning on Text2SQL performance is still largely unexplored. This paper investigates to what extent LLM reasoning capabilities influence their Text2SQL performance on four benchmark datasets. To this end, it considers the following LLM settings: (1) ZSL, including general-purpose reasoning or not; (2) SFT, with and without task-specific reasoning traces; (3) RL, leveraging execution accuracy as primary reward function; (4) SFT+RL, i.e., a two-stage approach that combines SFT and RL. The results show that general-purpose reasoning under ZSL proves to be ineffective in tackling complex Text2SQL cases. Small LLMs benefit from SFT with reasoning much more than larger ones, bridging the gap of their (weaker) model pretraining. RL is generally beneficial across all tested models and datasets, particularly when SQL queries involve multi-hop reasoning and multiple tables. Small LLMs with SFT+RL excel on most complex datasets thanks to a strategic balance between generality of the reasoning process and optimization of the execution accuracy. Thanks to RL, the 7B Qwen-Coder-2.5 model performs on par with 100+ Billion ones on the Bird dataset.
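
To make "execution accuracy as primary reward function" concrete, here is a minimal sketch of one way such a reward could be computed: run the predicted and the reference query against the same database and compare the returned rows. The abstract does not give the paper's reward definition, so the binary 0/1 scoring, the order-insensitive comparison, the function name `execution_reward`, and the toy schema are all assumptions made for illustration. Only the Python standard-library `sqlite3` module is used.

```python
import sqlite3


def execution_reward(conn, predicted_sql, gold_sql):
    """Return 1.0 if both queries yield the same multiset of rows, else 0.0."""
    try:
        predicted_rows = conn.execute(predicted_sql).fetchall()
    except sqlite3.Error:
        return 0.0  # a query that fails to execute earns no reward
    gold_rows = conn.execute(gold_sql).fetchall()
    # Compare order-insensitively; sorting by repr avoids mixed-type comparisons.
    same = sorted(map(repr, predicted_rows)) == sorted(map(repr, gold_rows))
    return 1.0 if same else 0.0


# Toy in-memory database (hypothetical schema, for illustration only).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE papers (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
    INSERT INTO authors VALUES (1, 'Ada'), (2, 'Bob');
    INSERT INTO papers VALUES (1, 1, 'P1'), (2, 1, 'P2'), (3, 2, 'P3');
    """
)

# Reference query: number of papers per author (needs a join across two tables).
gold = (
    "SELECT a.name, COUNT(*) FROM authors a "
    "JOIN papers p ON p.author_id = a.id GROUP BY a.name"
)
# A differently written but equivalent prediction.
good = (
    "SELECT name, (SELECT COUNT(*) FROM papers WHERE author_id = authors.id) "
    "FROM authors"
)
# A prediction that ignores the papers table, and one with a misspelled column.
bad = "SELECT name, COUNT(*) FROM authors"
broken = "SELECT nme FROM authors"

for label, query in [("good", good), ("bad", bad), ("broken", broken)]:
    print(label, execution_reward(conn, query, gold))
```

Scoring on results rather than on query text means syntactically different but equivalent SQL is rewarded equally, which is the usual motivation for execution-based metrics.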

arXiv information

Authors	Simone Papicchio, Simone Rossi, Luca Cagliero, Paolo Papotti
Published	2025-04-21 13:05:26+00:00
arXiv site	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google

Categories: cs.DB, cs.LG | Comments are closed

Application of Sensitivity Analysis Methods for Studying Neural Network Models

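Posted: April 22, 2025 | Author: jarxiv

Summary

This study demonstrates the capabilities of several methods for analyzing the sensitivity of neural networks to perturbations of the input data and for interpreting their underlying mechanisms.
The approaches investigated include Sobol global sensitivity analysis, a local sensitivity method for input pixel perturbations, and the activation maximization technique.
As examples, the study considers a small feedforward neural network for analyzing an open tabular dataset of clinical diabetes data, as well as two classical convolutional architectures, VGG-16 and ResNet-18, which are widely used in image processing and classification.
Global sensitivity analysis allows us to identify the leading input parameters of the small network and to reduce their number without significant loss of accuracy.
Because global sensitivity analysis is not applicable to larger models, we apply local sensitivity analysis and activation maximization to the convolutional neural networks.
These methods reveal interesting patterns in the convolutional models solving the image classification problem.
Finally, we compare the results of the activation maximization method with the popular Grad-CAM technique in the context of ultrasound data analysis.

Summary (original)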

This study demonstrates the capabilities of several methods for analyzing the sensitivity of neural networks to perturbations of the input data and interpreting their underlying mechanisms. The investigated approaches include the Sobol global sensitivity analysis, the local sensitivity method for input pixel perturbations and the activation maximization technique. As examples, in this study we consider a small feedforward neural network for analyzing an open tabular dataset of clinical diabetes data, as well as two classical convolutional architectures, VGG-16 and ResNet-18, which are widely used in image processing and classification. Utilization of the global sensitivity analysis allows us to identify the leading input parameters of the chosen tiny neural network and reduce their number without significant loss of accuracy. Since global sensitivity analysis is not applicable to larger models, we apply the local sensitivity analysis and activation maximization methods to the convolutional neural networks. These methods show interesting patterns for the convolutional models solving the image classification problem. All in all, we compare the results of the activation maximization method with the popular Grad-CAM technique in the context of ultrasound data analysis.
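
For reference, the standard variance-based Sobol indices for a model output $Y = f(X_1, \dots, X_k)$ with independent inputs are given below. This is the textbook definition, stated here as background; the abstract does not say which estimator or index order the authors use, and $X_{\sim i}$ (all inputs except $X_i$) is notation assumed here.

```latex
\begin{align*}
% first-order index: share of output variance explained by X_i alone
S_i &= \frac{\operatorname{Var}_{X_i}\!\big(\mathbb{E}[\,Y \mid X_i\,]\big)}{\operatorname{Var}(Y)}, \\
% total-order index: share involving X_i, including all interactions
S_{T_i} &= \frac{\mathbb{E}_{X_{\sim i}}\!\big[\operatorname{Var}(Y \mid X_{\sim i})\big]}{\operatorname{Var}(Y)}
\end{align*}
```

An input whose total-order index is close to zero contributes little to the output variance, alone or through interactions, which is what makes it a candidate for removal when reducing the number of input parameters.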

arXiv information

Authors	Jiaxuan Miao, Sergey Matveev
Published	2025-04-21 13:41:20+00:00
arXiv site	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google

Categories: 68T07, cs.LG, cs.NA, F.2.1, math.NA | Comments are closed

Kolmogorov-Arnold Networks: Approximation and Learning Guarantees for Functions and their Derivatives

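Posted: April 22, 2025 | Author: jarxiv

Summary

Inspired by the Kolmogorov-Arnold superposition theorem, Kolmogorov-Arnold Networks (KANs) have recently emerged as an improved backbone for most deep learning frameworks, promising more adaptivity than their multilayer perceptron (MLP) predecessors by allowing trainable spline-based activation functions.
In this paper, we probe the theoretical foundations of the KAN architecture by showing that it can optimally approximate any Besov function in $B^{s}_{p,q}(\mathcal{X})$ on a bounded open, or even fractal, domain $\mathcal{X}$ in $\mathbb{R}^d$ at the optimal approximation rate with respect to any weaker Besov norm $B^{\alpha}_{p,q}(\mathcal{X})$, where $\alpha < s$.

Summary (original)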

Inspired by the Kolmogorov-Arnold superposition theorem, Kolmogorov-Arnold Networks (KANs) have recently emerged as an improved backbone for most deep learning frameworks, promising more adaptivity than their multilayer perceptron (MLP) predecessor by allowing for trainable spline-based activation functions. In this paper, we probe the theoretical foundations of the KAN architecture by showing that it can optimally approximate any Besov function in $B^{s}_{p,q}(\mathcal{X})$ on a bounded open, or even fractal, domain $\mathcal{X}$ in $\mathbb{R}^d$ at the optimal approximation rate with respect to any weaker Besov norm $B^{\alpha}_{p,q}(\mathcal{X})$; where $\alpha < s$. We complement our approximation guarantee with a dimension-free estimate on the sample complexity of a residual KAN model when learning a function of Besov regularity from $N$ i.i.d. noiseless samples. Our KAN architecture incorporates contemporary deep learning wisdom by leveraging residual/skip connections between layers.

arXiv information

Authors	Anastasis Kratsios, Takashi Furuya
Published	2025-04-21 14:02:59+00:00
arXiv site	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google

Categories: cs.LG, cs.NA, cs.NE, math.FA, math.NA, stat.ML | Comments are closed

Advanced posterior analyses of hidden Markov models: finite Markov chain imbedding and hybrid decoding

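Posted: April 22, 2025 | Author: jarxiv

Summary

Two major tasks in applications of hidden Markov models (HMMs) are to (i) compute distributions of summary statistics of the hidden state sequence and (ii) decode the hidden state sequence.
We describe finite Markov chain imbedding (FMCI) and hybrid decoding to solve these two tasks, respectively.
In the first part of the paper, we use FMCI to compute posterior distributions of summary statistics such as the number of visits to a hidden state, the total time spent in a hidden state, the dwell time in a hidden state, and the longest run length.
We use simulations from the hidden state sequence, conditional on the observed sequence, to establish the FMCI framework.
In the second part of the paper, we apply hybrid segmentation to improve the decoding of an HMM.
We demonstrate that hybrid decoding shows increased performance compared to Viterbi or posterior decoding (often also referred to as global or local decoding), and we introduce a novel procedure for choosing the tuning parameter in the hybrid procedure.
Furthermore, we provide an alternative derivation of the hybrid loss function based on weighted geometric means.
We demonstrate and apply FMCI and hybrid decoding on various classical data sets and supply accompanying code for reproducibility.

Summary (original)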

Two major tasks in applications of hidden Markov models are to (i) compute distributions of summary statistics of the hidden state sequence, and (ii) decode the hidden state sequence. We describe finite Markov chain imbedding (FMCI) and hybrid decoding to solve each of these two tasks. In the first part of our paper we use FMCI to compute posterior distributions of summary statistics such as the number of visits to a hidden state, the total time spent in a hidden state, the dwell time in a hidden state, and the longest run length. We use simulations from the hidden state sequence, conditional on the observed sequence, to establish the FMCI framework. In the second part of our paper we apply hybrid segmentation for improved decoding of a HMM. We demonstrate that hybrid decoding shows increased performance compared to Viterbi or Posterior decoding (often also referred to as global or local decoding), and we introduce a novel procedure for choosing the tuning parameter in the hybrid procedure. Furthermore, we provide an alternative derivation of the hybrid loss function based on weighted geometric means. We demonstrate and apply FMCI and hybrid decoding on various classical data sets, and supply accompanying code for reproducibility.
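
The two baselines that hybrid decoding is compared against can be sketched in a few lines. The code below is a minimal pure-Python illustration of Viterbi (global) and posterior (local) decoding on a toy two-state HMM; it is not the authors' accompanying code, it does not implement the hybrid procedure itself, and the model parameters are an assumed textbook-style example. Probabilities are multiplied without rescaling, which is only safe for short sequences.

```python
def viterbi(init, trans, emit, obs):
    """Global decoding: the single most probable hidden state path."""
    n = len(init)
    score = [init[s] * emit[s][obs[0]] for s in range(n)]
    back = []  # back[t][s] = best predecessor of state s at step t + 1
    for o in obs[1:]:
        prev = score
        best = [max(range(n), key=lambda r: prev[r] * trans[r][s]) for s in range(n)]
        score = [prev[best[s]] * trans[best[s]][s] * emit[s][o] for s in range(n)]
        back.append(best)
    state = max(range(n), key=lambda s: score[s])
    path = [state]
    for best in reversed(back):
        state = best[state]
        path.append(state)
    return path[::-1]


def posterior_decode(init, trans, emit, obs):
    """Local decoding: the most probable state at each position separately."""
    n, length = len(init), len(obs)
    # Forward pass: fwd[t][s] = P(obs[0..t], state_t = s)
    fwd = [[init[s] * emit[s][obs[0]] for s in range(n)]]
    for t in range(1, length):
        fwd.append([
            emit[s][obs[t]] * sum(fwd[-1][r] * trans[r][s] for r in range(n))
            for s in range(n)
        ])
    # Backward pass: bwd[t][s] = P(obs[t+1..] | state_t = s)
    bwd = [[1.0] * n for _ in range(length)]
    for t in range(length - 2, -1, -1):
        bwd[t] = [
            sum(trans[s][r] * emit[r][obs[t + 1]] * bwd[t + 1][r] for r in range(n))
            for s in range(n)
        ]
    # The posterior marginal at t is proportional to fwd[t][s] * bwd[t][s].
    return [max(range(n), key=lambda s: fwd[t][s] * bwd[t][s]) for t in range(length)]


# Toy model (assumed values): 2 hidden states, 3 observation symbols.
init = [0.6, 0.4]
trans = [[0.7, 0.3], [0.4, 0.6]]
emit = [[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]]
obs = [0, 1, 2]

print("Viterbi path:  ", viterbi(init, trans, emit, obs))
print("Posterior path:", posterior_decode(init, trans, emit, obs))
```

Viterbi optimises the whole path jointly, whereas posterior decoding optimises each position on its own and can in general produce a path that is unlikely or even impossible as a whole; the hybrid approach described in the abstract trades off between these two criteria through a tuning parameter.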

arXiv information

Authors	Zenia Elise Damgaard Bæk, Moisès Coll Macià, Laurits Skov, Asger Hobolth
Published	2025-04-21 14:58:35+00:00
arXiv site	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google

Categories: cs.LG, stat.ML | Comments are closed

Survey of Loss Augmented Knowledge Tracing

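Posted: April 22, 2025 | Author: jarxiv

Summary

The training of artificial neural networks depends heavily on the careful selection of an appropriate loss function.
While commonly used loss functions, such as cross-entropy and mean squared error (MSE), generally suffice for a broad range of tasks, challenges often arise from limitations in data quality or inefficiencies in the learning process.
In such circumstances, integrating supplementary terms into the loss function can help address these challenges, improving both model performance and robustness.
Two prominent techniques, loss regularization and contrastive learning, have been identified as effective strategies for augmenting the capacity of loss functions in artificial neural networks.
Knowledge tracing is a compelling area of research that leverages predictive artificial intelligence to help automate personalized and efficient educational experiences for students.
In this paper, we provide a comprehensive review of deep learning-based knowledge tracing (DKT) algorithms trained with advanced loss functions and discuss their improvements over prior techniques.
We discuss contrastive knowledge tracing algorithms, such as Bi-CLKT, CL4KT, SP-CLKT, CoSKT, and prediction-consistent DKT, providing performance benchmarks and insights into real-world deployment challenges.
The survey concludes with future research directions, including hybrid loss strategies and context-aware modeling.

Summary (original)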

The training of artificial neural networks is heavily dependent on the careful selection of an appropriate loss function. While commonly used loss functions, such as cross-entropy and mean squared error (MSE), generally suffice for a broad range of tasks, challenges often emerge due to limitations in data quality or inefficiencies within the learning process. In such circumstances, the integration of supplementary terms into the loss function can serve to address these challenges, enhancing both model performance and robustness. Two prominent techniques, loss regularization and contrastive learning, have been identified as effective strategies for augmenting the capacity of loss functions in artificial neural networks. Knowledge tracing is a compelling area of research that leverages predictive artificial intelligence to facilitate the automation of personalized and efficient educational experiences for students. In this paper, we provide a comprehensive review of the deep learning-based knowledge tracing (DKT) algorithms trained using advanced loss functions and discuss their improvements over prior techniques. We discuss contrastive knowledge tracing algorithms, such as Bi-CLKT, CL4KT, SP-CLKT, CoSKT, and prediction-consistent DKT, providing performance benchmarks and insights into real-world deployment challenges. The survey concludes with future research directions, including hybrid loss strategies and context-aware modeling.

arXiv information

Authors	Altun Shukurlu
Published	2025-04-21 15:09:40+00:00
arXiv site	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google

Categories: cs.LG | Comments are closed

Audio-Visual Class-Incremental Learning for Fish Feeding intensity Assessment in Aquaculture

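Posted: April 22, 2025 | Author: jarxiv

Summary

Fish feeding intensity assessment (FFIA) is crucial in industrial aquaculture management.
Recent multi-modal approaches have shown promise in improving the robustness and efficiency of FFIA.
However, these methods face significant challenges when adapting to new fish species or environments because of catastrophic forgetting and the lack of suitable datasets.
To address these limitations, we first introduce AV-CIL-FFIA, a new dataset comprising 81,932 labelled audio-visual clips that capture feeding intensities across six different fish species in real aquaculture environments.
We then pioneer audio-visual class-incremental learning (CIL) for FFIA and demonstrate through benchmarking on AV-CIL-FFIA that it significantly outperforms single-modality methods.
Existing CIL methods rely heavily on historical data.
Exemplar-based approaches store raw samples, creating storage challenges, while exemplar-free methods avoid data storage but struggle to distinguish subtle variations in feeding intensity across different fish species.
To overcome these limitations, we introduce HAIL-FFIA, a novel audio-visual class-incremental learning framework that bridges this gap with a prototype-based approach, achieving exemplar-free efficiency while preserving essential knowledge through compact feature representations.
Specifically, HAIL-FFIA employs hierarchical representation learning with a dual-path knowledge preservation mechanism that separates general intensity knowledge from fish-specific characteristics.
In addition, it features a dynamic modality balancing system that adaptively adjusts the importance of audio versus visual information based on feeding behaviour stages.
Experimental results show that HAIL-FFIA is superior to SOTA methods on AV-CIL-FFIA, achieving higher accuracy with lower storage needs while effectively mitigating catastrophic forgetting in incremental fish species learning.

Summary (original)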

Fish Feeding Intensity Assessment (FFIA) is crucial in industrial aquaculture management. Recent multi-modal approaches have shown promise in improving FFIA robustness and efficiency. However, these methods face significant challenges when adapting to new fish species or environments due to catastrophic forgetting and the lack of suitable datasets. To address these limitations, we first introduce AV-CIL-FFIA, a new dataset comprising 81,932 labelled audio-visual clips capturing feeding intensities across six different fish species in real aquaculture environments. Then, we pioneer audio-visual class incremental learning (CIL) for FFIA and demonstrate through benchmarking on AV-CIL-FFIA that it significantly outperforms single-modality methods. Existing CIL methods rely heavily on historical data. Exemplar-based approaches store raw samples, creating storage challenges, while exemplar-free methods avoid data storage but struggle to distinguish subtle feeding intensity variations across different fish species. To overcome these limitations, we introduce HAIL-FFIA, a novel audio-visual class-incremental learning framework that bridges this gap with a prototype-based approach that achieves exemplar-free efficiency while preserving essential knowledge through compact feature representations. Specifically, HAIL-FFIA employs hierarchical representation learning with a dual-path knowledge preservation mechanism that separates general intensity knowledge from fish-specific characteristics. Additionally, it features a dynamic modality balancing system that adaptively adjusts the importance of audio versus visual information based on feeding behaviour stages. Experimental results show that HAIL-FFIA is superior to SOTA methods on AV-CIL-FFIA, achieving higher accuracy with lower storage needs while effectively mitigating catastrophic forgetting in incremental fish species learning.

arXiv information

Authors	Meng Cui, Xianghu Yue, Xinyuan Qian, Jinzheng Zhao, Haohe Liu, Xubo Liu, Daoliang Li, Wenwu Wang
Published	2025-04-21 15:24:34+00:00
arXiv site	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google

Categories: cs.LG | Comments are closed
