Archive of posts by "jarxiv"

What’s the Difference? Supporting Users in Identifying the Effects of Prompt and Model Changes Through Token Patterns

Posted: April 23, 2025 | Author: jarxiv

Abstract: Prompt engineering for large language models is challenging. Small prompt perturbations or …

Categories: cs.CL, cs.HC, cs.LG

Fine-tuning Whisper on Low-Resource Languages for Real-World Applications

Posted: April 23, 2025 | Author: jarxiv

Abstract: In this paper, using Swiss German as a case study, sentence-level …

Categories: cs.CL, eess.AS

Pre-DPO: Improving Data Utilization in Direct Preference Optimization Using a Guiding Reference Model

Posted: April 23, 2025 | Author: jarxiv

Abstract: Direct Preference Optimization (DPO) optimizes human preferences without an explicit reward model …

Categories: cs.CL

Exploring Cognitive and Aesthetic Causality for Multimodal Aspect-Based Sentiment Analysis

Posted: April 23, 2025 | Author: jarxiv

Abstract: Multimodal aspect-based sentiment classification (MASC), for specific aspect …

Categories: cs.CL

Aggregating Soft Labels from Crowd Annotations Improves Uncertainty Estimation Under Distribution Shift

Posted: April 23, 2025 | Author: jarxiv

Abstract: Selecting an effective training signal for machine learning tasks is difficult. Expert …

Categories: cs.CL

SARI: Structured Audio Reasoning via Curriculum-Guided Reinforcement Learning

Posted: April 23, 2025 | Author: jarxiv

Abstract: According to recent work, reinforcement learning (RL) …

Categories: cs.CL

On the Low-Rank Parametrization of Reward Models for Controlled Language Generation

Posted: April 23, 2025 | Author: jarxiv

Abstract: Language models trained on large amounts of data may in some cases generate inappropriate content …

Categories: cs.CL

Open-World Evaluation for Retrieving Diverse Perspectives

Posted: April 23, 2025 | Author: jarxiv

Abstract: A set of documents covering diverse perspectives on complex and contentious questions …

Categories: cs.CL, cs.IR

Optimizing RLHF Training for Large Language Models with Stage Fusion

Posted: April 23, 2025 | Author: jarxiv

Abstract: With stage fusion for reinforcement learning from human feedback (RLHF), an efficient …

Categories: cs.CL, cs.DC, cs.LG

SWITCH: Studying with Teacher for Knowledge Distillation of Large Language Models

Posted: April 23, 2025 | Author: jarxiv

Abstract: Despite the success of large language models (LLMs), they still face high inference …

Categories: cs.CL
