"cs.CL" Category Archive

Minerva: A Programmable Memory Test Benchmark for Language Models

Posted: June 10, 2025 | Author: jarxiv

Abstract: For LLM-based AI assistants, how effectively memory (context) … Continue reading →

Categories: cs.CL | Comments closed

WeQA: A Benchmark for Retrieval Augmented Generation in Wind Energy Domain

Posted: June 10, 2025 | Author: jarxiv

Abstract: In wind energy project assessment, decision-makers … Continue reading →

Categories: cs.CL | Comments closed

ConECT Dataset: Overcoming Data Scarcity in Context-Aware E-Commerce MT

Posted: June 10, 2025 | Author: jarxiv

Abstract: Neural machine translation (NMT), through the use of Transformer-based models, translation … Continue reading →

Categories: cs.CL | Comments closed

WebUIBench: A Comprehensive Benchmark for Evaluating Multimodal Large Language Models in WebUI-to-Code

Posted: June 10, 2025 | Author: jarxiv

Abstract: With the rapid advancement of generative AI technology, multimodal large language model … Continue reading →

Categories: cs.CL | Comments closed

Improving large language models with concept-aware fine-tuning

Posted: June 10, 2025 | Author: jarxiv

Abstract: Large language models (LLMs) have become the foundation of modern AI. However, next … Continue reading →

Categories: cs.AI, cs.CL, cs.LG | Comments closed

Beyond Numeric Rewards: In-Context Dueling Bandits with LLM Agents

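Posted: June 10, 2025 | Author: jarxiv

Abstract: In-context reinforcement learning (ICRL) is, for reinforcement learning (RL) in the era of foundation models, … Continue reading →

Categories: cs.AI, cs.CL, cs.LG | Comments closed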

Learning to Focus: Causal Attention Distillation via Gradient-Guided Token Pruning

Posted: June 10, 2025 | Author: jarxiv

Abstract: Large language models (LLMs) have demonstrated significant improvements in contextual understanding … Continue reading →

Categories: cs.CL | Comments closed

Introspective Growth: Automatically Advancing LLM Expertise in Technology Judgment

Posted: June 10, 2025 | Author: jarxiv

Abstract: Large language models (LLMs) increasingly show signs of conceptual understanding, but … Continue reading →

Categories: cs.CL, cs.CY, cs.DL, cs.IR | Comments closed

MEMOIR: Lifelong Model Editing with Minimal Overwrite and Informed Retention for LLMs

Posted: June 10, 2025 | Author: jarxiv

Abstract: For language models deployed in real-world systems, often new or corrected … Continue reading →

Categories: cs.CL, cs.LG | Comments closed

Dynamic-SUPERB Phase-2: A Collaboratively Expanding Benchmark for Measuring the Capabilities of Spoken Language Models with 180 Tasks

Posted: June 10, 2025 | Author: jarxiv

Abstract: Multimodal foundation models such as Gemini and ChatGPT … Continue reading →

Categories: cs.CL, eess.AS | Comments closed
