"cs.CL" Category Archive

Transformers Learn Low Sensitivity Functions: Investigations and Implications

Posted: February 14, 2025 | Author: jarxiv

Abstract: Transformers achieve state-of-the-art accuracy and robustness across many tasks, but …

Categories: cs.AI, cs.CL, cs.LG, stat.ML

Theoretical Benefit and Limitation of Diffusion Language Model

Posted: February 14, 2025 | Author: jarxiv

Abstract: Diffusion language models have emerged as a promising approach to text generation. …

Categories: cs.AI, cs.CL, cs.LG, stat.ML

Pixel-Level Reasoning Segmentation via Multi-turn Conversations

Posted: February 14, 2025 | Author: jarxiv

Abstract: Existing visual perception systems rely on complex, explicit query instructions, single-turn …

Categories: cs.CL, cs.CV

EmbodiedBench: Comprehensive Benchmarking Multi-modal Large Language Models for Vision-Driven Embodied Agents

Posted: February 14, 2025 | Author: jarxiv

Abstract: To create embodied agents, multimodal large language models (M …

Categories: cs.AI, cs.CL, cs.CV

Optimizing GPT for Video Understanding: Zero-Shot Performance and Prompt Engineering

Posted: February 14, 2025 | Author: jarxiv

Abstract: In this study, zero-shot classification across seven key categories of video quality …

Categories: cs.CL, cs.CV, cs.LG

Exploring the Potential of Encoder-free Architectures in 3D LMMs

Posted: February 14, 2025 | Author: jarxiv

Abstract: Encoder-free architectures have been preliminarily explored in the 2D visual domain …

Categories: cs.AI, cs.CL, cs.CV

MME-CoT: Benchmarking Chain-of-Thought in Large Multimodal Models for Reasoning Quality, Robustness, and Efficiency

Posted: February 14, 2025 | Author: jarxiv

Abstract: By answering questions with Chain-of-Thought (CoT), large language models (L …

Categories: cs.AI, cs.CL, cs.CV

Salamandra Technical Report

Posted: February 14, 2025 | Author: jarxiv

Abstract: In this work, open-source decoder-only large language models in three different sizes …

Categories: cs.CL

Faithful, Unfaithful or Ambiguous? Multi-Agent Debate with Initial Stance for Summary Evaluation

Posted: February 14, 2025 | Author: jarxiv

Abstract: For faithfulness evaluators based on large language models (LLMs), the fluency of the text …

Categories: cs.CL

Better Embeddings with Coupled Adam

Posted: February 14, 2025 | Author: jarxiv

Abstract: Despite their remarkable capabilities, LLMs … anisotropy, undesirable yet …

Categories: cs.AI, cs.CL, cs.LG
