"cs.CL" Category Archive

Behind the Magic, MERLIM: Multi-modal Evaluation Benchmark for Large Image-Language Models

Posted: June 13, 2024 | Author: jarxiv

Abstract: With large vision and language models, fully supervised and zero-shot visual …

Categories: cs.CL, cs.CV

MMWorld: Towards Multi-discipline Multi-faceted World Model Evaluation in Videos

Posted: June 13, 2024 | Author: jarxiv

Abstract: For multimodal language models (MLLMs), "world models," that is, …

Categories: cs.AI, cs.CL, cs.CV

What If We Recaption Billions of Web Images with LLaMA-3?

Posted: June 13, 2024 | Author: jarxiv

Abstract: Web-crawled image-text pairs are inherently noisy. …

Categories: cs.CL, cs.CV

Words Worth a Thousand Pictures: Measuring and Understanding Perceptual Variability in Text-to-Image Generation

Posted: June 13, 2024 | Author: jarxiv

Abstract: Diffusion models are the state of the art in text-to-image generation, but their perceptual …

Categories: cs.CL, cs.CV

CounterCurate: Enhancing Physical and Semantic Visio-Linguistic Compositional Reasoning via Counterfactual Examples

Posted: June 13, 2024 | Author: jarxiv

Abstract: For the visio-linguistic compositional reasoning of both contrastive and generative multimodal models, we …

Categories: cs.AI, cs.CL, cs.CV, cs.LG

3D-GRAND: A Million-Scale Dataset for 3D-LLMs with Better Grounding and Less Hallucination

Posted: June 13, 2024 | Author: jarxiv

Abstract: The integration of language and 3D perception, for understanding and interacting with the physical world as an embodied …

Categories: cs.AI, cs.CL, cs.CV, cs.LG, cs.RO

AI Sandbagging: Language Models can Strategically Underperform on Evaluations

Posted: June 13, 2024 | Author: jarxiv

Abstract: Trustworthy capability evaluations are essential for ensuring the safety of AI systems, and …

Categories: cs.AI, cs.CL, cs.CY, cs.LG

CADS: A Systematic Literature Review on the Challenges of Abstractive Dialogue Summarization

Posted: June 13, 2024 | Author: jarxiv

Abstract: Abstractive dialogue summarization is the task of distilling conversations into informative and concise summaries. This …

Categories: cs.AI, cs.CL

DR-RAG: Applying Dynamic Document Relevance to Retrieval-Augmented Generation for Question-Answering

Posted: June 13, 2024 | Author: jarxiv

Abstract: Retrieval-augmented generation (RAG), for knowledge-intensive tasks such as question answering (QA), …

Categories: cs.CL, cs.LG

EgoPlan-Bench: Benchmarking Multimodal Large Language Models for Human-Level Planning

Posted: June 12, 2024 | Author: jarxiv

Abstract: The pursuit of artificial general intelligence (AGI), with strong reasoning, generalization capabilities, and …

Categories: cs.CL, cs.CV, cs.RO
