"cs.CL" Category Archive

Words in Motion: Representation Engineering for Motion Forecasting

Posted: June 18, 2024 | Author: jarxiv

Summary: Motion forecasting transforms sequences of past motion and environmental context into future motion …

Categories: cs.CL, cs.CV, cs.LG

Evaluating Task-based Effectiveness of MLLMs on Charts

Posted: June 18, 2024 | Author: jarxiv

Summary: In this paper, GPT-4V on low-level data analysis tasks on charts …

Categories: cs.AI, cs.CL, cs.CV

See It from My Perspective: Diagnosing the Western Cultural Bias of Large Vision-Language Models in Image Understanding

Posted: June 18, 2024 | Author: jarxiv

Summary: Vision-language models (VLMs) respond to queries about images in many languages …

Categories: cs.AI, cs.CL, cs.CV

VideoLLaMA 2: Advancing Spatial-Temporal Modeling and Audio Understanding in Video-LLMs

Posted: June 18, 2024 | Author: jarxiv

Summary: In this paper, spatial-temporal modeling in video- and audio-oriented tasks …

Categories: cs.CL, cs.CV

MLLM-Protector: Ensuring MLLM’s Safety without Hurting Performance

Posted: June 18, 2024 | Author: jarxiv

Summary: With the deployment of multimodal large language models (MLLMs), visual inputs …

Categories: cs.CL, cs.CR, cs.CV

Ovis: Structural Embedding Alignment for Multimodal Large Language Model

Posted: June 18, 2024 | Author: jarxiv

Summary: Current multimodal large language models (MLLMs) typically, with an MLP or similar …

Categories: cs.AI, cs.CL, cs.CV, cs.LG

On Efficient Language and Vision Assistants for Visually-Situated Natural Language Understanding: What Matters in Reading and Reasoning

Posted: June 18, 2024 | Author: jarxiv

Summary: Recent advances in language and vision assistants have shown impressive capabilities, but …

Categories: cs.CL, cs.CV

mDPO: Conditional Preference Optimization for Multimodal Large Language Models

Posted: June 18, 2024 | Author: jarxiv

Summary: Direct preference optimization (DPO), for aligning large language models (LLMs), effective …

Categories: cs.AI, cs.CL, cs.CV, cs.LG

Sycophancy to Subterfuge: Investigating Reward-Tampering in Large Language Models

Posted: June 18, 2024 | Author: jarxiv

Summary: In reinforcement learning, AI systems, because of misspecified training objectives, large rewards …

Categories: cs.AI, cs.CL

A Survey on RAG Meeting LLMs: Towards Retrieval-Augmented Large Language Models

Posted: June 18, 2024 | Author: jarxiv

Summary: Retrieval-augmented generation (RAG), one of the most advanced techniques in AI, reliability …

Categories: cs.AI, cs.CL, cs.IR
