Category archive: cs.CL

Do Large Multimodal Models Solve Caption Generation for Scientific Figures? Lessons Learned from SCICAP Challenge 2023

Posted: February 18, 2025 | Author: jarxiv

Summary: Since the SCICAP dataset was launched in 2021, the research community has …

Categories: cs.AI, cs.CL, cs.CV

Understanding Figurative Meaning through Explainable Visual Entailment

Posted: February 18, 2025 | Author: jarxiv

Summary: In tasks such as visual question answering and visual entailment, large vision-language models (VLMs) …

Categories: cs.AI, cs.CL, cs.CV

Unhackable Temporal Rewarding for Scalable Video MLLMs

Posted: February 18, 2025 | Author: jarxiv

Summary: In pursuit of superior video-processing MLLMs, we have encountered a perplexing paradox …

Categories: cs.CL, cs.CV

CLEAR: Character Unlearning in Textual and Visual Modalities

Posted: February 18, 2025 | Author: jarxiv

Summary: In Machine Unlearning (MU), private … from deep learning models …

Categories: cs.CL, cs.CV

Both Text and Images Leaked! A Systematic Analysis of Multimodal LLM Data Contamination

Posted: February 18, 2025 | Author: jarxiv

Summary: With the rapid progress of multimodal large language models (MLLMs), various …

Categories: cs.AI, cs.CL, cs.CV, cs.MM

PRISM: Self-Pruning Intrinsic Selection Method for Training-Free Multimodal Data Selection

Posted: February 18, 2025 | Author: jarxiv

Summary: Through visual instruction tuning, pretrained multimodal …

Categories: cs.AI, cs.CL, cs.CV

ORI: O Routing Intelligence

Posted: February 18, 2025 | Author: jarxiv

Summary: A single large language model (LLM), when faced with an ever-growing range of tasks, …

Categories: cs.CL

Step-Video-T2V Technical Report: The Practice, Challenges, and Future of Video Foundation Model

Posted: February 18, 2025 | Author: jarxiv

Summary: With 30B parameters and the ability to generate videos up to 204 frames long, …

Categories: cs.CL, cs.CV

SELP: Generating Safe and Efficient Task Plans for Robot Agents with Large Language Models

Posted: February 17, 2025 | Author: jarxiv

Summary: Enhancing robot agents' understanding and execution of natural language (NL) commands, large …

Categories: cs.AI, cs.CL, cs.FL, cs.RO

Probabilistic Lexical Manifold Construction in Large Language Models via Hierarchical Vector Field Interpolation

Posted: February 17, 2025 | Author: jarxiv

Summary: Hierarchical vector field interpolation, a structured probabilistic framework for lexical representations, …

Categories: cs.CL
