"cs.CL" Category Archive

Entity-Aware Multimodal Alignment Framework for News Image Captioning

Posted on: March 1, 2024 | Author: jarxiv

Summary: The news image captioning task is a variant of the image captioning task, and …

Categories: cs.CL, cs.CV

TV-TREES: Multimodal Entailment Trees for Neuro-Symbolic Video Reasoning

Posted on: March 1, 2024 | Author: jarxiv

Summary: Question answering over complex, multimodal content such as TV clips …

Categories: cs.AI, cs.CL, cs.CV, I.2.10

Language Models Represent Beliefs of Self and Others

Posted on: March 1, 2024 | Author: jarxiv

Summary: Understanding and attributing mental states, known as Theory of Mind (ToM), in human social …

Categories: cs.AI, cs.CL

A Cognitive Evaluation Benchmark of Image Reasoning and Description for Large Vision Language Models

Posted on: March 1, 2024 | Author: jarxiv

Summary: Large Vision Language Models (LVLMs) …

Categories: cs.AI, cs.CL, cs.CV

Investigation of Adapter for Automatic Speech Recognition in Noisy Environment

Posted on: March 1, 2024 | Author: jarxiv

Summary: Adapting automatic speech recognition (ASR) systems to unseen noise environments …

Categories: cs.CL, cs.SD, eess.AS

RNNs are not Transformers (Yet): The Key Bottleneck on In-context Retrieval

Posted on: March 1, 2024 | Author: jarxiv

Summary: In this paper, from the perspective of solving algorithmic problems, recurrent …

Categories: cs.CL, cs.LG, stat.ML

Arithmetic Control of LLMs for Diverse User Preferences: Directional Preference Alignment with Multi-Objective Rewards

Posted on: March 1, 2024 | Author: jarxiv

Summary: Fine-grained control over large language models (LLMs) remains a major challenge …

Categories: cs.AI, cs.CL, cs.LG, stat.ML

DecisionNCE: Embodied Multimodal Representations via Implicit Preference Learning

Posted on: February 29, 2024 | Author: jarxiv

Summary: Multimodal pretraining, for representation learning in autonomous robots, the following three …

Categories: cs.AI, cs.CL, cs.CV, cs.LG, cs.RO

Exploring Precision and Recall to assess the quality and diversity of LLMs

Posted on: February 29, 2024 | Author: jarxiv

Summary: In this paper, the adaptation of precision and recall metrics from image generation to text generation …

Categories: cs.CL, cs.LG

LLM Task Interference: An Initial Study on the Impact of Task-Switch in Conversational History

Posted on: February 29, 2024 | Author: jarxiv

Summary: With the recent emergence of powerful instruction-tuned large language models (LLMs), various …

Categories: cs.CL
