Monthly Archive: March 2025

Tracking Meets Large Multimodal Models for Driving Scenario Understanding

Posted on March 19, 2025 by jarxiv

Abstract: Large multimodal models (LMMs) have recently gained prominence in autonomous driving research, …

Categories: cs.CV, cs.RO

Utilization of Neighbor Information for Image Classification with Different Levels of Supervision

Posted on March 19, 2025 by jarxiv

Abstract: Performing well on both generalized category discovery (GCD) and image clustering …

Categories: cs.CV, cs.LG

Advances in 4D Generation: A Survey

Posted on March 19, 2025 by jarxiv

Abstract: Generative artificial intelligence has seen remarkable progress across multiple domains in recent years …

Categories: cs.CV

The Power of Context: How Multimodality Improves Image Super-Resolution

Posted on March 19, 2025 by jarxiv

Abstract: Single-image super-resolution (SISR) recovers fine details, and low-resolution input …

Categories: cs.AI, cs.CV, cs.LG

Aligning Multimodal LLM with Human Preference: A Survey

Posted on March 19, 2025 by jarxiv

Abstract: Large language models (LLMs), without requiring task-specific training, …

Categories: cs.CV

MusicInfuser: Making Video Diffusion Listen and Dance

Posted on March 19, 2025 by jarxiv

Abstract: We introduce MusicInfuser, which, in sync with a specified music track, …

Categories: cs.AI, cs.CV, cs.LG

TablePilot: Recommending Human-Preferred Tabular Data Analysis with Large Language Models

Posted on March 19, 2025 by jarxiv

Abstract: Tabular data analysis is crucial in many scenarios, yet for a new table the most relevant …

Categories: cs.CL

LLM-Match: An Open-Sourced Patient Matching Model Based on Large Language Models and Retrieval-Augmented Generation

Posted on March 19, 2025 by jarxiv

Abstract: Patient matching means accurately identifying and matching medical records with trial eligibility criteria …

Categories: cs.AI, cs.CL, cs.LG

DLPO: Towards a Robust, Efficient, and Generalizable Prompt Optimization Framework from a Deep-Learning Perspective

Posted on March 19, 2025 by jarxiv

Abstract: Large language models (LLMs), driven primarily by well-designed prompts, …

Categories: cs.AI, cs.CL

MAC: A Benchmark for Multiple Attributes Compositional Zero-Shot Learning

Posted on March 19, 2025 by jarxiv

Abstract: Compositional zero-shot learning (CZSL) involves semantic primitives from seen compositions …

Categories: cs.CV
