"cs.MM" Category Archive

Learning Domain-Invariant Features for Out-of-Context News Detection

Posted on: August 9, 2024 | Author: jarxiv

Abstract: Out-of-context news is often seen on online media platforms …

Categories: cs.CL, cs.MM

MM-Forecast: A Multimodal Approach to Temporal Event Forecasting with Large Language Models

Posted on: August 9, 2024 | Author: jarxiv

Abstract: Regarding multimodal temporal event forecasting with large language models, we …

Categories: cs.AI, cs.IR, cs.MM, H.3.3

Edit As You Wish: Video Caption Editing with Multi-grained User Control

Posted on: August 9, 2024 | Author: jarxiv

Abstract: Automatically narrating videos in natural language according to user requests …

Categories: cs.CV, cs.MM

SynopGround: A Large-Scale Dataset for Multi-Paragraph Video Grounding from TV Dramas and Synopses

Posted on: August 9, 2024 | Author: jarxiv

Abstract: In multimodal content understanding, video grounding is a fundamental …

Categories: cs.CV, cs.MM

Concept Conductor: Orchestrating Multiple Personalized Concepts in Text-to-Image Synthesis

Posted on: August 8, 2024 | Author: jarxiv

Abstract: Customization of text-to-image models has advanced significantly, but multiple …

Categories: 68U10, cs.AI, cs.CV, cs.MM, I.4.9

HiQuE: Hierarchical Question Embedding Network for Multimodal Depression Detection

Posted on: August 8, 2024 | Author: jarxiv

Abstract: With automated depression detection, early intervention for people experiencing depression can be significantly …

Categories: cs.AI, cs.MM

SynopGround: A Large-Scale Dataset for Multi-Paragraph Video Grounding from TV Dramas and Synopses

Posted on: August 8, 2024 | Author: jarxiv

Abstract: In multimodal content understanding, video grounding is a fundamental …

Categories: cs.CV, cs.MM

New Job, New Gender? Measuring the Social Bias in Image Generation Models

Posted on: August 8, 2024 | Author: jarxiv

Abstract: Image generation models can generate or edit images from a given text. D …

Categories: cs.AI, cs.CL, cs.CV, cs.MM, cs.SE

MambaMOS: LiDAR-based 3D Moving Object Segmentation with Motion-aware State Space Model

Posted on: August 7, 2024 | Author: jarxiv

Abstract: In LiDAR-based moving object segmentation (MOS), previous …

Categories: cs.CV, cs.MM, cs.RO, eess.IV

A Picture Is Worth a Graph: A Blueprint Debate Paradigm for Multimodal Reasoning

Posted on: August 7, 2024 | Author: jarxiv

Abstract: This paper aims to introduce multi-agent debate into multimodal reasoning …

Categories: cs.AI, cs.CL, cs.MA, cs.MM
