"cs.MM" Category Archive

MetaDecorator: Generating Immersive Virtual Tours through Multimodality

Posted: January 28, 2025 | Author: jarxiv

Abstract: MetaDecorator, which lets users personalize virtual spaces, …

Categories: cs.AI, cs.ET, cs.HC, cs.MM

Mitigating GenAI-powered Evidence Pollution for Out-of-Context Multimodal Misinformation Detection

Posted: January 27, 2025 | Author: jarxiv

Abstract: Large generative artificial intelligence (GenAI) models have achieved significant success, but …

Categories: cs.CL, cs.CV, cs.CY, cs.MM

Tune In, Act Up: Exploring the Impact of Audio Modality-Specific Edits on Large Audio Language Models in Jailbreak

Posted: January 24, 2025 | Author: jarxiv

Abstract: Across a wide range of natural language processing tasks, large language models (LLMs) …

Categories: cs.AI, cs.LG, cs.MM, cs.SD, eess.AS

Long-Form Text-to-Music Generation with Adaptive Prompts: A Case of Study in Tabletop Role-Playing Games Soundtracks

Posted: January 23, 2025 | Author: jarxiv

Abstract: In this paper, the tabletop role-playing game (TRPG) …

Categories: cs.AI, cs.MM, cs.NE, cs.SD, eess.AS

Learning to Mask and Permute Visual Tokens for Vision Transformer Pre-Training

Posted: January 23, 2025 | Author: jarxiv

Abstract: The use of self-supervised pre-training … performance on a variety of visual tasks …

Categories: cs.AI, cs.CV, cs.MM

GSVC: Efficient Video Representation and Compression Through 2D Gaussian Splatting

Posted: January 23, 2025 | Author: jarxiv

Abstract: For static 3D scenes, 3D Gaussian splats are an innovative, effective, learned …

Categories: cs.CV, cs.MM

Sketch and Patch: Efficient 3D Gaussian Representation for Man-Made Scenes

Posted: January 23, 2025 | Author: jarxiv

Abstract: 3D Gaussian Splatting (3DGS), for 3D scenes, …

Categories: cs.CV, cs.MM

SMPLest-X: Ultimate Scaling for Expressive Human Pose and Shape Estimation

Posted: January 20, 2025 | Author: jarxiv

Abstract: In expressive human pose and shape estimation (EHPS), body, hand, and face motion …

Categories: cs.CV, cs.GR, cs.HC, cs.MM, cs.RO

CLIP-PCQA: Exploring Subjective-Aligned Vision-Language Modeling for Point Cloud Quality Assessment

Posted: January 20, 2025 | Author: jarxiv

Abstract: In recent years, research on no-reference point cloud quality assessment (NR-PCQA) has made great progress …

Categories: cs.CV, cs.MM

Robust Change Captioning in Remote Sensing: SECOND-CC Dataset and MModalCC Framework

Posted: January 20, 2025 | Author: jarxiv

Abstract: Remote sensing change captioning (RSICC) … between bi-temporal images …

Categories: cs.AI, cs.CV, cs.LG, cs.MM
