Monthly Archives: March 2025

3DSwapping: Texture Swapping For 3D Object From Single Reference Image

Posted: March 25, 2025 | Author: jarxiv

Abstract: With 3D texture swapping, the customization of 3D object textures …

Categories: cs.CV

MC-LLaVA: Multi-Concept Personalized Vision-Language Model

Posted: March 25, 2025 | Author: jarxiv

Abstract: Current vision-language models (VLMs), across a variety of tasks such as visual question answering, …

Categories: cs.AI, cs.CV

STEVE: A Step Verification Pipeline for Computer-use Agent Training

Posted: March 25, 2025 | Author: jarxiv

Abstract: For autonomously operating graphical user interfaces, AI agents …

Categories: cs.AI, cs.CV

PRISM: Privacy-Preserving Improved Stochastic Masking for Federated Generative Models

Posted: March 25, 2025 | Author: jarxiv

Abstract: Despite recent advances in federated learning (FL), the integration of generative models into FL …

Categories: cs.CR, cs.CV, cs.LG

Visual Position Prompt for MLLM based Visual Grounding

Posted: March 25, 2025 | Author: jarxiv

Abstract: Multimodal large language models (MLLMs), across a variety of image-related tasks, …

Categories: cs.AI, cs.CV

HunyuanPortrait: Implicit Condition Control for Enhanced Portrait Animation

Posted: March 25, 2025 | Author: jarxiv

Abstract: We introduce HunyuanPortrait, a highly controllable and realistic …

Categories: cs.CV

Exploring the Integration of Key-Value Attention Into Pure and Hybrid Transformers for Semantic Segmentation

Posted: March 25, 2025 | Author: jarxiv

Abstract: CNNs were long regarded as the state of the art in image processing, but transformer architectures …

Categories: cs.AI, cs.CV, cs.LG

Dora: Sampling and Benchmarking for 3D Shape Variational Auto-Encoders

Posted: March 25, 2025 | Author: jarxiv

Abstract: In recent 3D content generation pipelines, variational autoencoders (VAEs) are generally …

Categories: cs.CV

MotionMap: Representing Multimodality in Human Pose Forecasting

Posted: March 25, 2025 | Author: jarxiv

Abstract: In human pose forecasting, multiple futures exist for an observed pose sequence …

Categories: cs.CV, eess.IV

Curriculum Coarse-to-Fine Selection for High-IPC Dataset Distillation

Posted: March 25, 2025 | Author: jarxiv

Abstract: Dataset distillation (DD) excels at synthesizing a small number of images per class (IPC) …

Categories: cs.CV
