Category archive: cs.CV

DetailMaster: Can Your Text-to-Image Model Handle Long Prompts?

Posted: May 23, 2025 | Author: jarxiv

Abstract: Recent text-to-image (T2I) models synthesize images from brief descriptions … Continue reading →

Categories: cs.AI, cs.CV | Comments are closed

Backdoor Cleaning without External Guidance in MLLM Fine-tuning

Posted: May 23, 2025 | Author: jarxiv

Abstract: Multimodal large language models (MLLMs), where user-submitted datasets … Continue reading →

Categories: cs.CR, cs.CV | Comments are closed

L2RDaS: Synthesizing 4D Radar Tensors for Model Generalization via Dataset Expansion

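Posted: May 23, 2025 | Author: jarxiv

Abstract: Four-dimensional (4D) radar, owing to its robustness under adverse weather conditions, for perception tasks … Continue reading →

Categories: cs.CV, eess.IV | Comments are closed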

LLaDA-V: Large Language Diffusion Models with Visual Instruction Tuning

Posted: May 23, 2025 | Author: jarxiv

Abstract: In this work, from the autoregressive paradigm dominant in current multimodal approaches … Continue reading →

Categories: cs.CL, cs.CV, cs.LG | Comments are closed

NovelSeek: When Agent Becomes the Scientist — Building Closed-Loop System from Hypothesis to Verification

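Posted: May 23, 2025 | Author: jarxiv

Abstract: Artificial intelligence (AI) is accelerating the transformation of scientific research paradigms, enhancing research efficiency … Continue reading →

Categories: cs.AI, cs.CL, cs.CV | Comments are closed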

Efficient Correlation Volume Sampling for Ultra-High-Resolution Optical Flow Estimation

Posted: May 23, 2025 | Author: jarxiv

Abstract: Recent optical flow estimation methods often, from a dense all-pairs correlation volume, local … Continue reading →

Categories: cs.CV, cs.LG | Comments are closed

Motion by Queries: Identity-Motion Trade-offs in Text-to-Video Generation

Posted: May 23, 2025 | Author: jarxiv

Abstract: Text-to-video diffusion models generate coherent video clips from text descriptions … Continue reading →

Categories: cs.CV | Comments are closed

MedFrameQA: A Multi-Image Medical VQA Benchmark for Clinical Reasoning

Posted: May 23, 2025 | Author: jarxiv

Abstract: Existing medical VQA benchmarks focus mainly on single-image analysis … Continue reading →

Categories: cs.CL, cs.CV | Comments are closed

Harnessing the Computation Redundancy in ViTs to Boost Adversarial Transferability

Posted: May 23, 2025 | Author: jarxiv

Abstract: Vision Transformers (ViTs), in many safety-critical … Continue reading →

Categories: cs.CV | Comments are closed

UniPhy: Learning a Unified Constitutive Model for Inverse Physics Simulation

Posted: May 23, 2025 | Author: jarxiv

Abstract: We … a general latent neural … able to encode the physical properties of diverse materials … Continue reading →

Categories: cs.CV | Comments are closed
