Archive of posts by "jarxiv"

Effective Dual-Region Augmentation for Reduced Reliance on Large Amounts of Labeled Data

Posted: April 18, 2025 | Author: jarxiv

Abstract: In this paper, reducing reliance on large labeled datasets, and source-f …

Categories: cs.CV

Benchmarking the Spatial Robustness of DNNs via Natural and Adversarial Localized Corruptions

Posted: April 18, 2025 | Author: jarxiv

Abstract: The robustness of DNNs, particularly in complex and dynamic environments where localized corruptions can occur …

Categories: cs.AI, cs.CV

Enhancing Person-to-Person Virtual Try-On with Multi-Garment Virtual Try-Off

Posted: April 18, 2025 | Author: jarxiv

Abstract: Computer vision, with Virtual Try-On (VTON) and virtual t …

Categories: cs.AI, cs.CV

EventVAD: Training-Free Event-Aware Video Anomaly Detection

Posted: April 18, 2025 | Author: jarxiv

Abstract: Video anomaly detection (VAD) focuses on identifying anomalies in videos …

Categories: cs.CV

RF-DETR Object Detection vs YOLOv12 : A Study of Transformer-based and CNN-based Architectures for Single-Class and Multi-Class Greenfruit Detection in Complex Orchard Environments Under Label Ambiguity

Posted: April 18, 2025 | Author: jarxiv

Abstract: In this study, label ambiguity, occlusion, and background b …

Categories: cs.CV

Multimodal LLMs Can Reason about Aesthetics in Zero-Shot

Posted: April 18, 2025 | Author: jarxiv

Abstract: Rapid advances in generative art have democratized the creation of visually pleasing images. …

Categories: cs.AI, cs.CL, cs.CV, cs.MM

UniEdit-Flow: Unleashing Inversion and Editing in the Era of Flow Models

Posted: April 18, 2025 | Author: jarxiv

Abstract: Flow matching models have emerged as a strong alternative to diffusion models, but …

Categories: cs.CV

Probing and Inducing Combinational Creativity in Vision-Language Models

Posted: April 18, 2025 | Author: jarxiv

Abstract: The ability to combine existing concepts into novel ideas is a fundamental hallmark of human intelligence …

Categories: cs.AI, cs.CL, cs.CV

VistaDPO: Video Hierarchical Spatial-Temporal Direct Preference Optimization for Large Video Models

Posted: April 18, 2025 | Author: jarxiv

Abstract: Built on large language models (LLMs), large video models (LV …

Categories: cs.CV, cs.LG

Low-hallucination Synthetic Captions for Large-Scale Vision-Language Model Pre-training

Posted: April 18, 2025 | Author: jarxiv

Abstract: In recent years, the field of vision-language model pre-training, primarily large language models …

Categories: cs.AI, cs.CV
