Category archive: cs.AI

Enhancing Person-to-Person Virtual Try-On with Multi-Garment Virtual Try-Off

Posted: April 18, 2025 | Author: jarxiv

Abstract: In computer vision, Virtual Try-On (VTON) and virtual … Continue reading →

Categories: cs.AI, cs.CV | Comments off

Multimodal LLMs Can Reason about Aesthetics in Zero-Shot

Posted: April 18, 2025 | Author: jarxiv

Abstract: The rapid progress of generative art has democratized the creation of visually pleasing images. … Continue reading →

Categories: cs.AI, cs.CL, cs.CV, cs.MM | Comments off

Probing and Inducing Combinational Creativity in Vision-Language Models

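Posted: April 18, 2025 | Author: jarxiv

Abstract: The ability to combine existing concepts into novel ideas, a fundamental feature of human intelligence, … Continue reading →

Categories: cs.AI, cs.CL, cs.CV | Comments off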

Low-hallucination Synthetic Captions for Large-Scale Vision-Language Model Pre-training

Posted: April 18, 2025 | Author: jarxiv

Abstract: In recent years, in the field of vision-language model pre-training, mainly large language models … Continue reading →

Categories: cs.AI, cs.CV | Comments off

Science-T2I: Addressing Scientific Illusions in Image Synthesis

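Posted: April 18, 2025 | Author: jarxiv

Abstract: To integrate scientific knowledge into generative models and improve the realism and consistency of image synthesis, … Continue reading →

Categories: cs.AI, cs.CV, cs.LG | Comments off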

NTIRE 2025 Challenge on Short-form UGC Video Quality Assessment and Enhancement: Methods and Results

Posted: April 18, 2025 | Author: jarxiv

Abstract: This paper covers, on short-form UGC video quality assessment and enhancement, the NTIRE 202 … Continue reading →

Categories: cs.AI, cs.CV, eess.IV | Comments off

$\texttt{Complex-Edit}$: CoT-Like Instruction Generation for Complexity-Controllable Image Editing Benchmark

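Posted: April 18, 2025 | Author: jarxiv

Abstract: Systematically evaluating instruction-based image editing models across instructions of varying complexity … Continue reading →

Categories: cs.AI, cs.CV | Comments off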

Readable Twins of Unreadable Models

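Posted: April 18, 2025 | Author: jarxiv

Abstract: Creating responsible artificial intelligence (AI) systems, in contemporary research and development of works in AI, … Continue reading →

Categories: cs.AI, cs.CV | Comments off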

PerceptionLM: Open-Access Data and Models for Detailed Visual Understanding

Posted: April 18, 2025 | Author: jarxiv

Abstract: Vision-language models are integral to computer vision research, yet many high … Continue reading →

Categories: cs.AI, cs.CV, cs.LG | Comments off

Causality-enhanced Decision-Making for Autonomous Mobile Robots in Dynamic Environments

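Posted: April 18, 2025 | Author: jarxiv

Abstract: The growing integration of robots in shared environments such as warehouses, shopping centers, and hospitals … Continue reading →

Categories: cs.AI, cs.RO | Comments off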
