"cs.CV" Category Archive

VL-GLUE: A Suite of Fundamental yet Challenging Visuo-Linguistic Reasoning Tasks

Posted on October 18, 2024 by jarxiv

Abstract: Deriving inferences from heterogeneous inputs (images, text, audio, and so on) is something that humans …

Categories: cs.CL, cs.CV

PTQ4DiT: Post-training Quantization for Diffusion Transformers

Posted on October 18, 2024 by jarxiv

Abstract: As for the recently introduced Diffusion Transformer (DiT), the conventional U-Net …

Categories: cs.CV

Diffusion Curriculum: Synthetic-to-Real Generative Curriculum Learning via Image-Guided Diffusion

Posted on October 18, 2024 by jarxiv

Abstract: Low-quality or scarce data, in practice, for deep neural networks …

Categories: cs.AI, cs.CV

Pose-Based Sign Language Appearance Transfer

Posted on October 18, 2024 by jarxiv

Abstract: A method that transfers the signer's appearance in sign-language skeletal poses while preserving the sign content …

Categories: cs.CL, cs.CV

FlashTex: Fast Relightable Mesh Texturing with LightControlNet

Posted on October 18, 2024 by jarxiv

Abstract: Manually creating textures for 3D meshes is, for skilled visual …

Categories: cs.CV, cs.GR, cs.LG

Label-free prediction of fluorescence markers in bovine satellite cells using deep learning

Posted on October 18, 2024 by jarxiv

Abstract: Assessing the quality of bovine satellite cells (BSCs) is, for global food sustainability …

Categories: cs.CV

Beyond Coarse-Grained Matching in Video-Text Retrieval

Posted on October 18, 2024 by jarxiv

Abstract: Video-text retrieval has advanced considerably, but discerning subtle differences in captions …

Categories: cs.CL, cs.CV, cs.MM

Exploring the Design Space of Visual Context Representation in Video MLLMs

Posted on October 18, 2024 by jarxiv

Abstract: Video multimodal large language models (MLLMs), across a variety of downstream …

Categories: cs.CL, cs.CV

LieRE: Generalizing Rotary Position Encodings

Posted on October 18, 2024 by jarxiv

Abstract: For large language models, rotary position embeddings (RoP …

Categories: cs.CV, cs.LG

Comprehensive Performance Evaluation of YOLO11, YOLOv10, YOLOv9 and YOLOv8 on Detecting and Counting Fruitlet in Complex Orchard Environments

Posted on October 18, 2024 by jarxiv

Abstract: In this study, for the detection of green fruit in commercial orchards, YOLOv8, Y …

Categories: cs.AI, cs.CV
