"cs.LG" Category Archive

$C^{3}$-NeRF: Modeling Multiple Scenes via Conditional-cum-Continual Neural Radiance Fields

Posted on December 2, 2024 by jarxiv

Abstract: As for Neural Radiance Fields (NeRF), a single 3D …

Categories: cs.CV, cs.LG

Quantifying the synthetic and real domain gap in aerial scene understanding

Posted on December 2, 2024 by jarxiv

Abstract: Quantifying the gap between synthetic and real-world images, with dependence on large amounts of data …

Categories: cs.AI, cs.CV, cs.LG

On Domain-Specific Post-Training for Multimodal Large Language Models

Posted on December 2, 2024 by jarxiv

Abstract: In recent years, the rapid development of general multimodal large language models (MLLMs) …

Categories: cs.CL, cs.CV, cs.LG

Perception Test 2024: Challenge Summary and a Novel Hour-Long VideoQA Benchmark

Posted on December 2, 2024 by jarxiv

Abstract: Following the success of the 2023 edition, with the aim of benchmarking and measuring state-of-the-art video models …

Categories: cs.CL, cs.CV, cs.LG

Free-form Generation Enhances Challenging Clothed Human Modeling

Posted on December 2, 2024 by jarxiv

Abstract: To achieve realistic animated human avatars, pose-dependent clothing …

Categories: cs.CV, cs.GR, cs.LG

DELT: A Simple Diversity-driven EarlyLate Training for Dataset Distillation

Posted on December 2, 2024 by jarxiv

Abstract: Recent advances in dataset distillation have led to solutions in two main directions …

Categories: cs.AI, cs.CV, cs.LG

AlphaTablets: A Generic Plane Representation for 3D Planar Reconstruction from Monocular Videos

Posted on December 2, 2024 by jarxiv

Abstract: AlphaTablets, featuring continuous 3D surfaces and precise boundary delineation, …

Categories: cs.CV, cs.LG

T2Vid: Translating Long Text into Multi-Image is the Catalyst for Video-LLMs

Posted on December 2, 2024 by jarxiv

Abstract: The success of multimodal large language models (MLLMs) in the image domain …

Categories: cs.CL, cs.CV, cs.LG

MM-Path: Multi-modal, Multi-granularity Path Representation Learning — Extended Version

Posted on December 2, 2024 by jarxiv

Abstract: In various areas of intelligent transportation, developing effective path representations is increasingly …

Categories: cs.AI, cs.LG

Metric-DST: Mitigating Selection Bias Through Diversity-Guided Semi-Supervised Metric Learning

Posted on December 2, 2024 by jarxiv

Abstract: With selection bias, models trained on data that poorly represents the population …

Categories: cs.AI, cs.LG
