Monthly Archives: August 2024

Evaluation Framework for Feedback Generation Methods in Skeletal Movement Assessment

Posted on August 30, 2024 by jarxiv

Abstract: The application of machine learning solutions to movement assessment from skeleton videos has, in recent years, …

Categories: cs.AI, cs.CV, cs.LG

VideoLLM-MoD: Efficient Video-Language Streaming with Mixture-of-Depths Vision Computation

Posted on August 30, 2024 by jarxiv

Abstract: In large vision-language models (e.g., GPT-4, LLaVA), a commonly …

Categories: cs.CV

GRAB: A Challenging GRaph Analysis Benchmark for Large Multimodal Models

Posted on August 30, 2024 by jarxiv

Abstract: Large multimodal models (LMMs), across many visual tasks, …

Categories: cs.CV

Dissecting Out-of-Distribution Detection and Open-Set Recognition: A Critical Analysis of Methods and Benchmarks

Posted on August 30, 2024 by jarxiv

Abstract: Detecting test-time distribution shift is, for the safe deployment of machine learning models, an important …

Categories: cs.AI, cs.CV

VGBench: Evaluating Large Language Models on Vector Graphics Understanding and Generation

Posted on August 30, 2024 by jarxiv

Abstract: In the realm of vision models, the primary mode of representation uses pixels to rasterize the visual world …

Categories: cs.AI, cs.CV, cs.LG

OmniRe: Omni Urban Scene Reconstruction

Posted on August 30, 2024 by jarxiv

Abstract: Efficiently reconstructing high-fidelity dynamic urban scenes from on-device logs …

Categories: cs.CV

UV-free Texture Generation with Denoising and Geodesic Heat Diffusions

Posted on August 30, 2024 by jarxiv

Abstract: Seams, distortions, wasted UV space, vertex duplication, and various … on the surface …

Categories: cs.CV, cs.GR, cs.LG

CSGO: Content-Style Composition in Text-to-Image Generation

Posted on August 30, 2024 by jarxiv

Abstract: Diffusion models have shown strong capabilities in controlled image generation, which … image …

Categories: cs.CV

ReconX: Reconstruct Any Scene from Sparse Views with Video Diffusion Model

Posted on August 30, 2024 by jarxiv

Abstract: Advances in 3D scene reconstruction have enabled real-world 2D images to be converted into 3D models …

Categories: cs.AI, cs.CV, cs.GR

PromptSmooth: Certifying Robustness of Medical Vision-Language Models via Prompt Learning

Posted on August 30, 2024 by jarxiv

Abstract: Trained on large datasets of medical image-text pairs and later … specific …

Categories: cs.CR, cs.CV
