"cs.CV" Category Archive

ODE: Open-Set Evaluation of Hallucinations in Multimodal Large Language Models

Posted on December 5, 2024 by jarxiv

Abstract: Hallucination poses a persistent challenge for multimodal large language models (MLLMs) …

Categories: cs.CL, cs.CV

AIM: Adaptive Inference of Multi-Modal LLMs via Token Merging and Pruning

Posted on December 5, 2024 by jarxiv

Abstract: Large language models (LLMs) enable, for visual data such as images and videos, powerful …

Categories: cs.AI, cs.CL, cs.CV

A Spatio-Temporal Representation Learning as an Alternative to Traditional Glosses in Sign Language Translation and Production

Posted on December 5, 2024 by jarxiv

Abstract: In this work, for both sign language translation (SLT) and sign language production (SLP), …

Categories: cs.CL, cs.CV

OpenDriver: An Open-Road Driver State Detection Dataset

Posted on December 5, 2024 by jarxiv

Abstract: Among the many studies on driver state detection, wearable physiological measurements …

Categories: cs.AI, cs.CV, cs.HC, cs.LG

DIVE: Taming DINO for Subject-Driven Video Editing

Posted on December 5, 2024 by jarxiv

Abstract: Building on the success of diffusion models in image generation and editing, video editing has recently attracted considerable …

Categories: cs.AI, cs.CV

Intuitive Axial Augmentation Using Polar-Sine-Based Piecewise Distortion for Medical Slice-Wise Segmentation

Posted on December 5, 2024 by jarxiv

Abstract: To improve performance, most data-driven models for medical image analysis …

Categories: cs.AI, cs.CV

Mapping using Transformers for Volumes — Network for Super-Resolution with Long-Range Interactions

Posted on December 5, 2024 by jarxiv

Abstract: Until now, carrying the recent advances of Transformer-based models seen in 2D super-resolution over to volumetric …

Categories: cs.CV, eess.IV

Functionality understanding and segmentation in 3D scenes

Posted on December 5, 2024 by jarxiv

Abstract: Understanding functionality in 3D scenes requires interpreting natural language descriptions and … handles, buttons …

Categories: cs.CV

LLM as a Complementary Optimizer to Gradient Descent: A Case Study in Prompt Tuning

Posted on December 5, 2024 by jarxiv

Abstract: Mastering a skill generally requires both hands-on experience by practitioners and, from mentors, insight …

Categories: cs.CL, cs.CV, cs.LG

Defending Against Repetitive Backdoor Attacks on Semi-supervised Learning through Lens of Rate-Distortion-Perception Trade-off

Posted on December 5, 2024 by jarxiv

Abstract: Semi-supervised learning (SSL), with large amounts of unlabeled data from the Internet …

Categories: cs.CV
