"cs.AI" Category Archive

GenHOI: Generalizing Text-driven 4D Human-Object Interaction Synthesis for Unseen Objects

Posted: June 19, 2025 | Author: jarxiv

Summary: Diffusion models and large-scale motion datasets, for text-driven human motion …

Categories: cs.AI, cs.CV

Pixel-level Certified Explanations via Randomized Smoothing

Posted: June 19, 2025 | Author: jarxiv

Summary: By highlighting influential input pixels, post-hoc attribution methods … deep learning predictions …

Categories: cs.AI, cs.CV, cs.LG

EgoBlind: Towards Egocentric Visual Assistance for the Blind

Posted: June 19, 2025 | Author: jarxiv

Summary: The first egocentric video dataset collected from blind and visually impaired people, eg …

Categories: cs.AI, cs.CV, cs.MM

Exploring Personalized Federated Learning Architectures for Violence Detection in Surveillance Videos

Posted: June 19, 2025 | Author: jarxiv

Summary: The challenge of detecting violent incidents in urban surveillance systems, the vast … of video data …

Categories: cs.AI, cs.CV

CLAIM: Clinically-Guided LGE Augmentation for Realistic and Diverse Myocardial Scar Synthesis and Segmentation

Posted: June 19, 2025 | Author: jarxiv

Summary: Deep learning-based myocardial scar segmentation from late gadolinium enhancement (LGE) cardiac MRI …

Categories: cs.AI, cs.CV

TARDIS STRIDE: A Spatio-Temporal Road Image Dataset and World Model for Autonomy

Posted: June 19, 2025 | Author: jarxiv

Summary: World models simulate environments and enable effective agent behavior …

Categories: cs.AI, cs.CV

One-Step Diffusion for Detail-Rich and Temporally Consistent Video Super-Resolution

Posted: June 19, 2025 | Author: jarxiv

Summary: In particular, for realistic detail synthesis, pre-trained generative … such as Stable Diffusion (SD) …

Categories: cs.AI, cs.CV

Vision Transformers Don’t Need Trained Registers

Posted: June 19, 2025 | Author: jarxiv

Summary: We investigate the mechanism underlying a previously identified phenomenon in Vision Transformers. …

Categories: cs.AI, cs.CV

Generalized Out-of-Distribution Detection and Beyond in Vision Language Model Era: A Survey

Posted: June 19, 2025 | Author: jarxiv

Summary: Detecting out-of-distribution (OOD) samples, to ensure the safety of machine learning systems …

Categories: cs.AI, cs.CV, cs.LG

Demystifying the Visual Quality Paradox in Multimodal Large Language Models

Posted: June 19, 2025 | Author: jarxiv

Summary: Recent multimodal large language models (MLLMs) … benchmark vision-language …

Categories: cs.AI, cs.CV
