"cs.CV" Category Archive

Visual Graph Arena: Evaluating Visual Conceptualization of Vision and Multimodal Large Language Models

Posted: June 9, 2025 | Author: jarxiv

Abstract: Recent advances in multimodal large language models, in visual question answering, …

Categories: cs.AI, cs.CV

LlavaGuard: An Open VLM-based Framework for Safeguarding Vision Datasets and Models

Posted: June 9, 2025 | Author: jarxiv

Abstract: In this paper, reliable guardrails in the era of large-scale data and models …

Categories: cs.AI, cs.CV, cs.LG

Sparse Autoencoders Learn Monosemantic Features in Vision-Language Models

Posted: June 9, 2025 | Author: jarxiv

Abstract: Given the importance of interpretability and steerability for AI safety, sparse …

Categories: cs.AI, cs.CV, cs.LG

Bridging Perspectives: A Survey on Cross-view Collaborative Intelligence with Egocentric-Exocentric Vision

Posted: June 9, 2025 | Author: jarxiv

Abstract: Both the egocentric (first-person) and exocentric (third-person) view …

Categories: cs.CV

DPCore: Dynamic Prompt Coreset for Continual Test-Time Adaptation

Posted: June 9, 2025 | Author: jarxiv

Abstract: Continual test-time adaptation (CTTA) adapts a pre-trained model to continually changing …

Categories: cs.CV, cs.LG

MimeQA: Towards Socially-Intelligent Nonverbal Foundation Models

Posted: June 9, 2025 | Author: jarxiv

Abstract: As AI becomes more closely integrated with people's everyday activities, in daily life …

Categories: cs.AI, cs.CL, cs.CV

Normalizing Flows are Capable Generative Models

Posted: June 9, 2025 | Author: jarxiv

Abstract: Normalizing flows (NFs) are likelihood-based models for continuous inputs. They, in density …

Categories: cs.CV, cs.LG

Sketched Equivariant Imaging Regularization and Deep Internal Learning for Inverse Problems

Posted: June 9, 2025 | Author: jarxiv

Abstract: Equivariant Imaging (EI) regularization, ground-truth data …

Categories: cs.CV, cs.LG, eess.IV, math.OC

Leopard: A Vision Language Model For Text-Rich Multi-Image Tasks

Posted: June 9, 2025 | Author: jarxiv

Abstract: Text serves as the central visual element guiding overall understanding in text-rich …

Categories: cs.CL, cs.CV

BecomingLit: Relightable Gaussian Avatars with Hybrid Neural Shading

Posted: June 9, 2025 | Author: jarxiv

Abstract: Renderable from novel viewpoints at interactive rates, highly reliable …

Categories: cs.CV
