Monthly Archives: March 2024

Semantic Layering in Room Segmentation via LLMs

Posted on: March 20, 2024 | Author: jarxiv

Summary: In this paper, large language models (LLMs) and traditional 2D map-based …

Categories: cs.CV, cs.RO

Contextual AD Narration with Interleaved Multimodal Sequence

Posted on: March 20, 2024 | Author: jarxiv

Summary: In the audio description (AD) task, visually impaired people [...] movies …

Categories: cs.CV

You Only Sample Once: Taming One-Step Text-To-Image Synthesis by Self-Cooperative Diffusion GANs

Posted on: March 20, 2024 | Author: jarxiv

Summary: YOSO is designed for rapid, scalable, high-fidelity one-step image synthesis …

Categories: cs.CV

Best of Both Worlds: Hybrid SNN-ANN Architecture for Event-based Optical Flow Estimation

Posted on: March 20, 2024 | Author: jarxiv

Summary: In the field of robotics, event-based cameras [...] high-speed motion and high-dynamic …

Categories: cs.CV, cs.LG

Zero-Reference Low-Light Enhancement via Physical Quadruple Priors

Posted on: March 20, 2024 | Author: jarxiv

Summary: Understanding illumination and reducing the need for supervision is a major challenge in low-light enhancement …

Categories: cs.CV

Segment Anything for comprehensive analysis of grapevine cluster architecture and berry properties

Posted on: March 20, 2024 | Author: jarxiv

Summary: Grape cluster architecture and compactness affect disease susceptibility, fruit quality, and yield …

Categories: cs.CV

EscherNet: A Generative Model for Scalable View Synthesis

Posted on: March 20, 2024 | Author: jarxiv

Summary: EscherNet, a multi-view conditioned diffusion model for view synthesis, …

Categories: cs.CV

DRESS: Instructing Large Vision-Language Models to Align and Interact with Humans via Natural Language Feedback

Posted on: March 20, 2024 | Author: jarxiv

Summary: We innovatively [...] natural language feedback (NLF) from large language models …

Categories: cs.CL, cs.CV, cs.LG

Resolution- and Stimulus-agnostic Super-Resolution of Ultra-High-Field Functional MRI: Application to Visual Studies

Posted on: March 20, 2024 | Author: jarxiv

Summary: High-resolution fMRI provides a window into the brain's mesoscale organization. However, spatial resolution …

Categories: cs.CV, eess.IV

Just Shift It: Test-Time Prototype Shifting for Zero-Shot Generalization with Vision-Language Models

Posted on: March 20, 2024 | Author: jarxiv

Summary: With advances in vision-language models (VLMs), particularly in zero-shot learning settings, …

Categories: cs.AI, cs.CV, cs.LG
