Monthly Archives: August 2024

iMatching: Imperative Correspondence Learning

Posted on: August 1, 2024 | Author: jarxiv

Abstract: Learning feature correspondence is a fundamental task in computer vision, and visu … Continue reading →

Categories: cs.CV | Comments off

Paying More Attention to Image: A Training-Free Method for Alleviating Hallucination in LVLMs

Posted on: August 1, 2024 | Author: jarxiv

Abstract: Existing large vision-language models (LVLMs) primarily … Continue reading →

Categories: cs.CV | Comments off

RainMamba: Enhanced Locality Learning with State Space Models for Video Deraining

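Posted on: August 1, 2024 | Author: jarxiv

Abstract: Outdoor vision systems are frequently contaminated by rain streaks and raindrops, and vision tasks and … Continue reading →

Categories: cs.CV | Comments off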

An Earth Rover dataset recorded at the ICRA@40 party

Posted on: August 1, 2024 | Author: jarxiv

Abstract: The ICRA conference, in Rotterdam in September 2024, its $40^{th}$ … Continue reading →

Categories: 68, cs.CV, cs.RO, I.4.8 | Comments off

The Llama 3 Herd of Models

Posted on: August 1, 2024 | Author: jarxiv

Abstract: Modern artificial intelligence (AI) systems are powered by foundation models. This paper … Continue reading →

Categories: cs.AI, cs.CL, cs.CV | Comments off

Vision-Language Model Based Handwriting Verification

Posted on: August 1, 2024 | Author: jarxiv

Abstract: Handwriting verification is critical in document forensics. Deep learning-based … Continue reading →

Categories: cs.AI, cs.CL, cs.CV, cs.LG | Comments off

PerAct2: Benchmarking and Learning for Robotic Bimanual Manipulation Tasks

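Posted on: August 1, 2024 | Author: jarxiv

Abstract: Bimanual manipulation is challenging because it requires precise spatial and temporal coordination between two arms … Continue reading →

Categories: cs.AI, cs.CV, cs.LG, cs.RO | Comments off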

Generalized Out-of-Distribution Detection and Beyond in Vision Language Model Era: A Survey

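Posted on: August 1, 2024 | Author: jarxiv

Abstract: Detecting out-of-distribution (OOD) samples, in order to ensure the safety of machine learning systems, … Continue reading →

Categories: cs.AI, cs.CV, cs.LG | Comments off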

MoFO: Momentum-Filtered Optimizer for Mitigating Forgetting in LLM Fine-Tuning

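Posted on: August 1, 2024 | Author: jarxiv

Abstract: Recently, large language models (LLMs) have demonstrated remarkable capabilities across a wide range of tasks … Continue reading →

Categories: cs.AI, cs.LG | Comments off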

SpaER: Learning Spatio-temporal Equivariant Representations for Fetal Brain Motion Tracking

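Posted on: August 1, 2024 | Author: jarxiv

Abstract: This paper leverages equivariant filters and self-attention mechanisms so that spatio-temporal representations are effectively … Continue reading →

Categories: cs.CV, eess.IV | Comments off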
