"cs.AI" Category Archive

CLIP with Generative Latent Replay: a Strong Baseline for Incremental Learning

Posted on: July 23, 2024 | Author: jarxiv

Abstract: Transformers and vision-language models (VLMs) such as CLIP …

Categories: cs.AI, cs.CV, cs.LG

Learning Multi-modal Representations by Watching Hundreds of Surgical Video Lectures

Posted on: July 23, 2024 | Author: jarxiv

Abstract: Recent advances in surgical computer vision, vision that lacks language semantics …

Categories: cs.AI, cs.CV

Stretching Each Dollar: Diffusion Training from Scratch on a Micro-Budget

Posted on: July 23, 2024 | Author: jarxiv

Abstract: While scaling laws in generative AI boost performance, at the same time massive …

Categories: cs.AI, cs.CV, cs.LG

Learning to Manipulate Anywhere: A Visual Generalizable Framework For Reinforcement Learning

Posted on: July 23, 2024 | Author: jarxiv

Abstract: Generalization capabilities for visuomotor robots to operate in diverse open-world scenarios …

Categories: cs.AI, cs.CV, cs.RO

Detecting Brittle Decisions for Free: Leveraging Margin Consistency in Deep Robust Classifiers

Posted on: July 23, 2024 | Author: jarxiv

Abstract: Despite extensive research on adversarial training strategies to improve robustness …

Categories: cs.AI, cs.CV, cs.LG

Towards Latent Masked Image Modeling for Self-Supervised Visual Representation Learning

Posted on: July 23, 2024 | Author: jarxiv

Abstract: Masked image modeling (MIM), from the masked portions of an image, …

Categories: cs.AI, cs.CV

HandDGP: Camera-Space Hand Mesh Prediction with Differentiable Global Positioning

Posted on: July 23, 2024 | Author: jarxiv

Abstract: Predicting a camera-space hand mesh from a single RGB image, 3D …

Categories: cs.AI, cs.CV, cs.LG, cs.RO

CarFormer: Self-Driving with Learned Object-Centric Representations

Posted on: July 23, 2024 | Author: jarxiv

Abstract: The choice of representation plays an important role in self-driving. In recent years, Bird …

Categories: cs.AI, cs.CV, cs.RO

Reconstructing Training Data From Real World Models Trained with Transfer Learning

Posted on: July 23, 2024 | Author: jarxiv

Abstract: Current methods for reconstructing training data from trained classifiers …

Categories: cs.AI, cs.CR, cs.CV, cs.LG

RoboGolf: Mastering Real-World Minigolf with a Reflective Multi-Modality Vision-Language Model

Posted on: July 23, 2024 | Author: jarxiv

Abstract: Minigolf is an exemplary real-world game for examining embodied intelligence, and …

Categories: cs.AI, cs.RO
