"cs.LG" Category Archive

Detection and Geographic Localization of Natural Objects in the Wild: A Case Study on Palms

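Posted on: February 19, 2025 | Author: jarxiv

Abstract: Palms, with respect to tropical forest health, biodiversity, and local economies and global forest products … Continue reading →

Categories: cs.CV, cs.LG | Comments are closed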

Natural Language Generation from Visual Sequences: Challenges and Future Directions

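Posted on: February 19, 2025 | Author: jarxiv

Abstract: The ability to talk about visual content using natural language is central to human intelligence and … Continue reading →

Categories: cs.AI, cs.CL, cs.CV, cs.LG | Comments are closed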

LieRE: Generalizing Rotary Position Encodings

Posted on: February 19, 2025 | Author: jarxiv

Abstract: Transformer architectures, in order to capture token dependencies, position en … Continue reading →

Categories: cs.CV, cs.LG | Comments are closed

Improved Fine-Tuning of Large Multimodal Models for Hateful Meme Detection

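Posted on: February 19, 2025 | Author: jarxiv

Abstract: Hateful memes have become a significant concern on the Internet, and robust automated detection … Continue reading →

Categories: cs.AI, cs.CL, cs.CV, cs.LG | Comments are closed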

BenthicNet: A global compilation of seafloor images for deep learning applications

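Posted on: February 19, 2025 | Author: jarxiv

Abstract: With advances in underwater imaging, the extensive seafloor images needed to monitor important benthic ecosystems … Continue reading →

Categories: cs.CV, cs.LG | Comments are closed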

Understanding and Rectifying Safety Perception Distortion in VLMs

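Posted on: February 19, 2025 | Author: jarxiv

Abstract: According to recent studies, vision-language models (VLMs), after integrating the vision modality, … Continue reading →

Categories: cs.CL, cs.CV, cs.LG | Comments are closed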

Magma: A Foundation Model for Multimodal AI Agents

Posted on: February 19, 2025 | Author: jarxiv

Abstract: Magma, for multimodal AI agents in both the digital and physical worlds, … Continue reading →

Categories: cs.AI, cs.CV, cs.HC, cs.LG, cs.RO | Comments are closed

Re-Align: Aligning Vision Language Models via Retrieval-Augmented Direct Preference Optimization

Posted on: February 19, 2025 | Author: jarxiv

Abstract: With the emergence of large Vision Language Models (VLMs), integrating visual modalities … Continue reading →

Categories: cs.CV, cs.LG | Comments are closed

Scaling Test-Time Compute Without Verification or RL is Suboptimal

Posted on: February 19, 2025 | Author: jarxiv

Abstract: Despite substantial progress in scaling test-time compute, in the community an ongoing … Continue reading →

Categories: cs.CL, cs.LG | Comments are closed

Manifold Learning with Sparse Regularised Optimal Transport

Posted on: February 18, 2025 | Author: jarxiv

Abstract: Manifold learning is a central task in modern statistics and data science. Many … Continue reading →

Categories: 62R30, 68T01, cs.LG, math.ST, stat.ML, stat.TH | Comments are closed
