Category archive: cs.LG

LLaRA: Supercharging Robot Learning Data for Vision-Language Policy

Posted: January 31, 2025 | Author: jarxiv

Abstract: Vision-language models (VLMs) have recently been leveraged to generate robot actions …

Categories: cs.AI, cs.CL, cs.CV, cs.LG, cs.RO

Temporal Preference Optimization for Long-Form Video Understanding

Posted: January 31, 2025 | Author: jarxiv

Abstract: Despite significant advances in video large multimodal models (video-LMMs) …

Categories: cs.AI, cs.CL, cs.CV, cs.LG, cs.RO

Advances in Multimodal Adaptation and Generalization: From Traditional Approaches to Foundation Models

Posted: January 31, 2025 | Author: jarxiv

Abstract: In real-world scenarios, models need to adapt or generalize to unknown target distributions …

Categories: cs.AI, cs.CV, cs.LG, cs.RO

Diffusion Autoencoders are Scalable Image Tokenizers

Posted: January 31, 2025 | Author: jarxiv

Abstract: For efficient, high-quality image generation, tokenizing images into compact visual representations …

Categories: cs.AI, cs.CV, cs.LG

Zero-Shot Medical Phrase Grounding with Off-the-shelf Diffusion Models

Posted: January 31, 2025 | Author: jarxiv

Abstract: Localizing the exact pathological regions in a given medical scan has traditionally required a large amount of bounding …

Categories: cs.CV, cs.LG

Improving Privacy Benefits of Redaction

Posted: January 31, 2025 | Author: jarxiv

Abstract: We propose a novel redaction methodology that can be used to sanitize natural text data …

Categories: cs.CL, cs.CR, cs.LG

Computing the gradients with respect to all parameters of a quantum neural network using a single circuit

Posted: January 31, 2025 | Author: jarxiv

Abstract: Finding gradients is a crucial step in training machine learning models …

Categories: cs.AI, cs.LG, quant-ph

Boosting Weak Positives for Text Based Person Search

Posted: January 31, 2025 | Author: jarxiv

Abstract: Large vision-language models have revolutionized cross-modal object retrieval …

Categories: cs.CV, cs.LG

RLPP: A Residual Method for Zero-Shot Real-World Autonomous Racing on Scaled Platforms

Posted: January 30, 2025 | Author: jarxiv

Abstract: Autonomous racing calls for robust control capable of making rapid decisions under dynamic conditions …

Categories: 68T40, cs.LG, cs.RO

Benchmark Evaluations, Applications, and Challenges of Large Vision Language Models: A Survey

Posted: January 30, 2025 | Author: jarxiv

Abstract: Multimodal vision-language models (VLMs), in computer vision and natural …

Categories: cs.AI, cs.CL, cs.CV, cs.LG, cs.RO
