"cs.LG" Category Archive

A Practitioner’s Guide to Continual Multimodal Pretraining

Posted: December 9, 2024 | Author: jarxiv

Summary: Multimodal foundation models … numerous applications at the intersection of vision and language …

Categories: cs.CL, cs.CV, cs.LG

CompCap: Improving Multimodal Large Language Models with Composite Captions

Posted: December 9, 2024 | Author: jarxiv

Summary: How well can multimodal large language models (MLLMs) understand composite images …

Categories: cs.AI, cs.CV, cs.LG

From classical techniques to convolution-based models: A review of object detection algorithms

Posted: December 9, 2024 | Author: jarxiv

Summary: Object detection is a fundamental task in computer vision and image understanding …

Categories: cs.AI, cs.CV, cs.LG

Extrapolated Urban View Synthesis Benchmark

Posted: December 9, 2024 | Author: jarxiv

Summary: Photorealistic simulators … vision-centric autonomous vehicles (AV …

Categories: cs.AI, cs.CV, cs.LG, cs.RO

Sparse autoencoders reveal selective remapping of visual concepts during adaptation

Posted: December 9, 2024 | Author: jarxiv

Summary: Adapting foundation models to specific purposes … machine learning for downstream applications …

Categories: cs.CV, cs.LG

Stag-1: Towards Realistic 4D Driving Simulation with Video Generation Model

Posted: December 9, 2024 | Author: jarxiv

Summary: 4D driving simulation is essential for developing realistic autonomous driving simulators …

Categories: cs.AI, cs.CV, cs.LG

EmbodiedOcc: Embodied 3D Occupancy Prediction for Vision-based Online Scene Understanding

Posted: December 9, 2024 | Author: jarxiv

Summary: 3D occupancy prediction provides a comprehensive description of the surrounding scene and is essential for 3D perception …

Categories: cs.AI, cs.CV, cs.LG

GaussianFormer-2: Probabilistic Gaussian Superposition for Efficient 3D Occupancy Prediction

Posted: December 9, 2024 | Author: jarxiv

Summary: 3D semantic occupancy prediction … the fine-grained geometry and semantics of the surrounding scene …

Categories: cs.AI, cs.CV, cs.LG

From interpretability to inference: an estimation framework for universal approximators

Posted: December 8, 2024 | Author: jarxiv

Summary: We … a new framework for estimation and inference using a broad class of universal approximators …

Categories: 62-07, 62G10, 62G20, 91-08, 91A12, cs.LG, econ.EM, G.3, stat.ML

Learning Speed-Adaptive Walking Agent Using Imitation Learning with Physics-Informed Simulation

Posted: December 6, 2024 | Author: jarxiv

Summary: Virtual models of human gait, or digital twins, … labor-intensive data collection …

Categories: cs.LG, cs.RO
