"cs.LG" Category Archive

A Theoretical Analysis of Soft-Label vs Hard-Label Training in Neural Networks

Posted: December 13, 2024 | Author: jarxiv

Summary: A small student model learns from a large pre-trained teacher model …

Categories: 68T01, cs.AI, cs.LG

Disentangling Mean Embeddings for Better Diagnostics of Image Generators

Posted: December 13, 2024 | Author: jarxiv

Summary: The evaluation of image generators, when it comes to providing nuanced insights into specific image domains, …

Categories: cs.CV, cs.LG

SimAvatar: Simulation-Ready Avatars with Layered Hair and Clothing

Posted: December 13, 2024 | Author: jarxiv

Summary: SimAvatar, from text prompts, simulation-ready clothed …

Categories: cs.CV, cs.GR, cs.LG

Localizing Memorization in SSL Vision Encoders

Posted: December 13, 2024 | Author: jarxiv

Summary: Recent work on memorization in self-supervised learning (SSL) …

Categories: cs.CV, cs.LG

Neptune: The Long Orbit to Benchmarking Long Video Understanding

Posted: December 13, 2024 | Author: jarxiv

Summary: In this paper, challenging question, answer, and decoy sets for long video understanding …

Categories: cs.AI, cs.CV, cs.LG

LLAVIDAL: A Large LAnguage VIsion Model for Daily Activities of Living

Posted: December 13, 2024 | Author: jarxiv

Summary: Current large language vision models (LLVMs) trained on web videos …

Categories: cs.CV, cs.LG

Owl-1: Omni World Model for Consistent Long Video Generation

Posted: December 13, 2024 | Author: jarxiv

Summary: Video generation models (VGMs) have recently attracted considerable attention, and general-purpose large …

Categories: cs.AI, cs.CV, cs.LG

Hidden Biases of End-to-End Driving Datasets

Posted: December 13, 2024 | Author: jarxiv

Summary: End-to-end driving systems are advancing rapidly, but so far, …

Categories: cs.AI, cs.CV, cs.LG, cs.RO

Spectral Image Tokenizer

Posted: December 13, 2024 | Author: jarxiv

Summary: Image tokenizers map images to sequences of discrete tokens and …

Categories: cs.CV, cs.LG

Doe-1: Closed-Loop Autonomous Driving with Large World Model

Posted: December 13, 2024 | Author: jarxiv

Summary: End-to-end autonomous driving, because of its potential to learn from large amounts of data, …

Categories: cs.AI, cs.CV, cs.LG
