"cs.AI" Category Archive

Hierarchical localization with panoramic views and triplet loss functions

Posted: November 25, 2024, by jarxiv

Abstract: The main objective of this paper concerns what is essential for the safe navigation of mobile robots, namely visual …

Categories: cs.AI, cs.CV, cs.RO

Semantically-Prompted Language Models Improve Visual Descriptions

Posted: November 25, 2024, by jarxiv

Abstract: Language-vision models such as CLIP, in zero-shot image classification (ZSIC) …

Categories: cs.AI, cs.CL, cs.CV

Controlling Language and Diffusion Models by Transporting Activations

Posted: November 25, 2024, by jarxiv

Abstract: As the capabilities of large generative models improve and their deployment becomes ever more widespread …

Categories: 49Q22, 68T07, cs.AI, cs.CL, cs.CV, cs.LG, I.2.6

OminiControl: Minimal and Universal Control for Diffusion Transformer

Posted: November 25, 2024, by jarxiv

Abstract: In this paper, image conditions for a pre-trained Diffusion Transformer (DiT) …

Categories: cs.AI, cs.CV, cs.LG

About Time: Advances, Challenges, and Outlooks of Action Understanding

Posted: November 25, 2024, by jarxiv

Abstract: We have witnessed remarkable progress in video action understanding …

Categories: cs.AI, cs.CV, cs.LG

Efficient Pruning of Text-to-Image Models: Insights from Pruning Stable Diffusion

Posted: November 25, 2024, by jarxiv

Abstract: As text-to-image models become increasingly powerful and complex, their size …

Categories: cs.AI, cs.CL, cs.CV, cs.LG

VideoRepair: Improving Text-to-Video Generation via Misalignment Evaluation and Localized Refinement

Posted: November 25, 2024, by jarxiv

Abstract: Recent text-to-video (T2V) diffusion models, across various domains …

Categories: cs.AI, cs.CL, cs.CV

ReXrank: A Public Leaderboard for AI-Powered Radiology Report Generation

Posted: November 25, 2024, by jarxiv

Abstract: AI-driven models, in automating radiology report generation for chest X-rays, …

Categories: cs.AI, cs.CL, cs.CV

Health AI Developer Foundations

Posted: November 25, 2024, by jarxiv

Abstract: Robust medical machine learning (ML) models accelerate clinical research, workflows and …

Categories: cs.AI, cs.CV, cs.LG, cs.MM, eess.IV

t-READi: Transformer-Powered Robust and Efficient Multimodal Inference for Autonomous Driving

Posted: November 22, 2024, by jarxiv

Abstract: Autonomous vehicles (AVs) with multimodal sensors (cameras, lidar, …

Categories: cs.AI, cs.CV, cs.DC, cs.LG, cs.RO
