"cs.AI" Category Archive

A Multi-Loss Strategy for Vehicle Trajectory Prediction: Combining Off-Road, Diversity, and Directional Consistency Losses

Posted: December 2, 2024 | Author: jarxiv

Abstract: Trajectory prediction is essential for the safety and efficiency of planning in autonomous vehicles. However, …

Categories: cs.AI, cs.CV, cs.LG, cs.MA, cs.RO

LaVIDE: A Language-Vision Discriminator for Detecting Changes in Satellite Image with Map References

Posted: December 2, 2024 | Author: jarxiv

Abstract: When only a single image is available, change detection, which typically relies on comparing bi-temporal images, …

Categories: cs.AI, cs.CV, cs.LG

A Survey on Multimodal Large Language Models

Posted: December 2, 2024 | Author: jarxiv

Abstract: Recently, the multimodal large language model typified by GPT-4V (MLLM …

Categories: cs.AI, cs.CL, cs.CV, cs.LG

Efficient Text-driven Motion Generation via Latent Consistency Training

Posted: December 2, 2024 | Author: jarxiv

Abstract: Text-driven human motion generation based on diffusion strategies, human-computer interaction …

Categories: cs.AI, cs.CV

Quantifying the synthetic and real domain gap in aerial scene understanding

Posted: December 2, 2024 | Author: jarxiv

Abstract: Quantifying the gap between synthetic and real-world images, large amounts of data …

Categories: cs.AI, cs.CV, cs.LG

SIMS: Simulating Human-Scene Interactions with Real World Script Planning

Posted: December 2, 2024 | Author: jarxiv

Abstract: Simulating long-term human-scene interactions …

Categories: cs.AI, cs.CL, cs.CV, cs.GR

VLSBench: Unveiling Visual Leakage in Multimodal Safety

Posted: December 2, 2024 | Author: jarxiv

Abstract: Safety concerns regarding multimodal large language models (MLLMs) …

Categories: cs.AI, cs.CL, cs.CR, cs.CV

Reanimating Images using Neural Representations of Dynamic Stimuli

Posted: December 2, 2024 | Author: jarxiv

Abstract: Computer vision models have made remarkable progress in static image recognition …

Categories: cs.AI, cs.CV, q-bio.NC

DELT: A Simple Diversity-driven EarlyLate Training for Dataset Distillation

Posted: December 2, 2024 | Author: jarxiv

Abstract: Recent advances in dataset distillation have led to solutions in two main directions …

Categories: cs.AI, cs.CV, cs.LG

Large Language Model-Brained GUI Agents: A Survey

Posted: December 2, 2024 | Author: jarxiv

Abstract: GUIs have long been central to human-computer interaction, digital systems …

Categories: cs.AI, cs.CL, cs.HC
