Monthly Archive: August 2024

LongVILA: Scaling Long-Context Visual Language Models for Long Videos

Posted: August 21, 2024 | Author: jarxiv

Summary: Long-context capability is important for multimodal foundation models. …

Categories: cs.CL, cs.CV

Identifying Query-Relevant Neurons in Large Language Models for Long-Form Texts

Posted: August 21, 2024 | Author: jarxiv

Summary: Large language models (LLMs) hold vast amounts of knowledge within their parameters …

Categories: cs.CL

‘Image, Tell me your story!’ Predicting the original meta-context of visual misinformation

Posted: August 21, 2024 | Author: jarxiv

Summary: To assist human fact-checkers, researchers … visual misinformation detection …

Categories: cs.CL

PLUTUS: A Well Pre-trained Large Unified Transformer can Unveil Financial Time Series Regularities

Posted: August 21, 2024 | Author: jarxiv

Summary: Financial time series modeling is important for understanding and predicting market behavior, but nonlinear …

Categories: cs.AI, cs.LG

Constructing Domain-Specific Evaluation Sets for LLM-as-a-judge

Posted: August 21, 2024 | Author: jarxiv

Summary: Large language models (LLMs) have revolutionized the machine learning landscape, but …

Categories: cs.AI, cs.LG

LoopSplat: Loop Closure by Registering 3D Gaussian Splats

Posted: August 21, 2024 | Author: jarxiv

Summary: Simultaneous localization and … based on 3D Gaussian Splats (3DGS) …

Categories: cs.CV, cs.RO

AdaResNet: Enhancing Residual Networks with Dynamic Weight Adjustment for Improved Feature Integration

Posted: August 20, 2024 | Author: jarxiv

Summary: In very deep neural networks, gradients during backpropagation …

Categories: cs.AI, cs.LG

DatasetNeRF: Efficient 3D-aware Data Factory with Generative Radiance Fields

Posted: August 20, 2024 | Author: jarxiv

Summary: Progress in 3D computer vision tasks requires vast amounts of data, but …

Categories: cs.CV

Intuitive Human-Robot Interface: A 3-Dimensional Action Recognition and UAV Collaboration Framework

Posted: August 20, 2024 | Author: jarxiv

Summary: Using human movements to control unmanned aerial vehicles (UAVs) … their deployment …

Categories: cs.RO

V2X-VLM: End-to-End V2X Cooperative Autonomous Driving Through Large Vision-Language Models

Posted: August 20, 2024 | Author: jarxiv

Summary: Advances in autonomous driving, from environmental perception to vehicle navigation and control, …

Categories: cs.AI, cs.LG, cs.RO
