"cs.CV" Category Archive

LogogramNLP: Comparing Visual and Textual Representations of Ancient Logographic Writing Systems for NLP

Posted: August 9, 2024 | Author: jarxiv

Summary: Standard natural language processing (NLP) pipelines typically, with a sequence of discrete tokens …

Categories: cs.AI, cs.CL, cs.CV

Puppet-Master: Scaling Interactive Video Generation as a Motion Prior for Part-Level Dynamics

Posted: August 9, 2024 | Author: jarxiv

Summary: Serving as a motion prior for part-level dynamics, an interactive …

Categories: cs.AI, cs.CV

Arctic-TILT. Business Document Understanding at Sub-Billion Scale

Posted: August 9, 2024 | Author: jarxiv

Summary: The vast majority of workloads that employ LLMs involve PDFs or scanned …

Categories: cs.CL, cs.CV

LiDAR-Event Stereo Fusion with Hallucinations

Posted: August 9, 2024 | Author: jarxiv

Summary: Event stereo matching, which estimates depth from neuromorphic cameras, …

Categories: cs.CV

ESP-MedSAM: Efficient Self-Prompting SAM for Universal Domain-Generalized Image Segmentation

Posted: August 9, 2024 | Author: jarxiv

Summary: The universality of deep neural networks across various modalities, and …

Categories: cs.CV, eess.IV

SynopGround: A Large-Scale Dataset for Multi-Paragraph Video Grounding from TV Dramas and Synopses

Posted: August 9, 2024 | Author: jarxiv

Summary: In multimodal content understanding, video grounding is a fundamental …

Categories: cs.CV, cs.MM

Compression-Realized Deep Structural Network for Video Quality Enhancement

Posted: August 9, 2024 | Author: jarxiv

Summary: This paper focuses on the task of quality enhancement for compressed video. Deep …

Categories: cs.CV, eess.IV

Advancing Prompt Learning through an External Layer

Posted: August 9, 2024 | Author: jarxiv

Summary: Prompt learning, by learning a set of text embeddings, pre-trained …

Categories: cs.CV

SAFE-SIM: Safety-Critical Closed-Loop Traffic Simulation with Diffusion-Controllable Adversaries

Posted: August 8, 2024 | Author: jarxiv

Summary: To evaluate the performance of autonomous vehicle planning algorithms, long-tail safety …

Categories: cs.AI, cs.CV, cs.LG, cs.RO, I.2.6

Opening the Black Box of 3D Reconstruction Error Analysis with VECTOR

Posted: August 8, 2024 | Author: jarxiv

Summary: Reconstructing 3D scenes from 2D images, from Earth and planetary science and space exploration to …

Categories: cs.CV, cs.RO
