"cs.CV" Category Archive

Transformers and Large Language Models for Efficient Intrusion Detection Systems: A Comprehensive Survey

Posted on August 15, 2024 by jarxiv

Abstract: With significant advances in Transformers and LLMs, NLP has, in text gen …

Categories: cs.AI, cs.CL, cs.CR, cs.CV, eess.AS

InternVideo2: Scaling Foundation Models for Multimodal Video Understanding

Posted on August 15, 2024 by jarxiv

Abstract: In video recognition, video-text tasks, and video-centric dialogue, state-of-the-art …

Categories: cs.CV

DeepFace-Attention: Multimodal Face Biometrics for Attention Estimation with Application to e-Learning

Posted on August 15, 2024 by jarxiv

Abstract: In this work, using a set of facial analysis techniques applied to webcam video, …

Categories: cs.CV, cs.HC

Progressive Radiance Distillation for Inverse Rendering with Gaussian Splatting

Posted on August 15, 2024 by jarxiv

Abstract: We use a distillation progress map with physically based rendering and Gaussian-based rad …

Categories: cs.CV

Disentangle and denoise: Tackling context misalignment for video moment retrieval

Posted on August 15, 2024 by jarxiv

Abstract: Video moment retrieval concerns video moments in context, according to a natural-language query …

Categories: cs.CV

Panacea+: Panoramic and Controllable Video Generation for Autonomous Driving

Posted on August 15, 2024 by jarxiv

Abstract: In the field of autonomous driving, the demand for high-quality annotated video training data …

Categories: cs.CV

CLIP with Generative Latent Replay: a Strong Baseline for Incremental Learning

Posted on August 15, 2024 by jarxiv

Abstract: Transformers and vision-language models (VLMs) such as CLIP …

Categories: cs.AI, cs.CV, cs.LG

Rethinking the Key Factors for the Generalization of Remote Sensing Stereo Matching Networks

Posted on August 15, 2024 by jarxiv

Abstract: Stereo matching, a key step in 3D reconstruction, in remote sens …

Categories: cs.CV

Compact Model Training by Low-Rank Projection with Energy Transfer

Posted on August 15, 2024 by jarxiv

Abstract: Low-rankness plays an important role in traditional machine learning, but in deep learning it is not very …

Categories: cs.CV

Evolving from Single-modal to Multi-modal Facial Deepfake Detection: A Survey

Posted on August 15, 2024 by jarxiv

Abstract: Amid rapid advances in artificial intelligence, this survey takes up the critical challenge of deepfake detection …

Categories: cs.CV

