Monthly Archives: September 2024

The Impact of Print-Scanning in Heterogeneous Morph Evaluation Scenarios

Posted on September 4, 2024 by jarxiv

Summary: Face morphing attacks pose a growing threat to face recognition (FR) systems …

Categories: cs.CV

Cross-Platform Video Person ReID: A New Benchmark Dataset and Adaptation Approach

Posted on September 4, 2024 by jarxiv

Summary: G2A-VReID, built from 185,907 images and 5,576 tracklets …

Categories: cs.CV

TagCLIP: Improving Discrimination Ability of Open-Vocabulary Semantic Segmentation

Posted on September 4, 2024 by jarxiv

Summary: Contrastive Language-Image Pre-training (CLIP) has recently, in pixel-level zero-shot …

Categories: cs.CV

Learn Suspected Anomalies from Event Prompts for Video Anomaly Detection

Posted on September 4, 2024 by jarxiv

Summary: In most models for weakly supervised video anomaly detection (WS-VAD), the types of anomalies …

Categories: cs.CV

Towards reliable respiratory disease diagnosis based on cough sounds and vision transformers

Posted on September 4, 2024 by jarxiv

Summary: With recent advances in deep learning techniques, based on multimodal medical data …

Categories: cs.CV, cs.LG, cs.SD, eess.AS

DocKylin: A Large Multimodal Model for Visual Document Understanding with Efficient Visual Slimming

Posted on September 4, 2024 by jarxiv

Summary: Current multimodal large language models (MLLMs), given the high resolution typical of document images …

Categories: cs.CV

Image-Based Virtual Try-On: A Survey

Posted on September 4, 2024 by jarxiv

Summary: Image-based virtual try-on synthesizes an image of a naturally dressed person with an image of a garment …

Categories: cs.CV

Learning from the Web: Language Drives Weakly-Supervised Incremental Learning for Semantic Segmentation

Posted on September 4, 2024 by jarxiv

Summary: Current weakly-supervised incremental learning for semantic segmentation (WILSS …

Categories: cs.CV

Correlation-Embedded Transformer Tracking: A Single-Branch Framework

Posted on September 4, 2024 by jarxiv

Summary: Developing robust and discriminative appearance models is a long-standing research challenge in visual object tracking …

Categories: cs.CV

White-Box Transformers via Sparse Rate Reduction: Compression Is All There Is?

Posted on September 4, 2024 by jarxiv

Summary: In this paper, a natural objective of representation learning concerns the distribution of the data, for example a set of tokens, …

Categories: cs.CL, cs.CV, cs.LG
