Archive of posts by "jarxiv"

SNAP: A Benchmark for Testing the Effects of Capture Conditions on Fundamental Vision Tasks

Posted: May 22, 2025 | Author: jarxiv

Abstract: Deep learning (DL)-based computer vision algorithms …

Categories: cs.CV

Oral Imaging for Malocclusion Issues Assessments: OMNI Dataset, Deep Learning Baselines and Benchmarking

Posted: May 22, 2025 | Author: jarxiv

Abstract: Malocclusion is a major challenge in orthodontics, and because of its complex presentation and diverse clinical manifestations …

Categories: cs.CV

FragFake: A Dataset for Fine-Grained Detection of Edited Images with Vision Language Models

Posted: May 22, 2025 | Author: jarxiv

Abstract: In particular, the latest diffusion models and image editing methods can produce highly realistic manipulations …

Categories: cs.AI, cs.CR, cs.CV

How far can we go with ImageNet for Text-to-Image generation?

Posted: May 22, 2025 | Author: jarxiv

Abstract: For recent text-to-image generation models, the "bigger is better" paradigm …

Categories: cs.CV

The Devil is in Fine-tuning and Long-tailed Problems: A New Benchmark for Scene Text Detection

Posted: May 22, 2025 | Author: jarxiv

Abstract: In scene text detection, high-performing methods that excel on academic benchmarks …

Categories: cs.CV

Exploring the Limits of Vision-Language-Action Manipulations in Cross-task Generalization

Posted: May 22, 2025 | Author: jarxiv

Abstract: The generalization ability of vision-language-action (VLA) models to unseen tasks …

Categories: cs.CV, cs.RO

Gompertz Linear Units: Leveraging Asymmetry for Enhanced Learning Dynamics

Posted: May 22, 2025 | Author: jarxiv

Abstract: Because activation functions strongly influence training dynamics, deep learning …

Categories: cs.AI, cs.CV, cs.LG

Faster Video Diffusion with Trainable Sparse Attention

Posted: May 22, 2025 | Author: jarxiv

Abstract: In scaling video diffusion transformers (DiTs), most of the attention mass falls on a small …

Categories: cs.CV

Enhancing Monte Carlo Dropout Performance for Uncertainty Quantification

Posted: May 22, 2025 | Author: jarxiv

Abstract: Knowing the uncertainty associated with the output of a deep neural network, especially in medical diagnosis …

Categories: cs.AI, cs.CV

Learning Task-preferred Inference Routes for Gradient De-conflict in Multi-output DNNs

Posted: May 22, 2025 | Author: jarxiv

Abstract: Multi-output deep neural networks (MONs) involve multiple tasks …

Categories: cs.CV
