Archive of posts by "jarxiv"

Taming the Randomness: Towards Label-Preserving Cropping in Contrastive Learning

Posted: April 29, 2025 | Author: jarxiv

Abstract: Contrastive learning (CL) approaches are a highly successful … of self-supervised learning (SSL) methods …

Category: cs.CV

HOIGaze: Gaze Estimation During Hand-Object Interactions in Extended Reality Exploiting Eye-Hand-Head Coordination

Posted: April 29, 2025 | Author: jarxiv

Abstract: We present HOIGaze: in extended reality (XR), hand …

Category: cs.CV

AnimateAnywhere: Rouse the Background in Human Image Animation

Posted: April 29, 2025 | Author: jarxiv

Abstract: Human image animation, adhering to a desired pose sequence, …

Category: cs.CV

SRMF: A Data Augmentation and Multimodal Fusion Approach for Long-Tail UHR Satellite Image Segmentation

Posted: April 29, 2025 | Author: jarxiv

Abstract: The long-tail problem in ultra-high-resolution (UHR) satellite imagery, in semantic …

Category: cs.CV

Foundation Model-Driven Framework for Human-Object Interaction Prediction with Segmentation Mask Integration

Posted: April 29, 2025 | Author: jarxiv

Abstract: In this work, segmentation-based vision foundation models and …

Categories: cs.AI, cs.CV

DD-rPPGNet: De-interfering and Descriptive Feature Learning for Unsupervised rPPG Estimation

Posted: April 29, 2025 | Author: jarxiv

Abstract: Remote photoplethysmography (rPPG), regarding the physiological … of facial videos, …

Category: cs.CV

NORA: A Small Open-Sourced Generalist Vision Language Action Model for Embodied Tasks

Posted: April 29, 2025 | Author: jarxiv

Abstract: Existing vision-language-action (VLA) models, in zero-shot scenarios, have promising …

Categories: cs.AI, cs.CV, cs.RO

CoherenDream: Boosting Holistic Text Coherence in 3D Generation via Multimodal Large Language Models Feedback

Posted: April 29, 2025 | Author: jarxiv

Abstract: Score distillation sampling (SDS), in text-to-3D content generation, remarkable …

Category: cs.CV

Towards Ball Spin and Trajectory Analysis in Table Tennis Broadcast Videos via Physically Grounded Synthetic-to-Real Transfer

Posted: April 29, 2025 | Author: jarxiv

Abstract: To analyze a table tennis player's technique, regarding the ball's 3D trajectory and spin …

Categories: cs.AI, cs.CV, cs.LG

Multi-Scale Grouped Prototypes for Interpretable Semantic Segmentation

Posted: April 29, 2025 | Author: jarxiv

Abstract: Prototypical part learning, making semantic segmentation interpretable …

Categories: cs.AI, cs.CV
