Archive of posts by "jarxiv"

ARMOR: Empowering Multimodal Understanding Model with Interleaved Multimodal Generation Capability

Posted: June 9, 2025 | Author: jarxiv

Abstract: In the vision and language fields, unified multimodal understanding and generation have recently …

Categories: cs.AI, cs.CV

A Novel Large-scale Crop Dataset and Dual-stream Transformer Method for Fine-grained Hierarchical Crop Classification from Integrated Hyperspectral EnMAP Data and Multispectral Sentinel-2 Time Series

Posted: June 9, 2025 | Author: jarxiv

Abstract: Fine-grained crop classification is crucial for precision agriculture and food security monitoring. …

Categories: cs.CV, cs.LG

In Search of Forgotten Domain Generalization

Posted: June 9, 2025 | Author: jarxiv

Abstract: Out-of-domain (OOD) generalization is the ability of a model trained on one or more domains …

Categories: cs.CV

Technical Report for Egocentric Mistake Detection for the HoloAssist Challenge

Posted: June 9, 2025 | Author: jarxiv

Abstract: This report covers a task essential in domains such as industrial automation and education: online mistake …

Categories: cs.CV

SatelliteFormula: Multi-Modal Symbolic Regression from Remote Sensing Imagery for Physics Discovery

Posted: June 9, 2025 | Author: jarxiv

Abstract: Directly deriving physically interpretable expressions from multispectral remote sensing imagery …

Categories: cs.CV

From Prototypes to General Distributions: An Efficient Curriculum for Masked Image Modeling

Posted: June 9, 2025 | Author: jarxiv

Abstract: Masked Image Modeling (MIM), for visual representation learning …

Categories: cs.CV

SemiOccam: A Robust Semi-Supervised Image Recognition Network Using Sparse Labels

Posted: June 9, 2025 | Author: jarxiv

Abstract: SemiOccam is an image recognition network that leverages semi-supervised learning in a highly efficient manner …

Categories: cs.AI, cs.CV, cs.LG

3DFlowAction: Learning Cross-Embodiment Manipulation from 3D Flow World Model

Posted: June 9, 2025 | Author: jarxiv

Abstract: Manipulation has long been a challenging task for robots, whereas humans … on a mug rack …

Categories: cs.CV, cs.RO

Pseudo-labelling meets Label Smoothing for Noisy Partial Label Learning

Posted: June 9, 2025 | Author: jarxiv

Abstract: Curating fully annotated datasets is expensive, and fine-grained classification …

Categories: cs.CV, cs.LG

PuzzleWorld: A Benchmark for Multimodal, Open-Ended Reasoning in Puzzlehunts

Posted: June 9, 2025 | Author: jarxiv

Abstract: Puzzlehunts lack well-defined problem definitions and are complex, multi …

Categories: cs.AI, cs.CL, cs.CV
