Monthly Archives: July 2024

General Geometry-aware Weakly Supervised 3D Object Detection

Posted on: July 19, 2024 | Author: jarxiv

Abstract: 3D object detection is an essential component for scene understanding …

Category: cs.CV

Pose-guided multi-task video transformer for driver action recognition

Posted on: July 19, 2024 | Author: jarxiv

Abstract: We investigate the task of identifying distracted-driving situations through the analysis of in-car videos …

Category: cs.CV

LogoSticker: Inserting Logos into Diffusion Models for Customized Generation

Posted on: July 19, 2024 | Author: jarxiv

Abstract: With recent advances in the customization of text-to-image models, new …

Category: cs.CV

Exploring Facial Biomarkers for Depression through Temporal Analysis of Action Units

Posted on: July 19, 2024 | Author: jarxiv

Abstract: Depression is characterized by persistent sadness and loss of interest, and it significantly impairs daily functioning …

Category: cs.CV

Streetscapes: Large-scale Consistent Street View Generation Using Autoregressive Video Diffusion

Posted on: July 19, 2024 | Author: jarxiv

Abstract: Through a city-scale scene synthesized on the fly, we … corresponding to streetscapes …

Categories: cs.CV, cs.GR

SegPoint: Segment Any Point Cloud via Large Language Model

Posted on: July 19, 2024 | Author: jarxiv

Abstract: Despite significant progress in 3D point cloud segmentation, existing methods …

Category: cs.CV

Shape of Motion: 4D Reconstruction from a Single Video

Posted on: July 19, 2024 | Author: jarxiv

Abstract: Because the task is highly ill-posed, monocular dynamic reconstruction is a challenging, long-standing vision problem …

Category: cs.CV

Visual Haystacks: Answering Harder Questions About Sets of Images

Posted on: July 19, 2024 | Author: jarxiv

Abstract: With recent advances in large multimodal models (LMMs), single-image …

Category: cs.CV

Transformers Get Stable: An End-to-End Signal Propagation Theory for Language Models

Posted on: July 19, 2024 | Author: jarxiv

Abstract: Despite their great success, scaling Transformer models in depth remains …

Categories: cs.AI, cs.CL, cs.CV, cs.LG, I.2.10

Addressing Imbalance for Class Incremental Learning in Medical Image Classification

Posted on: July 19, 2024 | Author: jarxiv

Abstract: Deep convolutional neural networks, with training … for all classes …

Category: cs.CV
