Monthly Archives: May 2024

Unveiling and Mitigating Backdoor Vulnerabilities based on Unlearning Weight Changes and Backdoor Activeness

Posted on May 31, 2024 by jarxiv

Abstract: The security threat posed by backdoor attacks, deep neural netw … Continue reading →

Categories: cs.CR, cs.CV

You Need to Pay Better Attention: Rethinking the Mathematics of Attention Mechanism

Posted on May 31, 2024 by jarxiv

Abstract: Scaled Dot Product Attention (SDPA), in many … Continue reading →

Categories: (Primary), 15A03, 15A04, 68T10, 68T50, cs.AI, cs.CL, cs.CV, cs.LG, I.2.10

Scaling White-Box Transformers for Vision

Posted on May 31, 2024 by jarxiv

Abstract: CRATE, designed to learn compressed and sparse representations, is a white-bo … Continue reading →

Categories: cs.CV

Can’t make an Omelette without Breaking some Eggs: Plausible Action Anticipation using Large Video-Language Models

Posted on May 31, 2024 by jarxiv

Abstract: For anticipating action sequences that are plausible in the real world, a large video-lang … Continue reading →

Categories: cs.CV

Multi-Prompt Alignment for Multi-Source Unsupervised Domain Adaptation

Posted on May 31, 2024 by jarxiv

Abstract: Most existing methods for unsupervised domain adaptation (UDA) … Continue reading →

Categories: cs.CV

A Pixel Is Worth More Than One 3D Gaussians in Single-View 3D Reconstruction

Posted on May 31, 2024 by jarxiv

Abstract: Learning 3D scene representations from a single-view image, from the input view … Continue reading →

Categories: cs.CV

ParSEL: Parameterized Shape Editing with Language

Posted on May 31, 2024 by jarxiv

Abstract: The ability to edit 3D assets from natural language, toward democratizing 3D content creation … Continue reading →

Categories: cs.AI, cs.CV, cs.GR, cs.HC, cs.SC

Improving the Training of Rectified Flows

Posted on May 31, 2024 by jarxiv

Abstract: Diffusion models hold great promise for image and video generation, but state-of-the-art models … Continue reading →

Categories: cs.AI, cs.CV, cs.LG

Vision-based Manipulation from Single Human Video with Open-World Object Graphs

Posted on May 31, 2024 by jarxiv

Abstract: To enable robots to learn vision-based manipulation skills from human videos, we … Continue reading →

Categories: cs.CV, cs.LG, cs.RO

$\textit{S}^3$Gaussian: Self-Supervised Street Gaussians for Autonomous Driving

Posted on May 31, 2024 by jarxiv

Abstract: Photorealistic 3D reconstruction of street scenes, for autonomous driving … Continue reading →

Categories: cs.AI, cs.CV
