Monthly Archive: February 2024

AutoGCN — Towards Generic Human Activity Recognition with Neural Architecture Search

Posted on: February 5, 2024 | Author: jarxiv

Summary: In this paper, using graph convolutional networks (GCNs), human activity recognition (HAR …

Categories: cs.CV

A general framework for rotation invariant point cloud analysis

Posted on: February 5, 2024 | Author: jarxiv

Summary: For deep-learning-based point cloud analysis, a general, input-rotation-invariant …

Categories: cs.CV

Simulator-Free Visual Domain Randomization via Video Games

Posted on: February 5, 2024 | Author: jarxiv

Summary: Domain randomization, across visually distinct domains showing similar content, …

Categories: cs.AI, cs.CV

InstructPix2NeRF: Instructed 3D Portrait Editing from a Single Image

Posted on: February 5, 2024 | Author: jarxiv

Summary: In 3D-aware portrait editing, Neural Radiance Fi …

Categories: cs.CV

Skip $\textbackslash n$: A simple method to reduce hallucination in Large Vision-Language Models

Posted on: February 5, 2024 | Author: jarxiv

Summary: Thanks to recent advances in large vision-language models (LVLMs), human-language-based visual …

Categories: cs.AI, cs.CL, cs.CV, cs.LG

Describing Images $\textit{Fast and Slow}$: Quantifying and Predicting the Variation in Human Signals during Visuo-Linguistic Processes

Posted on: February 5, 2024 | Author: jarxiv

Summary: There is a complex relationship between the properties of an image and how humans behave when describing that image. …

Categories: cs.AI, cs.CL, cs.CV

Pix4Point: Image Pretrained Standard Transformers for 3D Point Cloud Understanding

Posted on: February 5, 2024 | Author: jarxiv

Summary: In natural language processing and computer vision, Transformers have been remarkably …

Categories: cs.CV

FindingEmo: An Image Dataset for Emotion Recognition in the Wild

Posted on: February 5, 2024 | Author: jarxiv

Summary: FindingEmo, which contains annotations for 25,000 images, is an emotion …

Categories: cs.AI, cs.CV

LIR: Efficient Degradation Removal for Lightweight Image Restoration

Posted on: February 5, 2024 | Author: jarxiv

Summary: In recent years, image restoration based on CNNs and Transformers has made significant progress. However, image restoration …

Categories: cs.CV

Cheating Suffix: Targeted Attack to Text-To-Image Diffusion Models with Multi-Modal Priors

Posted on: February 5, 2024 | Author: jarxiv

Summary: Diffusion models have been widely deployed in various image generation tasks, and the image and text modali …

Categories: cs.CR, cs.CV, cs.LG

