"cs.CV" Category Archive

ADAT: Time-Series-Aware Adaptive Transformer Architecture for Sign Language Translation

Posted: April 17, 2025 | Author: jarxiv

Summary: To convert signs into text, current sign language machine translation systems rely on hand movements, …

Categories: cs.AI, cs.CL, cs.CV, I.2.10

Transforming Science with Large Language Models: A Survey on AI-assisted Scientific Discovery, Experimentation, Content Generation, and Evaluation

Posted: April 17, 2025 | Author: jarxiv

Summary: With the advent of large multimodal language models, science is now … AI-ba …

Categories: cs.AI, cs.CL, cs.CV, cs.LG

RadMamba: Efficient Human Activity Recognition through Radar-based Micro-Doppler-Oriented Mamba State-Space Model

Posted: April 17, 2025 | Author: jarxiv

Summary: Owing to its distinctive privacy-preservation and robustness advantages, radar-based human activity recognition (HAR) …

Categories: cs.AI, cs.CV, cs.LG

RSTeller: Scaling Up Visual Language Modeling in Remote Sensing with Rich Linguistic Semantics from Openly Available Data and Large Language Models

Posted: April 17, 2025 | Author: jarxiv

Summary: Abundant, well-annotated multimodal data in remote sensing … complex …

Categories: cs.AI, cs.CV, I.2.10

StreamingT2V: Consistent, Dynamic, and Extendable Long Video Generation from Text

Posted: April 17, 2025 | Author: jarxiv

Summary: Text-to-video diffusion models enable the generation of high-quality videos that follow text instructions …

Categories: cs.AI, cs.CL, cs.CV, cs.LG, cs.MM, eess.IV

AttentionDrop: A Novel Regularization Method for Transformer Models

Posted: April 17, 2025 | Author: jarxiv

Summary: Transformer-based architectures, across natural language processing, computer vision, and …

Categories: cs.AI, cs.CV, cs.LG

Generalized Visual Relation Detection with Diffusion Models

Posted: April 17, 2025 | Author: jarxiv

Summary: Visual relation detection (VRD) aims to identify relationships between object pairs in an image (or interactions …

Categories: cs.CV

Collaborative Learning for Enhanced Unsupervised Domain Adaptation

Posted: April 17, 2025 | Author: jarxiv

Summary: Unsupervised domain adaptation (UDA), on a labeled source domain, …

Categories: cs.CV

Metric-Solver: Sliding Anchored Metric Depth Estimation from a Single Image

Posted: April 17, 2025 | Author: jarxiv

Summary: In a variety of computer vision applications, accurate and generalizable …

Categories: cs.CV

Logits DeConfusion with CLIP for Few-Shot Learning

Posted: April 17, 2025 | Author: jarxiv

Summary: CLIP, with its strong vision-language alignment capability, … zero-shot and few-shot …

Categories: cs.CV
