Monthly Archive: June 2024

An End-to-End, Segmentation-Free, Arabic Handwritten Recognition Model on KHATT

Posted on June 24, 2024 by jarxiv

Abstract: Using a DCNN for feature extraction and bidirectional long short-term memory (BLS … Read more →

Categories: cs.AI, cs.CV

Masked Extended Attention for Zero-Shot Virtual Try-On In The Wild

Posted on June 24, 2024 by jarxiv

Abstract: Virtual try-on (VTON) is a highly active research area with growing demand … Read more →

Categories: cs.CV, cs.GR, cs.LG

GeoLRM: Geometry-Aware Large Reconstruction Model for High-Quality 3D Gaussian Generation

Posted on June 24, 2024 by jarxiv

Abstract: This work introduces the Geometry-Aware Large Reconstruction Model (GeoLRM), which, with only … Read more →

Categories: cs.CV

Multimodal Task Vectors Enable Many-Shot Multimodal In-Context Learning

Posted on June 24, 2024 by jarxiv

Abstract: In few-shot learning, interleaved large multimodal models (LM … Read more →

Categories: cs.AI, cs.CL, cs.CV, cs.LG

Keystroke Dynamics Against Academic Dishonesty in the Age of LLMs

Posted on June 24, 2024 by jarxiv

Abstract: The shift to online exams and assignments has raised major concerns about academic integrity … Read more →

Categories: cs.CV, cs.CY, I.5.4

Image Conductor: Precision Control for Interactive Video Synthesis

Posted on June 24, 2024 by jarxiv

Abstract: Filmmaking and animation production often involve camera transitions and ob … Read more →

Categories: cs.AI, cs.CV, cs.MM

Full-Scale Indexing and Semantic Annotation of CT Imaging: Boosting FAIRness

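Posted on June 24, 2024 by jarxiv

Abstract: Background: The integration of artificial intelligence into medicine has led to major advances, particularly in diagnosis and treatment planning … Read more →

Categories: cs.CV, eess.IV, I.4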

NAVSIM: Data-Driven Non-Reactive Autonomous Vehicle Simulation and Benchmarking

Posted on June 24, 2024 by jarxiv

Abstract: Benchmarking vision-based driving policies is challenging. On the one hand, real data … Read more →

Categories: cs.AI, cs.CV, cs.LG, cs.RO

Are LLMs Naturally Good at Synthetic Tabular Data Generation?

Posted on June 24, 2024 by jarxiv

Abstract: Large language models (LLMs) have demonstrated the ability to generate synthetic text and images … Read more →

Categories: cs.LG

Advancing Fine-Grained Classification by Structure and Subject Preserving Augmentation

Posted on June 24, 2024 by jarxiv

Abstract: Fine-grained visual classification (FGVC) involves classifying closely related subclasses … Read more →

Categories: cs.CV
