Monthly Archives: January 2024

PixArt-$α$: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis

Posted: January 1, 2024 | Author: jarxiv

Summary: State-of-the-art Text-to-Image (T2I) models require substantial training …

Categories: cs.CV

FlowVid: Taming Imperfect Optical Flows for Consistent Video-to-Video Synthesis

Posted: January 1, 2024 | Author: jarxiv

Summary: Diffusion models have transformed image-to-image (I2I) synthesis and are now permeating into video …

Categories: cs.CV, cs.MM

Multiscale Vision Transformers meet Bipartite Matching for efficient single-stage Action Localization

Posted: January 1, 2024 | Author: jarxiv

Summary: Action localization is a challenging problem that combines detection and recognition tasks …

Categories: cs.CV

Comparing the robustness of modern no-reference image- and video-quality metrics to adversarial attacks

Posted: January 1, 2024 | Author: jarxiv

Summary: Nowadays, neural-network-based image- and video-quality metrics …

Categories: cs.CV, cs.LG, cs.MM, eess.IV

Toward Spatial Temporal Consistency of Joint Visual Tactile Perception in VR Applications

Posted: January 1, 2024 | Author: jarxiv

Summary: With the development of VR technology, especially the emergence of the metaverse concept, the integration of visual and tactile …

Categories: cs.RO

Difficulties in Dynamic Analysis of Drone Firmware and Its Solutions

Posted: January 1, 2024 | Author: jarxiv

Summary: With the advancement of Internet of Things (IoT) technology, its applications in the public …

Categories: cs.CR, cs.RO

Length Extrapolation of Transformers: A Survey from the Perspective of Position Encoding

Posted: January 1, 2024 | Author: jarxiv

Summary: The Transformer, with its strong ability to model complex dependencies within sequences …

Categories: cs.CL

Experiential Co-Learning of Software-Developing Agents

Posted: January 1, 2024 | Author: jarxiv

Summary: Owing to recent advances in large language models (LLMs), particularly LLM-driven autonomous …

Categories: cs.AI, cs.CL, cs.LG, cs.SE

AccidentGPT: Accident Analysis and Prevention from V2X Environmental Perception with Multi-modal Large Model

Posted: January 1, 2024 | Author: jarxiv

Summary: Traffic accidents contribute significantly to both human casualties and property damage, and many in the field of traffic safety …

Categories: cs.AI, cs.CE

DAP: Domain-aware Prompt Learning for Vision-and-Language Navigation

Posted: January 1, 2024 | Author: jarxiv

Summary: Following language instructions to navigate unseen environments is, for autonomous embodied …

Categories: cs.CV
