Monthly Archives: July 2023

Domain-Agnostic Tuning-Encoder for Fast Personalization of Text-To-Image Models

Posted: July 14, 2023 | Author: jarxiv

Abstract: Through text-to-image (T2I) personalization, … Continue reading →

Categories: cs.CV, cs.GR, cs.LG | Comments closed

mBLIP: Efficient Bootstrapping of Multilingual Vision-LLMs

Posted: July 14, 2023 | Author: jarxiv

Abstract: Modular vision-language models (Vision-LLMs) … Continue reading →

Categories: cs.CL, cs.CV | Comments closed

Animate-A-Story: Storytelling with Retrieval-Augmented Video Generation

Posted: July 14, 2023 | Author: jarxiv

Abstract: Generating videos for visual storytelling typically involves live-action filming or graphi … Continue reading →

Categories: cs.CV | Comments closed

On the Connection between Game-Theoretic Feature Attributions and Counterfactual Explanations

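Posted: July 14, 2023 | Author: jarxiv

Abstract: Explainable artificial intelligence (XAI) has attracted wide interest in recent years, and the most popular … Continue reading →

Categories: cs.AI, cs.CV, cs.GT, cs.HC, cs.LG, I.2 | Comments closed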

InternVid: A Large-scale Video-Text Dataset for Multimodal Understanding and Generation

Posted: July 14, 2023 | Author: jarxiv

Abstract: In this paper, for multimodal understanding and generation, powerful and transferable video-te … Continue reading →

Categories: cs.CV | Comments closed

Video-FocalNets: Spatio-Temporal Focal Modulation for Video Action Recognition

Posted: July 14, 2023 | Author: jarxiv

Abstract: Recent video recognition models, for long-range spatio-temporal context modeling, Tra … Continue reading →

Categories: cs.AI, cs.CV | Comments closed

Self-regulating Prompts: Foundational Model Adaptation without Forgetting

Posted: July 14, 2023 | Author: jarxiv

Abstract: Prompt learning, fine-tuning foundational models such as CLIP for various downstream tasks … Continue reading →

Categories: cs.CV | Comments closed

HyperDreamBooth: HyperNetworks for Fast Personalization of Text-to-Image Models

Posted: July 14, 2023 | Author: jarxiv

Abstract: Personalization has emerged as a prominent aspect of the field of generative AI, … Continue reading →

Categories: cs.AI, cs.CV, cs.GR, cs.LG | Comments closed

Provably Faster Gradient Descent via Long Steps

Posted: July 14, 2023 | Author: jarxiv

Abstract: This work uses a computer-assisted analysis technique for faster … in gradient descent … Continue reading →

Categories: cs.LG, cs.NA, math.NA, math.OC | Comments closed

PatternGPT: A Pattern-Driven Framework for Large Language Model Text Generation

Posted: July 14, 2023 | Author: jarxiv

Abstract: Large language models (LLMs) generate fluent responses for many downstream tasks … Continue reading →

Categories: cs.AI, cs.CL | Comments closed
