"cs.AI" Category Archive

Reparameterized LLM Training via Orthogonal Equivalence Transformation

Posted on June 10, 2025 by jarxiv

Abstract: Large language models (LLMs) are driving the rapid advance of artificial intelligence, but …

Categories: cs.AI, cs.CL, cs.LG

FreeGave: 3D Physics Learning from Dynamic Videos by Gaussian Velocity

Posted on June 10, 2025 by jarxiv

Abstract: In this paper, the geometry, appearance, and underlying physics of 3D scenes, purely …

Categories: cs.AI, cs.CE, cs.CV, cs.LG, cs.RO

Diffusion Counterfactual Generation with Semantic Abduction

Posted on June 10, 2025 by jarxiv

Abstract: For counterfactual image generation, preserving identity, maintaining perceptual quality, and the underlying causal …

Categories: cs.AI, cs.CV, cs.LG, stat.ML

GaussianVAE: Adaptive Learning Dynamics of 3D Gaussians for High-Fidelity Super-Resolution

Posted on June 10, 2025 by jarxiv

Abstract: Beyond the native training resolution, 3D Gaussian Splatting (3DG) …

Categories: cs.AI, cs.CV, cs.GR, cs.LG

Diffuse Everything: Multimodal Diffusion Models on Arbitrary State Spaces

Posted on June 10, 2025 by jarxiv

Abstract: Diffusion models, across a variety of tasks such as image, video, and text generation, unimodal data …

Categories: cs.AI, cs.CV, cs.LG

RONA: Pragmatically Diverse Image Captioning with Coherence Relations

Posted on June 10, 2025 by jarxiv

Abstract: Writing assistants (Grammarly, Microsoft Copi …

Categories: 68T50, cs.AI, cs.CL, cs.CV, I.2.10

Mimicking or Reasoning: Rethinking Multi-Modal In-Context Learning in Vision-Language Models

Posted on June 10, 2025 by jarxiv

Abstract: Vision-language models (VLMs), in a property shared with their language-only counterparts, …

Categories: cs.AI, cs.CL, cs.CV, cs.LG

Enhancing Few-Shot Vision-Language Classification with Large Multimodal Model Features

Posted on June 10, 2025 by jarxiv

Abstract: Llava, Qwen-VL, and similar generative large multimodal models (LMM …

Categories: cs.AI, cs.CL, cs.CV

Decoupling the Image Perception and Multimodal Reasoning for Reasoning Segmentation with Digital Twin Representations

Posted on June 10, 2025 by jarxiv

Abstract: In Reasoning Segmentation (RS), based on implicit text queries, objects …

Categories: cs.AI, cs.CV

CORDIAL: Can Multimodal Large Language Models Effectively Understand Coherence Relationships?

Posted on June 10, 2025 by jarxiv

Abstract: Multimodal large language models (MLLMs) have, across diverse problem domains, excellent …

Categories: cs.AI, cs.CL, cs.CV, I.2.10
