"cs.AI" Category Archive

Reanimating Images using Neural Representations of Dynamic Stimuli

Posted on March 26, 2025 by jarxiv

Summary: Computer vision models have made incredible progress in static image recognition …

Categories: cs.AI, cs.CV, q-bio.NC

Structuring Scientific Innovation: A Framework for Modeling and Discovering Impactful Knowledge Combinations

Posted on March 26, 2025 by jarxiv

Summary: With the emergence of large language models, new possibilities for the structured exploration of scientific knowledge …

Categories: cs.AI

MC-LLaVA: Multi-Concept Personalized Vision-Language Model

Posted on March 26, 2025 by jarxiv

Summary: Current vision-language models (VLMs), across a variety of tasks such as visual question answering …

Categories: cs.AI, cs.CV

Aether: Geometric-Aware Unified World Modeling

Posted on March 26, 2025 by jarxiv

Summary: The integration of geometric reconstruction and generative modeling, for AI capable of human-like spatial reasoning …

Categories: cs.AI, cs.CV, cs.LG, cs.RO

Any6D: Model-free 6D Pose Estimation of Novel Objects

Posted on March 26, 2025 by jarxiv

Summary: We introduce Any6D, a model-free framework for 6D object pose estimation …

Categories: cs.AI, cs.CV, cs.RO

Lightweight Embedded FPGA Deployment of Learned Image Compression with Knowledge Distillation and Hybrid Quantization

Posted on March 26, 2025 by jarxiv

Summary: Learned image compression (LIC), compared with standardized video codecs in RD efficiency …

Categories: cs.AI, cs.CV

Commander-GPT: Fully Unleashing the Sarcasm Detection Capability of Multi-Modal Large Language Models

Posted on March 26, 2025 by jarxiv

Summary: Sarcasm detection, as an important research direction in the field of natural language processing (NLP), across a wide range …

Categories: cs.AI, cs.CL

Frequency Dynamic Convolution for Dense Image Prediction

Posted on March 26, 2025 by jarxiv

Summary: Dynamic convolution (DY-CONV), with multiple parallel weights combined with an attention mechanism …

Categories: cs.AI, cs.CV

Latent Embedding Adaptation for Human Preference Alignment in Diffusion Planners

Posted on March 25, 2025 by jarxiv

Summary: This work [concerns] a resource-efficient approach that can rapidly adapt to individual users' preferences …

Categories: cs.AI, cs.LG, cs.RO

Humanoid Policy ~ Human Policy

Posted on March 25, 2025 by jarxiv

Summary: Training manipulation policies for humanoid robots using diverse data …

Categories: cs.AI, cs.CV, cs.RO
