"cs.AI" Category Archive

HDLdebugger: Streamlining HDL debugging with Large Language Models

Posted on: March 19, 2024 | Author: jarxiv

Summary: In the domain of chip design, hardware description languages (HDLs) play a pivotal role … Continue reading →

Categories: cs.AI, cs.AR, cs.CE, cs.LG, cs.SE | Comments off

Is it Really Negative? Evaluating Natural Language Video Localization Performance on Multiple Reliable Videos Pool

Posted on: March 19, 2024 | Author: jarxiv

Summary: With the recent surge in multimedia content, from multiple videos, a specific natural language … Continue reading →

Categories: cs.AI, cs.CL, cs.CV | Comments off

CMMMU: A Chinese Massive Multi-discipline Multimodal Understanding Benchmark

Posted on: March 19, 2024 | Author: jarxiv

Summary: As the capabilities of large multimodal models (LMMs) continue to evolve, L … Continue reading →

Categories: cs.AI, cs.CL, cs.CV | Comments off

Deep Homography Estimation for Visual Place Recognition

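Posted on: March 19, 2024 | Author: jarxiv

Summary: Visual place recognition (VPR), in many applications such as robot localization and augmented reality, … Continue reading →

Categories: cs.AI, cs.CV | Comments off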

Hybrid Reasoning Based on Large Language Models for Autonomous Car Driving

Posted on: March 19, 2024 | Author: jarxiv

Summary: Large language models (LLMs) understand text and images, and human-like text … Continue reading →

Categories: cs.AI, cs.CV | Comments off

QEAN: Quaternion-Enhanced Attention Network for Visual Dance Generation

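Posted on: March 19, 2024 | Author: jarxiv

Summary: Research on music-driven dance generation is a novel and challenging image generation task. Music and … Continue reading →

Categories: cs.AI, cs.CV, cs.GR, cs.MM, cs.SD, eess.AS | Comments off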

Stop Reasoning! When Multimodal LLMs with Chain-of-Thought Reasoning Meets Adversarial Images

Posted on: March 19, 2024 | Author: jarxiv

Summary: Recently, multimodal LLMs (MLLMs) have shown an excellent ability to understand images … Continue reading →

Categories: cs.AI, cs.CR, cs.CV, cs.LG | Comments off

LLaVA-UHD: an LMM Perceiving Any Aspect Ratio and High-Resolution Images

Posted on: March 19, 2024 | Author: jarxiv

Summary: Visual encoding, in understanding the visual world, for large multimodal … Continue reading →

Categories: cs.AI, cs.CV | Comments off

Mastering Text, Code and Math Simultaneously via Fusing Highly Specialized Language Models

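Posted on: March 19, 2024 | Author: jarxiv

Summary: The underlying data distributions of natural language, programming code, and mathematical symbols differ greatly … Continue reading →

Categories: cs.AI, cs.CL | Comments off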

Towards Embedding Dynamic Personas in Interactive Robots: Masquerading Animated Social Kinematics (MASK)

Posted on: March 18, 2024 | Author: jarxiv

Summary: This paper, using character-like personas to enhance audience engagement … Continue reading →

Categories: cs.AI, cs.RO | Comments off
