"cs.CL" Category Archive

SimLayerKV: A Simple Framework for Layer-Level KV Cache Reduction

Posted: October 18, 2024 | Author: jarxiv

Abstract: With recent advances in large language models (LLMs), processing long contexts …

Categories: cs.AI, cs.CL, cs.LG

Towards Multilingual LLM Evaluation for European Languages

Posted: October 18, 2024 | Author: jarxiv

Abstract: With the rise of large language models (LLMs), across numerous languages and tasks …

Categories: cs.AI, cs.CL, cs.LG

How Numerical Precision Affects Mathematical Reasoning Capabilities of LLMs

Posted: October 18, 2024 | Author: jarxiv

Abstract: Transformer-based large language models (LLMs), in a variety of …

Categories: cs.AI, cs.CL, cs.LG, stat.ML

H2OVL-Mississippi Vision Language Models Technical Report

Posted: October 18, 2024 | Author: jarxiv

Abstract: Small vision-language models (VLMs), for processing enterprise commercial documents and images, …

Categories: cs.AI, cs.CL, cs.CV, cs.LG

Granular Privacy Control for Geolocation with Vision Language Models

Posted: October 18, 2024 | Author: jarxiv

Abstract: The ability of vision-language models (VLMs) to answer information-seeking questions is rapidly advancing …

Categories: cs.CL, cs.CV

VL-GLUE: A Suite of Fundamental yet Challenging Visuo-Linguistic Reasoning Tasks

Posted: October 18, 2024 | Author: jarxiv

Abstract: Deriving inferences from heterogeneous inputs (such as images, text, and audio) is something that humans …

Categories: cs.CL, cs.CV

Pose-Based Sign Language Appearance Transfer

Posted: October 18, 2024 | Author: jarxiv

Abstract: We introduce a method for transferring a signer's appearance in sign language skeletal poses while preserving the sign content …

Categories: cs.CL, cs.CV

Beyond Coarse-Grained Matching in Video-Text Retrieval

Posted: October 18, 2024 | Author: jarxiv

Abstract: Video-text retrieval has advanced considerably, but distinguishing subtle differences in captions …

Categories: cs.CL, cs.CV, cs.MM

Exploring the Design Space of Visual Context Representation in Video MLLMs

Posted: October 18, 2024 | Author: jarxiv

Abstract: Video multimodal large language models (MLLMs), across a variety of downstream tasks …

Categories: cs.CL, cs.CV

Harnessing Webpage UIs for Text-Rich Visual Understanding

Posted: October 18, 2024 | Author: jarxiv

Abstract: For multimodal large language models (MLLMs) to interact effectively with structured environments …

Categories: cs.CL, cs.CV
