Monthly Archive: February 2025

HiddenDetect: Detecting Jailbreak Attacks against Large Vision-Language Models via Monitoring Hidden States

Posted: February 21, 2025 | Author: jarxiv

Summary: When additional modalities are integrated, compared with language-only counterparts, jailbreak attacks and other …

Categories: cs.CL

Large Language Models Struggle to Describe the Haystack without Human Help: Human-in-the-loop Evaluation of LLMs

Posted: February 21, 2025 | Author: jarxiv

Summary: Common uses of NLP, from the use of traditional topic models to large language models …

Categories: cs.CL

TritonBench: Benchmarking Large Language Model Capabilities for Generating Triton Operators

Posted: February 21, 2025 | Author: jarxiv

Summary: A high-level, Python-like … designed for building efficient GPU kernels …

Categories: cs.CL, cs.LG

SurveyX: Academic Survey Automation via Large Language Models

Posted: February 21, 2025 | Author: jarxiv

Summary: Large language models (LLMs) have demonstrated exceptional comprehension abilities and a vast knowledge base …

Categories: cs.CL

How do Multimodal Foundation Models Encode Text and Speech? An Analysis of Cross-Lingual and Cross-Modal Representations

Posted: February 21, 2025 | Author: jarxiv

Summary: Multimodal foundation models, with differences in language syntax and modality and other such …

Categories: cs.CL

Measuring Faithfulness of Chains of Thought by Unlearning Reasoning Steps

Posted: February 21, 2025 | Author: jarxiv

Summary: When prompted to think step by step, language models (LMs), as the model generates its prediction, …

Categories: cs.CL

GATE: Graph-based Adaptive Tool Evolution Across Diverse Tasks

Posted: February 21, 2025 | Author: jarxiv

Summary: Large language models (LLMs) have shown great promise in tool creation, but existing …

Categories: 68T50, cs.CL, I.2.7

CLIPPER: Compression enables long-context synthetic data generation

Posted: February 21, 2025 | Author: jarxiv

Summary: LLM developers increasingly rely on synthetic data, but for complex long-context …

Categories: cs.CL

Prompt-to-Leaderboard

Posted: February 21, 2025 | Author: jarxiv

Summary: For large language model (LLM) evaluation, typically accuracy, human preference, and other aggregated …

Categories: cs.CL, cs.LG

Aligning LLMs to Ask Good Questions A Case Study in Clinical Reasoning

Posted: February 21, 2025 | Author: jarxiv

Summary: Large language models (LLMs), asking effective questions under uncertainty …

Categories: cs.CL
