Monthly Archives: January 2025

Benchmarking Uncertainty Quantification Methods for Large Language Models with LM-Polygraph

Posted: January 13, 2025 | Author: jarxiv

Summary: The rapid proliferation of large language models (LLMs) has led researchers to … LLM hallucinations and … Continue reading →

Categories: cs.CL, cs.LG

ConSim: Measuring Concept-Based Explanations’ Effectiveness with Automated Simulatability

Posted: January 13, 2025 | Author: jarxiv

Summary: Concept-based explanations map the computations of complex models onto human-understandable concepts … Continue reading →

Categories: cs.CL

Paraphrase Types Elicit Prompt Engineering Capabilities

Posted: January 13, 2025 | Author: jarxiv

Summary: For much of the success of the latest language models, finding appropriate prompts to instruct the model … Continue reading →

Categories: cs.CL

Navigating Tomorrow: Reliably Assessing Large Language Models Performance on Future Event Prediction

Posted: January 13, 2025 | Author: jarxiv

Summary: Predicting future events, with applications across multiple fields and domains … Continue reading →

Categories: cs.CL, cs.IR

LLMs Reproduce Stereotypes of Sexual and Gender Minorities

Posted: January 13, 2025 | Author: jarxiv

Summary: Many studies have revealed significant gender bias in NLP systems … Continue reading →

Categories: cs.CL

Universal-2-TF: Robust All-Neural Text Formatting for ASR

Posted: January 13, 2025 | Author: jarxiv

Summary: In this paper, punctuation restoration (PR), truecasing, and inverse text … Continue reading →

Categories: cs.CL, I.2.7

Can Many-Shot In-Context Learning Help LLMs as Evaluators? A Preliminary Empirical Study

Posted: January 13, 2025 | Author: jarxiv

Summary: As evaluators for assessing the performance of large language models (LLMs) … Continue reading →

Categories: cs.CL

Finnish SQuAD: A Simple Approach to Machine Translation of Span Annotations

Posted: January 13, 2025 | Author: jarxiv

Summary: Using the DeepL MT service and its ability to translate formatted documents … Continue reading →

Categories: cs.CL

Towards Early Prediction of Self-Supervised Speech Model Performance

Posted: January 13, 2025 | Author: jarxiv

Summary: In self-supervised learning (SSL), pre-training and evaluation require large amounts of resources … Continue reading →

Categories: cs.CL, cs.LG, cs.SD, eess.AS

Hermit Kingdom Through the Lens of Multiple Perspectives: A Case Study of LLM Hallucination on North Korea

Posted: January 13, 2025 | Author: jarxiv

Summary: Hallucinations in large language models (LLMs), particularly the potential to spread misinformation … Continue reading →

Categories: cs.CL
