Archive of posts by "jarxiv"

Large Language Models Reflect the Ideology of their Creators

Posted: January 31, 2025 | Author: jarxiv

Abstract: Large language models (LLMs), using vast amounts of data to generate natural language, …

Categories: cs.CL, cs.LG

CALM: Unleashing the Cross-Lingual Self-Aligning Ability of Language Model Question Answering

Posted: January 31, 2025 | Author: jarxiv

Abstract: For large language models (LLMs), both language-specific cultural knowledge and general knowledge …

Categories: cs.CL

xJailbreak: Representation Space Guided Reinforcement Learning for Interpretable LLM Jailbreaking

Posted: January 31, 2025 | Author: jarxiv

Abstract: Safety alignment mechanisms, with respect to large language models (LLMs) and harmful information or …

Categories: cs.CL

LLM-AutoDiff: Auto-Differentiate Any LLM Workflow

Posted: January 31, 2025 | Author: jarxiv

Abstract: Large language models (LLMs) and natural language processing, from multi-hop retrieval …

Categories: cs.CL

WILDCHAT-50M: A Deep Dive Into the Role of Synthetic Data in Post-Training

Posted: January 31, 2025 | Author: jarxiv

Abstract: From DPO to distillation, post-training of language models (LLMs) refines behaviors, and new …

Categories: cs.CL, cs.LG

Streaming DiLoCo with overlapping communication: Towards a Distributed Free Lunch

Posted: January 31, 2025 | Author: jarxiv

Abstract: To reduce training time, training of large language models (LLMs) is typically …

Categories: cs.CL

GroUSE: A Benchmark to Evaluate Evaluators in Grounded Question Answering

Posted: January 31, 2025 | Author: jarxiv

Abstract: Retrieval-augmented generation (RAG), alongside private and up-to-date knowledge bases, …

Categories: cs.CL, I.2.7

Differentially Private Steering for Large Language Model Alignment

Posted: January 31, 2025 | Author: jarxiv

Abstract: Aligning large language models (LLMs) with human values, and undesirable behaviors such as …

Categories: cs.CL, cs.LG

Verify with Caution: The Pitfalls of Relying on Imperfect Factuality Metrics

Posted: January 31, 2025 | Author: jarxiv

Abstract: With improvements in large language models, as reliable evaluators of natural language generation outputs …

Categories: cs.CL, cs.LG

Thoughts Are All Over the Place: On the Underthinking of o1-Like LLMs

Posted: January 31, 2025 | Author: jarxiv

Abstract: For large language models (LLMs) such as OpenAI's o1, test-time compute …

Categories: cs.CL
