Monthly Archives: February 2024

Faithful Temporal Question Answering over Heterogeneous Sources

Posted on February 26, 2024 by jarxiv

Abstract: Temporal question answering (QA) involves "… in 2019" or "… the novel corona …

Categories: cs.CL, cs.IR

The Challenges of Machine Learning for Trust and Safety: A Case Study on Misinformation Detection

Posted on February 26, 2024 by jarxiv

Abstract: Using misinformation detection as a case study, we … machine learning to trust and …

Categories: cs.CL, cs.CY, cs.LG

Towards Efficient and Exact Optimization of Language Model Alignment

Posted on February 26, 2024 by jarxiv

Abstract: Aligning language models with human preferences … applying language models to real-world tasks …

Categories: cs.CL

PREDILECT: Preferences Delineated with Zero-Shot Language-based Reasoning in Reinforcement Learning

Posted on February 26, 2024 by jarxiv

Abstract: Preference-based reinforcement learning (RL) has emerged as a new field in robot learning …

Categories: cs.CL, cs.LG, cs.RO

Distilled Self-Critique of LLMs with Synthetic Data: a Bayesian Perspective

Posted on February 26, 2024 by jarxiv

Abstract: By introducing distilled Self-Critique (dSC), this paper … RLA …

Categories: cs.CL, cs.LG

Repetition Improves Language Model Embeddings

Posted on February 26, 2024 by jarxiv

Abstract: … improving the extraction of text embeddings from autoregressive large language models (LLMs) …

Categories: cs.CL, cs.LG

Making Reasoning Matter: Measuring and Improving Faithfulness of Chain-of-Thought Reasoning

Posted on February 26, 2024 by jarxiv

Abstract: Large language models (LLMs), when asked to reason step by step before answering a question, …

Categories: cs.CL

Leveraging Domain Knowledge for Efficient Reward Modelling in RLHF: A Case-Study in E-Commerce Opinion Summarization

Posted on February 26, 2024 by jarxiv

Abstract: Reinforcement learning from human feedback (RLHF) … language models (LMs) …

Categories: cs.CL, cs.LG

Prejudice and Caprice: A Statistical Framework for Measuring Social Discrimination in Large Language Models

Posted on February 26, 2024 by jarxiv

Abstract: As large language models (LLMs) are increasingly integrated into social operations, … economics, law, …

Categories: cs.CL, cs.CY

Chain of Logic: Rule-Based Reasoning with Large Language Models

Posted on February 26, 2024 by jarxiv

Abstract: Rule-based reasoning, a fundamental type of legal reasoning, … rules to a set of facts …

Categories: cs.CL
