Monthly Archives: February 2025

Holistically Guided Monte Carlo Tree Search for Intricate Information Seeking

Posted on: February 10, 2025 | Author: jarxiv

Summary: In the era of vast digital information, the sheer volume and heterogeneity of available information …

Categories: cs.CL, cs.IR

UCFE: A User-Centric Financial Expertise Benchmark for Large Language Models

Posted on: February 10, 2025 | Author: jarxiv

Summary: This paper introduces UCFE: the User-Centric Financial Expertise benchmark …

Categories: cs.CE, cs.CL, q-fin.CP

Concept Navigation and Classification via Open Source Large Language Model Processing

Posted on: February 10, 2025 | Author: jarxiv

Summary: In this paper, using open-source large language models (LLMs), text …

Categories: cs.AI, cs.CL, cs.LG, I.2.7

ELITE: Enhanced Language-Image Toxicity Evaluation for Safety

Posted on: February 10, 2025 | Author: jarxiv

Summary: Current vision language models (VLMs), with malicious prompts that induce harmful outputs …

Categories: cs.CL, cs.CV

SeDi-Instruct: Enhancing Alignment of Language Models through Self-Directed Instruction Generation

Posted on: February 10, 2025 | Author: jarxiv

Summary: With the rapid evolution of large language models (LLMs), the industry, in various AI-based …

Categories: cs.CL

Probing Internal Representations of Multi-Word Verbs in Large Language Models

Posted on: February 10, 2025 | Author: jarxiv

Summary: In this study, within transformer-based large language models (LLMs), multi-word verbs and …

Categories: cs.CL

S$^2$-MAD: Breaking the Token Barrier to Enhance Multi-Agent Debate Efficiency

Posted on: February 10, 2025 | Author: jarxiv

Summary: Large language models (LLMs), across various natural language processing (NLP) scenarios …

Categories: cs.AI, cs.CL

Developmentally-plausible Working Memory Shapes a Critical Period for Language Acquisition

Posted on: February 10, 2025 | Author: jarxiv

Summary: Large language models exhibit general linguistic abilities, but their efficiency of language acquisition, compared with humans, …

Categories: cs.CL

Self-Rationalization in the Wild: A Large Scale Out-of-Distribution Evaluation on NLI-related tasks

Posted on: February 10, 2025 | Author: jarxiv

Summary: Free-text explanations are expressive and easy to understand, but in many datasets …

Categories: cs.CL

CASE-Bench: Context-Aware SafEty Benchmark for Large Language Models

Posted on: February 10, 2025 | Author: jarxiv

Summary: Aligning large language models (LLMs) with human values, for safe deployment and widespread …

Categories: cs.AI, cs.CL
