Monthly Archives: June 2024

LASER: Learning by Aligning Self-supervised Representations of Speech for Improving Content-related Tasks

Posted on: June 14, 2024 | Author: jarxiv

Summary: Self-supervised learning (SSL)-based speech models, full-stack speech processing … Continue reading →

Categories: cs.CL, cs.SD, eess.AS

Diffusion Gaussian Mixture Audio Denoise

Posted on: June 14, 2024 | Author: jarxiv

Summary: Recent diffusion models, in audio denoising tasks, promising performance … Continue reading →

Categories: cs.CL, cs.SD, eess.AS

DefAn: Definitive Answer Dataset for LLMs Hallucination Evaluation

Posted on: June 14, 2024 | Author: jarxiv

Summary: Large language models (LLMs) have demonstrated remarkable capabilities, and everyday-life applications … Continue reading →

Categories: cs.AI, cs.CL, cs.CV, cs.LG

Test of Time: A Benchmark for Evaluating LLMs on Temporal Reasoning

Posted on: June 14, 2024 | Author: jarxiv

Summary: Large language models (LLMs) show strong reasoning capabilities, but particularly complex … Continue reading →

Categories: cs.CL

ReMI: A Dataset for Reasoning with Multiple Images

Posted on: June 14, 2024 | Author: jarxiv

Summary: As large language models (LLMs) continue to advance, their expanded capabilities … Continue reading →

Categories: cs.CL, cs.CV

Orthogonality and isotropy of speaker and phonetic information in self-supervised speech representations

Posted on: June 14, 2024 | Author: jarxiv

Summary: Self-supervised speech representations greatly benefit downstream speech technologies, but … Continue reading →

Categories: cs.CL

Language Complexity and Speech Recognition Accuracy: Orthographic Complexity Hurts, Phonological Complexity Doesn’t

Posted on: June 14, 2024 | Author: jarxiv

Summary: What linguistic factors affect the performance of automatic speech recognition (ASR) models … Continue reading →

Categories: cs.AI, cs.CL

MMMU: A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI

Posted on: June 14, 2024 | Author: jarxiv

Summary: We introduce MMMU. MMMU, requiring college-level subject knowledge and deliberate reasoning … Continue reading →

Categories: cs.AI, cs.CL, cs.CV

ReadCtrl: Personalizing text generation with readability-controlled instruction learning

Posted on: June 14, 2024 | Author: jarxiv

Summary: Content generation conditioned on user readability, in personalization … Continue reading →

Categories: cs.AI, cs.CL

Active Learning for Multilingual Fingerspelling Corpora

Posted on: June 14, 2024 | Author: jarxiv

Summary: We apply active learning to address the data scarcity problem in sign languages … Continue reading →

Categories: cs.CL, cs.CV

