Monthly Archives: January 2025

Towards Expressive Video Dubbing with Multiscale Multimodal Context Interaction

Posted on January 3, 2025 by jarxiv

Abstract: Automatic video dubbing (AVD), working from a script and matching lip movements and facial emotions, …

Categories: cs.CL, cs.MM, cs.SD, eess.AS

AnglE-optimized Text Embeddings

Posted on January 3, 2025 by jarxiv

Abstract: In large language model (LLM) applications, high-quality text embeddings …

Categories: cs.AI, cs.CL, cs.LG

Text2midi: Generating Symbolic Music from Captions

Posted on January 3, 2025 by jarxiv

Abstract: In this paper, for generating MIDI files from text descriptions, an end-to-end …

Categories: cs.AI, cs.CL, cs.SD, eess.AS

DiSHA: Dimension-Sharding Adaptation with Fast Convergence and Fast Computation

Posted on January 3, 2025 by jarxiv

Abstract: Low-rank adaptation (LoRA), for the weight updates of large language models (LLMs), …

Categories: cs.AI, cs.CL

Benchmarking the Performance of Pre-trained LLMs across Urdu NLP Tasks

Posted on January 3, 2025 by jarxiv

Abstract: Large language models (LLMs) pre-trained on multilingual data, language …

Categories: cs.AI, cs.CL, I.2.7

Can Large Audio-Language Models Truly Hear? Tackling Hallucinations with Multi-Task Assessment and Stepwise Audio Reasoning

Posted on January 3, 2025 by jarxiv

Abstract: With recent advances in large audio-language models (LALMs), audio and speech information …

Categories: cs.CL, cs.SD, eess.AS

MaLei at the PLABA Track of TAC-2024: RoBERTa for Task 1 — LLaMA3.1 and GPT-4o for Task 2

Posted on January 3, 2025 by jarxiv

Abstract: This report, on the shared task "Plain Language Adaptati …

Categories: cs.CL

An investigation of phrase break prediction in an End-to-End TTS system

Posted on January 3, 2025 by jarxiv

Abstract: Purpose: In this study, end-to-end Text-to-Speech ( …

Categories: cs.CL, cs.SD, eess.AS

UPCS: Unbiased Persona Construction for Dialogue Generation

Posted on January 3, 2025 by jarxiv

Abstract: In narrative systems such as dialogue systems and storytelling systems, …

Categories: cs.CL

Mathematical Language Models: A Survey

Posted on January 3, 2025 by jarxiv

Abstract: In recent years, within the domain of mathematics, pre-trained language models (PLMs) and large …

Categories: cs.CL
