Category archive: cs.AI

T-REG: Preference Optimization with Token-Level Reward Regularization

Posted: December 4, 2024 | Author: jarxiv

Abstract: Reinforcement learning from human feedback (RLHF), large language models (LLM … Continue reading →

Categories: cs.AI, cs.CL, cs.LG | Comments are closed

Scaling BERT Models for Turkish Automatic Punctuation and Capitalization Correction

Posted: December 4, 2024 | Author: jarxiv

Abstract: In this paper, for automatic punctuation and capitalization correction in Turkish text, B … Continue reading →

Categories: cs.AI, cs.CL, cs.LG | Comments are closed

BYE: Build Your Encoder with One Sequence of Exploration Data for Long-Term Dynamic Scene Understanding

Posted: December 4, 2024 | Author: jarxiv

Abstract: In robotics applications, dynamic scene understanding remains a persistent challenge … Continue reading →

Categories: cs.AI, cs.CV, cs.LG, cs.RO | Comments are closed

OODFace: Benchmarking Robustness of Face Recognition under Common Corruptions and Appearance Variations

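Posted: December 4, 2024 | Author: jarxiv

Abstract: With the rise of deep learning, face recognition technology has seen extensive research and rapid development … Continue reading →

Categories: cs.AI, cs.CR, cs.CV, cs.LG | Comments are closed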

Towards Rich Emotions in 3D Avatars: A Text-to-3D Avatar Generation Benchmark

Posted: December 4, 2024 | Author: jarxiv

Abstract: Emotional and dynamic 3D facial avatars using text derived from spoken language (E … Continue reading →

Categories: cs.AI, cs.CV | Comments are closed

Dynamic-LLaVA: Efficient Multimodal Large Language Models via Dynamic Vision-language Context Sparsification

Posted: December 4, 2024 | Author: jarxiv

Abstract: Multimodal large language models (MLLMs), in visual understanding, reasoning, interacti … Continue reading →

Categories: cs.AI, cs.CL, cs.CV, cs.LG | Comments are closed

WEM-GAN: Wavelet transform based facial expression manipulation

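Posted: December 4, 2024 | Author: jarxiv

Abstract: Facial expression manipulation aims to change human facial expressions without affecting face recognition … Continue reading →

Categories: cs.AI, cs.CV, eess.IV | Comments are closed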

Segmentation of Coronary Artery Stenosis in X-ray Angiography using Mamba Models

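Posted: December 4, 2024 | Author: jarxiv

Abstract: Coronary artery disease is one of the leading contributors to mortality worldwide. Coronary artery stenosis from X-ray images … Continue reading →

Categories: cs.AI, cs.CV, eess.IV | Comments are closed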

AV-Odyssey Bench: Can Your Multimodal LLMs Really Understand Audio-Visual Information?

Posted: December 4, 2024 | Author: jarxiv

Abstract: In recent years, GPT-4o, Gemini 1.5 Pro, Reka Core, and other … Continue reading →

Categories: cs.AI, cs.CL, cs.CV, cs.MM, cs.SD, eess.AS | Comments are closed

Improving Dynamic Object Interactions in Text-to-Video Generation with AI Feedback

Posted: December 4, 2024 | Author: jarxiv

Abstract: Large-scale text-to-video models, for a wide range of downstream applications, immeasur … Continue reading →

Categories: cs.AI, cs.CV, cs.LG | Comments are closed
