Typhoon 2: A Family of Open Text and Multimodal Thai Large Language Models

Abstract

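This paper introduces Typhoon 2, a series of text and multimodal large language models optimized for the Thai language.
The series includes models for text, vision, and audio.
Typhoon2-Text builds on state-of-the-art open models such as Llama 3 and Qwen2, with continual pre-training on a mixture of English and Thai data.
We employ post-training techniques to improve Thai-language performance while preserving the base models' original capabilities.
We release text models in sizes ranging from 1 billion to 70 billion parameters, available in both base and instruction-tuned variants.
To guardrail text generation, we release Typhoon2-Safety, a classifier enhanced for Thai culture and language.
Typhoon2-Vision improves Thai document understanding while retaining general visual capabilities such as image captioning.
Typhoon2-Audio introduces an end-to-end speech-to-speech model architecture that can process audio, speech, and text inputs and generate both text and speech outputs.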

Abstract (original)

This paper introduces Typhoon 2, a series of text and multimodal large language models optimized for the Thai language. The series includes models for text, vision, and audio. Typhoon2-Text builds on state-of-the-art open models, such as Llama 3 and Qwen2, and we perform continual pre-training on a mixture of English and Thai data. We employ post-training techniques to enhance Thai language performance while preserving the base models’ original capabilities. We release text models across a range of sizes, from 1 to 70 billion parameters, available in both base and instruction-tuned variants. To guardrail text generation, we release Typhoon2-Safety, a classifier enhanced for Thai cultures and language. Typhoon2-Vision improves Thai document understanding while retaining general visual capabilities, such as image captioning. Typhoon2-Audio introduces an end-to-end speech-to-speech model architecture capable of processing audio, speech, and text inputs and generating both text and speech outputs.

arXiv information

Authors	Kunat Pipatanakul, Potsawee Manakul, Natapong Nitarach, Warit Sirichotedumrong, Surapon Nonesung, Teetouch Jaknamon, Parinthapat Pengpun, Pittawat Taveekitworachai, Adisai Na-Thalang, Sittipong Sripaisarnmongkol, Krisanapong Jirayoot, Kasima Tharnpipitchai
Published	2024-12-19 17:36:38+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
