DeepSeek-Coder: When the Large Language Model Meets Programming — The Rise of Code Intelligence

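Summary

Rapid progress in large language models has transformed code intelligence in software development.
However, the dominance of closed-source models has limited broader research and development.
To address this, we introduce the DeepSeek-Coder series, a family of open-source code models ranging in size from 1.3B to 33B, trained from scratch on 2 trillion tokens.
These models are pre-trained on a high-quality project-level code corpus and use a fill-in-the-blank task with a 16K window to improve code generation and infilling.
Our extensive evaluations show that DeepSeek-Coder not only achieves state-of-the-art performance among open-source code models across multiple benchmarks but also outperforms existing closed-source models such as Codex and GPT-3.5.
In addition, the DeepSeek-Coder models are released under a permissive license that allows both research and unrestricted commercial use.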

Abstract (original)

The rapid development of large language models has revolutionized code intelligence in software development. However, the predominance of closed-source models has restricted extensive research and development. To address this, we introduce the DeepSeek-Coder series, a range of open-source code models with sizes from 1.3B to 33B, trained from scratch on 2 trillion tokens. These models are pre-trained on a high-quality project-level code corpus and employ a fill-in-the-blank task with a 16K window to enhance code generation and infilling. Our extensive evaluations demonstrate that DeepSeek-Coder not only achieves state-of-the-art performance among open-source code models across multiple benchmarks but also surpasses existing closed-source models like Codex and GPT-3.5. Furthermore, DeepSeek-Coder models are under a permissive license that allows for both research and unrestricted commercial use.

arXiv information

Authors	Daya Guo, Qihao Zhu, Dejian Yang, Zhenda Xie, Kai Dong, Wentao Zhang, Guanting Chen, Xiao Bi, Y. Wu, Y. K. Li, Fuli Luo, Yingfei Xiong, Wenfeng Liang
Published	2024-01-25 14:17:53+00:00
arXiv page	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google
