OLMo: Accelerating the Science of Language Models

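Abstract

Language models (LMs) are now widespread in both NLP research and commercial products.
As their commercial importance has grown, the most powerful models have been closed off and gated behind proprietary interfaces, with key details of their training data, architectures, and development kept private.
Because these details, including the models' biases and potential risks, matter for studying the models scientifically, we believe it is essential that the research community have access to powerful, truly open LMs.
To that end, this technical report describes the first release of OLMo, a state-of-the-art, truly open language model, together with its framework for building and studying the science of language modeling.
Unlike most previous efforts, which released only model weights and inference code, we release OLMo along with the entire framework, including the training data and the training and evaluation code.
We hope this release will empower and strengthen the open research community and spark a new wave of innovation.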

Abstract (original)

Language models (LMs) have become ubiquitous in both NLP research and in commercial product offerings. As their commercial importance has surged, the most powerful models have become closed off, gated behind proprietary interfaces, with important details of their training data, architectures, and development undisclosed. Given the importance of these details in scientifically studying these models, including their biases and potential risks, we believe it is essential for the research community to have access to powerful, truly open LMs. To this end, this technical report details the first release of OLMo, a state-of-the-art, truly Open Language Model and its framework to build and study the science of language modeling. Unlike most prior efforts that have only released model weights and inference code, we release OLMo and the whole framework, including training data and training and evaluation code. We hope this release will empower and strengthen the open research community and inspire a new wave of innovation.

arXiv information

Authors	Dirk Groeneveld, Iz Beltagy, Pete Walsh, Akshita Bhagia, Rodney Kinney, Oyvind Tafjord, Ananya Harsh Jha, Hamish Ivison, Ian Magnusson, Yizhong Wang, Shane Arora, David Atkinson, Russell Authur, Khyathi Raghavi Chandu, Arman Cohan, Jennifer Dumas, Yanai Elazar, Yuling Gu, Jack Hessel, Tushar Khot, William Merrill, Jacob Morrison, Niklas Muennighoff, Aakanksha Naik, Crystal Nam, Matthew E. Peters, Valentina Pyatkin, Abhilasha Ravichander, Dustin Schwenk, Saurabh Shah, Will Smith, Emma Strubell, Nishant Subramani, Mitchell Wortsman, Pradeep Dasigi, Nathan Lambert, Kyle Richardson, Luke Zettlemoyer, Jesse Dodge, Kyle Lo, Luca Soldaini, Noah A. Smith, Hannaneh Hajishirzi
Published	2024-02-07 18:53:02+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
