Large Language Models

Abstract

Artificial intelligence is making spectacular progress, and one of the best examples is the development of large language models (LLMs) such as OpenAI’s GPT series. In these lectures, written for readers with a background in mathematics or physics, we give a brief history and survey of the state of the art, and describe the underlying transformer architecture in detail. We then explore some current ideas on how LLMs work and how models trained to predict the next word in a text are able to perform other tasks displaying intelligence.

arXiv information

Author	Michael R. Douglas
Published	2023-07-11 20:21:02+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
