Characterizing Tradeoffs in Language Model Decoding with Informational Interpretations

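Summary

We propose a theoretical framework for formulating language model decoding algorithms using dynamic programming and information theory.
Using dynamic programming, we lift the design of decoding algorithms from logit space to the space of action-state value functions, and show that decoding algorithms result from optimizing those functions.
Each component of the action-state value function space has an information-theoretic interpretation.
This lifting and interpretation make clear what a decoding algorithm is optimized for, which makes it easier to arbitrate the tradeoffs among sensibleness, diversity, and attribution.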

Abstract (original)

We propose a theoretical framework for formulating language model decoder algorithms with dynamic programming and information theory. With dynamic programming, we lift the design of decoder algorithms from the logit space to the action-state value function space, and show that the decoding algorithms are consequences of optimizing the action-state value functions. Each component in the action-state value function space has an information theoretical interpretation. With the lifting and interpretation, it becomes evident what the decoder algorithm is optimized for, and hence facilitating the arbitration of the tradeoffs in sensibleness, diversity, and attribution.
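
As a rough illustration of the claim that a decoding rule can be read as the result of optimizing a value function, the sketch below uses a standard entropy-regularized result, not a construction taken from the paper: over next-token distributions, the objective "expected value plus temperature times entropy" is maximized by a softmax of the values at that temperature, which is ordinary temperature sampling. The names `q_values`, `tau`, and `objective`, and the numbers used, are hypothetical and chosen only for this example.

```python
import math


def softmax(values, temperature=1.0):
    """Numerically stable softmax of values / temperature."""
    m = max(values)
    exps = [math.exp((v - m) / temperature) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def objective(policy, q_values, temperature):
    """Expected value under the policy plus temperature-weighted entropy."""
    expected_q = sum(p * q for p, q in zip(policy, q_values))
    entropy = -sum(p * math.log(p) for p in policy if p > 0)
    return expected_q + temperature * entropy


# Hypothetical per-token values for a 4-token vocabulary (assumption).
q_values = [2.0, 1.0, 0.5, -1.0]
tau = 0.7

sampling_policy = softmax(q_values, tau)   # temperature sampling
greedy_policy = [1.0, 0.0, 0.0, 0.0]       # always pick the argmax token
uniform_policy = [0.25, 0.25, 0.25, 0.25]  # ignore the values entirely

# Closed-form optimum of the objective: tau * log(sum(exp(q / tau))).
optimum = tau * math.log(sum(math.exp(q / tau) for q in q_values))

for name, policy in [("sampling", sampling_policy),
                     ("greedy", greedy_policy),
                     ("uniform", uniform_policy)]:
    print(name, round(objective(policy, q_values, tau), 4))
```

In this toy setting the temperature-sampling policy attains the closed-form optimum, while greedy and uniform decoding score lower. Letting `tau` shrink toward zero recovers greedy decoding, and a larger `tau` trades expected value for diversity.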

arXiv information

Authors	Chung-Ching Chang, William W. Cohen, Yun-Hsuan Sung
Published	2023-11-16 18:38:25+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
