Provable optimal transport with transformers: The essence of depth and prompt engineering

Abstract

Can we establish provable performance guarantees for transformers? Establishing such theoretical guarantees is a milestone in developing trustworthy generative AI. In this paper, we take a step toward addressing this question by focusing on optimal transport, a fundamental problem at the intersection of combinatorial and continuous optimization. Leveraging the computational power of attention layers, we prove that a transformer with fixed parameters can effectively solve the optimal transport problem in Wasserstein-2 with entropic regularization for an arbitrary number of points. Consequently, the transformer can sort lists of arbitrary size up to an approximation factor. Our results rely on an engineered prompt that enables the transformer to implement gradient descent with adaptive step sizes on the dual optimal transport problem. Combining the convergence analysis of gradient descent with Sinkhorn dynamics, we establish an explicit approximation bound for optimal transport with transformers, which improves as depth increases. Our findings provide novel insights into the essence of prompt engineering and depth for solving optimal transport. In particular, prompt engineering boosts the algorithmic expressivity of transformers, allowing them to implement an optimization method. With increasing depth, transformers can simulate several iterations of gradient descent.
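
To make the link between entropic optimal transport and approximate sorting concrete, here is a minimal sketch in plain Python. It runs standard Sinkhorn iterations, not the transformer construction or the dual gradient descent with adaptive step sizes analyzed in the paper. The function names (sinkhorn_plan, soft_sort), the regularization value eps, the iteration count, and the rescaling of the inputs to [0, 1] are all illustrative assumptions rather than details taken from the abstract. It also assumes the list has at least two distinct values.

```python
import math


def sinkhorn_plan(x, y, eps=0.05, iters=200):
    """Entropic OT plan between uniform measures on points x and y.

    Uses the squared cost (x_i - y_j)^2 and plain Sinkhorn scaling.
    Assumes len(x) == len(y); eps and iters are illustrative defaults.
    """
    n = len(x)
    # Gibbs kernel K_ij = exp(-cost_ij / eps)
    K = [[math.exp(-((xi - yj) ** 2) / eps) for yj in y] for xi in x]
    u = [1.0] * n
    v = [1.0] * n
    for _ in range(iters):
        # Alternately rescale rows and columns to match the marginals 1/n
        u = [(1.0 / n) / sum(K[i][j] * v[j] for j in range(n)) for i in range(n)]
        v = [(1.0 / n) / sum(K[i][j] * u[i] for i in range(n)) for j in range(n)]
    return [[u[i] * K[i][j] * v[j] for j in range(n)] for i in range(n)]


def soft_sort(values, eps=0.05, iters=200):
    """Approximately sort `values` by transporting them onto an increasing grid.

    Assumes at least two distinct values (hypothetical helper, not from the paper).
    """
    n = len(values)
    lo, hi = min(values), max(values)
    x = [(val - lo) / (hi - lo) for val in values]  # rescale inputs to [0, 1]
    y = [j / (n - 1) for j in range(n)]             # increasing target slots
    P = sinkhorn_plan(x, y, eps, iters)
    # Each column of P sums to 1/n, so n * P[:, j] is a weighting over inputs;
    # slot j receives the weighted average of the input values.
    return [n * sum(P[i][j] * values[i] for i in range(n)) for j in range(n)]


print(soft_sort([3.0, 1.0, 2.0]))
```

In one dimension the unregularized optimal plan is the monotone matching, so the smallest input is sent to the first slot, the next smallest to the second, and so on. The entropic term blurs this matching, which is why the output is only approximately sorted; in this sketch, shrinking eps tightens the approximation at the cost of a kernel that underflows more easily.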

arXiv information

Author	Hadi Daneshmand
Published	2024-11-01 16:54:46+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, DeepL
