A Dataless Reinforcement Learning Approach to Rounding Hyperplane Optimization for Max-Cut

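Summary

The maximum cut (MaxCut) problem is NP-complete, and obtaining an optimal solution is NP-hard in the worst case.
As a result, heuristic-based algorithms are commonly used, although designing them often requires significant domain expertise.
More recently, learning-based methods trained on large labeled or unlabeled datasets have been proposed.
However, these approaches often struggle with generalizability and scalability.
A well-known approximation algorithm for MaxCut is the Goemans-Williamson (GW) algorithm, which relaxes the Quadratic Unconstrained Binary Optimization (QUBO) formulation into a semidefinite program (SDP).
The GW algorithm then applies hyperplane rounding: it samples a random hyperplane uniformly and uses it to convert the SDP solution into binary node assignments.
This paper proposes a training-data-free approach based on a non-episodic reinforcement learning formulation, in which an agent learns to select improved rounding hyperplanes that yield better cuts than those produced by the GW algorithm.
By optimizing over a Markov Decision Process (MDP), the method consistently achieves better cuts on large-scale graphs with varying densities and degree distributions.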
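
For readers who want the relaxation spelled out, the standard textbook form of the GW pipeline can be written as follows. This is a sketch in our own notation, not taken from the paper: the edge weights w_{ij}, the unit vectors v_i, and the Gaussian vector r are assumed symbols, and the problem is stated in its equivalent ±1 form rather than the 0/1 QUBO form.

```latex
% MaxCut as a binary quadratic program over labels x_i in {-1, +1}
\max_{x \in \{-1,+1\}^n} \; \frac{1}{2} \sum_{i<j} w_{ij}\,\bigl(1 - x_i x_j\bigr)

% SDP relaxation: each scalar x_i is replaced by a unit vector v_i
\max_{v_1,\dots,v_n} \; \frac{1}{2} \sum_{i<j} w_{ij}\,\bigl(1 - v_i^{\top} v_j\bigr)
\quad \text{subject to} \quad \lVert v_i \rVert_2 = 1, \; i = 1,\dots,n

% Hyperplane rounding: r is the normal of a random hyperplane through the origin
x_i = \operatorname{sign}\bigl(r^{\top} v_i\bigr), \qquad r \sim \mathcal{N}(0, I)
```

In this notation, the approach summarized above keeps the relaxation and replaces the random choice of r with a learned one.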

Abstract (original)

The Maximum Cut (MaxCut) problem is NP-Complete, and obtaining its optimal solution is NP-hard in the worst case. As a result, heuristic-based algorithms are commonly used, though their design often requires significant domain expertise. More recently, learning-based methods trained on large (un)labeled datasets have been proposed; however, these approaches often struggle with generalizability and scalability. A well-known approximation algorithm for MaxCut is the Goemans-Williamson (GW) algorithm, which relaxes the Quadratic Unconstrained Binary Optimization (QUBO) formulation into a semidefinite program (SDP). The GW algorithm then applies hyperplane rounding by uniformly sampling a random hyperplane to convert the SDP solution into binary node assignments. In this paper, we propose a training-data-free approach based on a non-episodic reinforcement learning formulation, in which an agent learns to select improved rounding hyperplanes that yield better cuts than those produced by the GW algorithm. By optimizing over a Markov Decision Process (MDP), our method consistently achieves better cuts across large-scale graphs with varying densities and degree distributions.
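
The hyperplane rounding step described in the abstract can be illustrated with a minimal sketch. It assumes unit weights and uses hand-picked 2D unit vectors on a 5-cycle as a stand-in for an SDP solution (no SDP solver is called). The function names `cut_value`, `round_with_hyperplane`, and `random_hyperplane_rounding` are hypothetical, and the sketch shows only the random GW-style baseline, not the paper's reinforcement learning agent.

```python
import numpy as np


def cut_value(edges, assignment):
    """Count edges whose endpoints receive different labels (unit weights)."""
    return sum(1 for i, j in edges if assignment[i] != assignment[j])


def round_with_hyperplane(vectors, normal):
    """Label each node +1 or -1 by the side of the hyperplane its vector lies on.

    The hyperplane passes through the origin and is defined by its normal.
    """
    return np.where(vectors @ normal >= 0, 1, -1)


def random_hyperplane_rounding(vectors, edges, num_samples, rng):
    """GW-style baseline: sample Gaussian normals and keep the best cut found."""
    best_cut, best_assignment = -1, None
    for _ in range(num_samples):
        normal = rng.standard_normal(vectors.shape[1])
        assignment = round_with_hyperplane(vectors, normal)
        value = cut_value(edges, assignment)
        if value > best_cut:
            best_cut, best_assignment = value, assignment
    return best_cut, best_assignment


# Toy instance (assumption): a 5-cycle with unit-norm 2D embeddings chosen by hand.
n = 5
edges = [(i, (i + 1) % n) for i in range(n)]
angles = 4 * np.pi * np.arange(n) / n
vectors = np.column_stack([np.cos(angles), np.sin(angles)])

rng = np.random.default_rng(0)
best_cut, best_assignment = random_hyperplane_rounding(vectors, edges, 20, rng)
print(best_cut, best_assignment)
```

A learned method in the spirit of the abstract would replace the `rng.standard_normal` draw with a normal vector proposed by the agent, while the rounding and cut evaluation stay the same.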

arXiv information

Authors	Gabriel Malikal, Ismail Alkhouri, Alvaro Velasquez, Adam M Alessio, Saiprasad Ravishankar
Published	2025-05-20 03:31:21+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
