The Backfiring Effect of Weak AI Safety Regulation

Summary

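Recent policy proposals aim to improve the safety of general-purpose AI, but the effectiveness of different regulatory approaches to AI safety is poorly understood.
We present a strategic model of the interactions among a safety regulator, general-purpose AI creators, and domain specialists, who adapt the technology for specific applications.
Our analysis examines how regulatory measures targeting different parts of the AI development chain affect the outcome of this game.
In particular, we assume that AI technology is characterized by two key attributes: safety and performance.
The regulator first sets a minimum safety standard that applies to one or both players and imposes strict penalties for non-compliance.
The general-purpose creator then invests in the technology, establishing its initial safety and performance levels.
Next, domain specialists refine the AI for their specific use cases, updating its safety and performance levels, and bring the product to market.
The resulting revenue is split between the specialist and the generalist according to a revenue-sharing parameter.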
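Our analysis yields two key insights. First, weak safety regulation imposed predominantly on domain specialists can backfire.
Regulating AI use cases may seem logical, but our analysis shows that weak regulation targeting domain specialists alone can unintentionally reduce safety.
This effect persists across a wide range of settings.
Second, in sharp contrast to the first finding, stronger, well-placed regulation can in fact benefit all players subject to it.
When regulators impose appropriate safety standards on both general-purpose AI creators and domain specialists, the regulation functions as a commitment device, leading to safety and performance gains that surpass what is achieved under no regulation or under regulation of one player alone.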
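
The order of moves described in the summary (regulator, then generalist, then specialist, then revenue split) can be sketched as a small backward-induction search. This is only an illustrative toy, not the paper's model: the additive `revenue` function, the quadratic `cost` function, the discrete `GRID` of levels, the treatment of the standard as a hard constraint, and all names and parameter values are assumptions made here for illustration.

```python
from itertools import product

# Hypothetical discrete levels for safety and performance: 0.0, 0.25, ..., 2.0
GRID = [i / 4 for i in range(9)]


def revenue(safety, performance):
    # Assumed revenue function: both attributes raise revenue additively.
    return safety + performance


def cost(d_safety, d_performance):
    # Assumed quadratic cost of changing the two attributes.
    return d_safety ** 2 + d_performance ** 2


def specialist_best_response(s0, p0, floor_specialist, share):
    """Specialist moves last: refines (s0, p0) into (s1, p1).

    The safety standard is modeled as a hard constraint (an assumption
    standing in for "strict penalties for non-compliance").
    """
    best = None
    for s1, p1 in product(GRID, GRID):
        if s1 < floor_specialist:
            continue
        payoff = share * revenue(s1, p1) - cost(s1 - s0, p1 - p0)
        if best is None or payoff > best[0]:
            best = (payoff, s1, p1)
    return best


def solve_game(floor_generalist, floor_specialist, share=0.5):
    """Generalist moves first, anticipating the specialist's response.

    Returns (generalist_payoff, s0, p0, s1, p1).
    """
    best = None
    for s0, p0 in product(GRID, GRID):
        if s0 < floor_generalist:
            continue
        _, s1, p1 = specialist_best_response(s0, p0, floor_specialist, share)
        payoff = (1 - share) * revenue(s1, p1) - cost(s0, p0)
        if best is None or payoff > best[0]:
            best = (payoff, s0, p0, s1, p1)
    return best


if __name__ == "__main__":
    # The regulator's move is represented by the two floors passed in.
    print("no regulation:       ", solve_game(0.0, 0.0))
    print("specialist-only floor:", solve_game(0.0, 1.0))
    print("floor on both players:", solve_game(1.0, 1.0))
```

Comparing the three calls shows how the choice of whom to regulate changes the generalist's initial investment in this toy setting. Any pattern it produces is a property of the assumed functional forms and should not be read as a reproduction of the paper's results.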

Abstract (original)

Recent policy proposals aim to improve the safety of general-purpose AI, but there is little understanding of the efficacy of different regulatory approaches to AI safety. We present a strategic model that explores the interactions between safety regulation, the general-purpose AI creators, and domain specialists–those who adapt the technology for specific applications. Our analysis examines how different regulatory measures, targeting different parts of the AI development chain, affect the outcome of this game. In particular, we assume AI technology is characterized by two key attributes: safety and performance. The regulator first sets a minimum safety standard that applies to one or both players, with strict penalties for non-compliance. The general-purpose creator then invests in the technology, establishing its initial safety and performance levels. Next, domain specialists refine the AI for their specific use cases, updating the safety and performance levels and taking the product to market. The resulting revenue is then distributed between the specialist and generalist through a revenue-sharing parameter. Our analysis reveals two key insights: First, weak safety regulation imposed predominantly on domain specialists can backfire. While it might seem logical to regulate AI use cases, our analysis shows that weak regulations targeting domain specialists alone can unintentionally reduce safety. This effect persists across a wide range of settings. Second, in sharp contrast to the previous finding, we observe that stronger, well-placed regulation can in fact mutually benefit all players subjected to it. When regulators impose appropriate safety standards on both general-purpose AI creators and domain specialists, the regulation functions as a commitment device, leading to safety and performance gains, surpassing what is achieved under no regulation or regulating one player alone.

arXiv information

Authors	Benjamin Laufer, Jon Kleinberg, Hoda Heidari
Published	2025-06-17 15:26:05+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google

