jarxiv | Japanese arxiv | Page 614

Uncovering the Limitations of Model Inversion Evaluation: Benchmarks and Connection to Type-I Adversarial Attacks

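Posted: May 7, 2025 | Author: jarxiv

Summary

Model inversion (MI) attacks aim to reconstruct information about private training data by exploiting access to machine learning models.
The most common evaluation framework for MI attacks and defenses relies on an evaluation model, which has been used to assess progress across almost all MI attacks and defenses proposed in recent years.
This paper presents the first in-depth study of MI evaluation.
First, we construct the first comprehensive human-annotated dataset of MI attack samples, based on 28 setups of different MI attacks, defenses, and private and public datasets.
Second, using this dataset, we examine the accuracy of the MI evaluation framework and show that it suffers from a significant number of false positives.
These findings raise questions about the previously reported success rates of SOTA MI attacks.
Third, we analyze the causes of these false positives, design controlled experiments, and discover a surprising effect of Type I adversarial features on MI evaluation, as well as adversarial transferability, which highlights a relationship between two previously distinct research areas.
Our findings suggest that the performance of SOTA MI attacks has been overestimated and that actual privacy leakage is significantly lower than previously reported.
In conclusion, we highlight critical limitations of the widely used MI evaluation framework and present methods to mitigate false positive rates.
We note that prior research has shown Type I adversarial attacks to be very challenging, with no existing solution.
We therefore urge treating human evaluation as a primary MI evaluation framework rather than merely a supplement, as it has been in previous MI research.
We also encourage further work on more robust and reliable automatic evaluation frameworks.

Abstract (original)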

Model Inversion (MI) attacks aim to reconstruct information of private training data by exploiting access to machine learning models. The most common evaluation framework for MI attacks/defenses relies on an evaluation model that has been utilized to assess progress across almost all MI attacks and defenses proposed in recent years. In this paper, for the first time, we present an in-depth study of MI evaluation. Firstly, we construct the first comprehensive human-annotated dataset of MI attack samples, based on 28 setups of different MI attacks, defenses, private and public datasets. Secondly, using our dataset, we examine the accuracy of the MI evaluation framework and reveal that it suffers from a significant number of false positives. These findings raise questions about the previously reported success rates of SOTA MI attacks. Thirdly, we analyze the causes of these false positives, design controlled experiments, and discover the surprising effect of Type I adversarial features on MI evaluation, as well as adversarial transferability, highlighting a relationship between two previously distinct research areas. Our findings suggest that the performance of SOTA MI attacks has been overestimated, with the actual privacy leakage being significantly less than previously reported. In conclusion, we highlight critical limitations in the widely used MI evaluation framework and present our methods to mitigate false positive rates. We remark that prior research has shown that Type I adversarial attacks are very challenging, with no existing solution. Therefore, we urge to consider human evaluation as a primary MI evaluation framework rather than merely a supplement as in previous MI research. We also encourage further work on developing more robust and reliable automatic evaluation frameworks.
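
To make the notion of a false positive concrete, here is a minimal Python sketch of how an evaluation model's verdicts could be checked against human annotations. The function name `false_positive_rate` and the toy labels are illustrative assumptions, not the authors' code or data. A sample counts as a false positive when the evaluation model accepts a reconstruction that human annotators reject.

```python
from typing import Sequence


def false_positive_rate(eval_accepts: Sequence[bool],
                        human_accepts: Sequence[bool]) -> float:
    """Fraction of human-rejected samples that the evaluation model accepts."""
    if len(eval_accepts) != len(human_accepts):
        raise ValueError("label sequences must have the same length")
    # Keep the evaluation model's verdicts only for samples humans rejected.
    verdicts_on_negatives = [e for e, h in zip(eval_accepts, human_accepts) if not h]
    if not verdicts_on_negatives:
        raise ValueError("no human-rejected samples to measure against")
    return sum(verdicts_on_negatives) / len(verdicts_on_negatives)


# Toy labels, invented for illustration only (True = "attack succeeded").
eval_labels = [True, True, True, False, True, False]
human_labels = [True, False, False, False, True, False]

fpr = false_positive_rate(eval_labels, human_labels)
attack_acc_eval = sum(eval_labels) / len(eval_labels)     # success rate per evaluation model
attack_acc_human = sum(human_labels) / len(human_labels)  # success rate per human annotators
print(f"FPR={fpr:.2f}, eval success={attack_acc_eval:.2f}, human success={attack_acc_human:.2f}")
```

In this toy setting the evaluation model reports a higher attack success rate than the human labels support, which is the kind of overestimation the abstract describes.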

arXiv information

Authors	Sy-Tuyen Ho, Koh Jun Hao, Ngoc-Bao Nguyen, Alexander Binder, Ngai-Man Cheung
Published	2025-05-06 13:32:12+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google

Categories: cs.LG

Causal Intervention Framework for Variational Auto Encoder Mechanistic Interpretability

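Posted: May 7, 2025 | Author: jarxiv

Summary

Mechanistic interpretability of deep learning models has emerged as an important research direction for understanding how neural networks function.
While significant progress has been made in interpreting discriminative models such as transformers, understanding generative models such as variational autoencoders (VAEs) remains challenging.
This paper introduces a comprehensive causal intervention framework for the mechanistic interpretability of VAEs.
We develop techniques to identify and analyze "circuit motifs" in VAEs, examining how semantic factors are encoded, processed, and disentangled through the network layers.
Our approach uses targeted interventions at different levels: input manipulations, latent space perturbations, activation patching, and causal mediation analysis.
We apply the framework to both synthetic datasets with known causal relationships and standard disentanglement benchmarks.
The results show that our interventions can isolate functional circuits, map computational graphs to causal graphs of semantic factors, and distinguish polysemantic from monosemantic units.
We also introduce metrics for causal effect strength, intervention specificity, and circuit modularity that quantify the interpretability of VAE components.
Experiments show clear differences between VAE variants: FactorVAE achieves a higher disentanglement score (0.084) and effect strength (mean 4.59) than the standard VAE (0.064, 3.99) and Beta-VAE (0.051, 3.43).
Our framework advances the mechanistic understanding of generative models and provides tools for more transparent and controllable VAE architectures.

Abstract (original)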

Mechanistic interpretability of deep learning models has emerged as a crucial research direction for understanding the functioning of neural networks. While significant progress has been made in interpreting discriminative models like transformers, understanding generative models such as Variational Autoencoders (VAEs) remains challenging. This paper introduces a comprehensive causal intervention framework for mechanistic interpretability of VAEs. We develop techniques to identify and analyze ‘circuit motifs’ in VAEs, examining how semantic factors are encoded, processed, and disentangled through the network layers. Our approach uses targeted interventions at different levels: input manipulations, latent space perturbations, activation patching, and causal mediation analysis. We apply our framework to both synthetic datasets with known causal relationships and standard disentanglement benchmarks. Results show that our interventions can successfully isolate functional circuits, map computational graphs to causal graphs of semantic factors, and distinguish between polysemantic and monosemantic units. Furthermore, we introduce metrics for causal effect strength, intervention specificity, and circuit modularity that quantify the interpretability of VAE components. Experimental results demonstrate clear differences between VAE variants, with FactorVAE achieving higher disentanglement scores (0.084) and effect strengths (mean 4.59) compared to standard VAE (0.064, 3.99) and Beta-VAE (0.051, 3.43). Our framework advances the mechanistic understanding of generative models and provides tools for more transparent and controllable VAE architectures.

arXiv information

Authors	Dip Roy
Published	2025-05-06 13:40:59+00:00
arXiv site	arxiv_id(pdf)

Categories: cs.LG

Small-Scale-Fading-Aware Resource Allocation in Wireless Federated Learning

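Posted: May 7, 2025 | Author: jarxiv

Summary

Judicious resource allocation can effectively improve federated learning (FL) training performance in wireless networks by addressing both system and statistical heterogeneity.
However, existing strategies typically rely on block-fading assumptions and overlook rapid channel fluctuations within each round of FL gradient uploading, which degrades FL training performance.
This paper therefore proposes a small-scale-fading-aware resource allocation strategy using a multi-agent reinforcement learning (MARL) framework.
Specifically, we establish a one-step convergence bound for the FL algorithm and formulate the resource allocation problem as a decentralized partially observable Markov decision process (Dec-POMDP), which is solved with the QMIX algorithm.
In our framework, each client acts as an agent that dynamically determines spectrum and power allocations within each coherence time slot, based on local observations and a reward derived from the convergence analysis.
The MARL setting reduces the dimensionality of the action space and facilitates decentralized decision-making, improving the scalability and practicality of the solution.
Experimental results show that the QMIX-based resource allocation strategy significantly outperforms baseline methods across various degrees of statistical heterogeneity.
In addition, ablation studies confirm the importance of incorporating small-scale fading dynamics and highlight its role in optimizing FL performance.

Abstract (original)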

Judicious resource allocation can effectively enhance federated learning (FL) training performance in wireless networks by addressing both system and statistical heterogeneity. However, existing strategies typically rely on block fading assumptions, which overlooks rapid channel fluctuations within each round of FL gradient uploading, leading to a degradation in FL training performance. Therefore, this paper proposes a small-scale-fading-aware resource allocation strategy using a multi-agent reinforcement learning (MARL) framework. Specifically, we establish a one-step convergence bound of the FL algorithm and formulate the resource allocation problem as a decentralized partially observable Markov decision process (Dec-POMDP), which is subsequently solved using the QMIX algorithm. In our framework, each client serves as an agent that dynamically determines spectrum and power allocations within each coherence time slot, based on local observations and a reward derived from the convergence analysis. The MARL setting reduces the dimensionality of the action space and facilitates decentralized decision-making, enhancing the scalability and practicality of the solution. Experimental results demonstrate that our QMIX-based resource allocation strategy significantly outperforms baseline methods across various degrees of statistical heterogeneity. Additionally, ablation studies validate the critical importance of incorporating small-scale fading dynamics, highlighting its role in optimizing FL performance.

arXiv information

Authors	Jiacheng Wang, Le Liang, Hao Ye, Chongtao Guo, Shi Jin
Published	2025-05-06 13:41:59+00:00
arXiv site	arxiv_id(pdf)

Categories: cs.LG

Nonparametric IPSS: Fast, flexible feature selection with false discovery control

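Posted: May 7, 2025 | Author: jarxiv

Summary

Feature selection is a critical task in machine learning and statistics.
However, existing feature selection methods either (i) rely on parametric methods such as linear or generalized linear models, (ii) lack theoretical false discovery control, or (iii) identify few true positives.
Here we introduce a general feature selection method with finite-sample false discovery control, based on applying integrated path stability selection (IPSS) to arbitrary feature importance scores.
The method is nonparametric whenever the importance scores are nonparametric, and it estimates q-values, which are better suited to high-dimensional data than p-values.
We focus on two special cases that use importance scores from gradient boosting (IPSSGB) and random forests (IPSSRF).
Extensive nonlinear simulations with RNA sequencing data show that both methods accurately control the false discovery rate and detect more true positives than existing methods.
Both methods are also efficient, running in under 20 seconds with 500 samples and 5000 features.
We apply IPSSGB and IPSSRF to detect microRNAs and genes related to cancer and find that they yield better predictions with fewer features than existing approaches.

Abstract (original)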

Feature selection is a critical task in machine learning and statistics. However, existing feature selection methods either (i) rely on parametric methods such as linear or generalized linear models, (ii) lack theoretical false discovery control, or (iii) identify few true positives. Here, we introduce a general feature selection method with finite-sample false discovery control based on applying integrated path stability selection (IPSS) to arbitrary feature importance scores. The method is nonparametric whenever the importance scores are nonparametric, and it estimates q-values, which are better suited to high-dimensional data than p-values. We focus on two special cases using importance scores from gradient boosting (IPSSGB) and random forests (IPSSRF). Extensive nonlinear simulations with RNA sequencing data show that both methods accurately control the false discovery rate and detect more true positives than existing methods. Both methods are also efficient, running in under 20 seconds when there are 500 samples and 5000 features. We apply IPSSGB and IPSSRF to detect microRNAs and genes related to cancer, finding that they yield better predictions with fewer features than existing approaches.

arXiv information

Authors	Omar Melikechi, David B. Dunson, Jeffrey W. Miller
Published	2025-05-06 14:02:50+00:00
arXiv site	arxiv_id(pdf)

Categories: cs.LG, stat.AP, stat.ME, stat.ML

Efficient Training of Physics-enhanced Neural ODEs via Direct Collocation and Nonlinear Programming

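Posted: May 7, 2025 | Author: jarxiv

Summary

We propose a novel approach for training physics-enhanced neural ODEs (PeNODEs) by expressing the training process as a dynamic optimization problem.
The full model, including its neural components, is discretized using a high-order implicit Runge-Kutta method with flipped Legendre-Gauss-Radau points, yielding a large-scale nonlinear program (NLP) that is solved efficiently by state-of-the-art NLP solvers such as Ipopt.
This formulation enables simultaneous optimization of network parameters and state trajectories, addressing key limitations of ODE-solver-based training in terms of stability, runtime, and accuracy.
Building on a recent direct collocation-based method for neural ODEs, we generalize to PeNODEs, incorporate physical constraints, and present a custom, parallelized, open-source implementation.
Benchmarks on a quarter vehicle model and a Van-der-Pol oscillator demonstrate superior accuracy, speed, and generalization with smaller networks compared to other training techniques.
We also outline a planned integration into OpenModelica to enable accessible training of neural DAEs.

Abstract (original)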

We propose a novel approach for training Physics-enhanced Neural ODEs (PeNODEs) by expressing the training process as a dynamic optimization problem. The full model, including neural components, is discretized using a high-order implicit Runge-Kutta method with flipped Legendre-Gauss-Radau points, resulting in a large-scale nonlinear program (NLP) efficiently solved by state-of-the-art NLP solvers such as Ipopt. This formulation enables simultaneous optimization of network parameters and state trajectories, addressing key limitations of ODE solver-based training in terms of stability, runtime, and accuracy. Extending on a recent direct collocation-based method for Neural ODEs, we generalize to PeNODEs, incorporate physical constraints, and present a custom, parallelized, open-source implementation. Benchmarks on a Quarter Vehicle Model and a Van-der-Pol oscillator demonstrate superior accuracy, speed, and generalization with smaller networks compared to other training techniques. We also outline a planned integration into OpenModelica to enable accessible training of Neural DAEs.

arXiv information

Authors	Linus Langenkamp, Philip Hannebohm, Bernhard Bachmann
Published	2025-05-06 14:04:46+00:00
arXiv site	arxiv_id(pdf)

Categories: 68T05, 90C30, cs.LG, G.1.6, math.DS, math.OC

MARIOH: Multiplicity-Aware Hypergraph Reconstruction

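Posted: May 7, 2025 | Author: jarxiv

Summary

Hypergraphs offer a powerful framework for modeling higher-order interactions that traditional pairwise graphs cannot fully capture.
However, practical constraints often lead to their simplification into projected graphs, which causes substantial information loss and ambiguity in representing higher-order relationships.
In this work we propose MARIOH, a supervised approach that reconstructs the original hypergraph from its projected graph by leveraging edge multiplicity.
To overcome the difficulties posed by the large search space, MARIOH integrates several key ideas: (a) identifying provable size-2 hyperedges, which reduces the candidate search space, (b) predicting the likelihood that candidates are hyperedges using both structural and multiplicity-related features, and (c) not only targeting promising hyperedge candidates but also examining less confident ones to explore alternative possibilities.
Together, these ideas enable MARIOH to explore the search space efficiently and effectively.
In experiments on 10 real-world datasets, MARIOH achieves up to 74.51% higher reconstruction accuracy than state-of-the-art methods.

Abstract (original)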

Hypergraphs offer a powerful framework for modeling higher-order interactions that traditional pairwise graphs cannot fully capture. However, practical constraints often lead to their simplification into projected graphs, resulting in substantial information loss and ambiguity in representing higher-order relationships. In this work, we propose MARIOH, a supervised approach for reconstructing the original hypergraph from its projected graph by leveraging edge multiplicity. To overcome the difficulties posed by the large search space, MARIOH integrates several key ideas: (a) identifying provable size-2 hyperedges, which reduces the candidate search space, (b) predicting the likelihood of candidates being hyperedges by utilizing both structural and multiplicity-related features, and (c) not only targeting promising hyperedge candidates but also examining less confident ones to explore alternative possibilities. Together, these ideas enable MARIOH to efficiently and effectively explore the search space. In our experiments using 10 real-world datasets, MARIOH achieves up to 74.51% higher reconstruction accuracy compared to state-of-the-art methods.
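
As a rough illustration of what a projected graph with edge multiplicity is, the following Python sketch projects a hypergraph onto weighted pairwise edges, where each weight counts how many hyperedges contain both endpoints. The helper name `project` and the toy hypergraphs are assumptions made for this example; it shows only the projection step, not MARIOH's reconstruction procedure.

```python
from collections import Counter
from itertools import combinations
from typing import Iterable, Set, Tuple


def project(hyperedges: Iterable[Set[int]]) -> Counter:
    """Return pairwise edge multiplicities of the projected graph.

    Every hyperedge contributes 1 to each pair of nodes it contains.
    """
    multiplicity: Counter = Counter()
    for hyperedge in hyperedges:
        for u, v in combinations(sorted(hyperedge), 2):
            multiplicity[(u, v)] += 1
    return multiplicity


# Two toy hypergraphs, invented for illustration.
triangle_only = [{1, 2, 3}]
triangle_plus_pair = [{1, 2, 3}, {1, 2}]

proj_a = project(triangle_only)
proj_b = project(triangle_plus_pair)

# Ignoring multiplicity, both projections have the same edge set ...
same_unweighted = set(proj_a) == set(proj_b)
# ... but the multiplicity of edge (1, 2) tells them apart.
print(same_unweighted, proj_a[(1, 2)], proj_b[(1, 2)])
```

Multiplicity does not remove all ambiguity: for instance, the hypergraph `[{1, 2}, {1, 2}, {1, 3}, {2, 3}]` has the same weighted projection as `triangle_plus_pair`, which is why reconstruction remains a search problem.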

arXiv information

Authors	Kyuhan Lee, Geon Lee, Kijung Shin
Published	2025-05-06 14:22:16+00:00
arXiv site	arxiv_id(pdf)

Categories: cs.DB, cs.LG, H.2.8

Don’t Mesh with Me: Generating Constructive Solid Geometry Instead of Meshes by Fine-Tuning a Code-Generation LLM

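Posted: May 7, 2025 | Author: jarxiv

Summary

While recent advances in machine learning, such as LLMs, are revolutionizing software development and the creative industries, they have had minimal impact on engineers designing mechanical parts, which remains a largely manual process.
Existing approaches to generating 3D geometry most commonly use meshes as the 3D representation.
Meshes are suitable for assets in video games or animations, but they lack sufficient precision and adaptability for mechanical engineering purposes.
This paper introduces a novel approach to 3D geometry generation that produces surface-based Constructive Solid Geometry (CSG) by leveraging a code-generation LLM.
First, we create a dataset of 3D mechanical parts represented as code scripts by converting Boundary Representation geometry (BREP) into CSG-based Python scripts.
Second, we create natural-language annotations using GPT-4.
The resulting dataset is used to fine-tune a code-generation LLM.
The fine-tuned LLM can complete geometries in a plausible way based on positional input and natural language, demonstrating geometric understanding.

Abstract (original)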

While recent advancements in machine learning, such as LLMs, are revolutionizing software development and creative industries, they have had minimal impact on engineers designing mechanical parts, which remains largely a manual process. Existing approaches to generating 3D geometry most commonly use meshes as a 3D representation. While meshes are suitable for assets in video games or animations, they lack sufficient precision and adaptability for mechanical engineering purposes. This paper introduces a novel approach for the generation of 3D geometry that generates surface-based Constructive Solid Geometry (CSG) by leveraging a code-generation LLM. First, we create a dataset of 3D mechanical parts represented as code scripts by converting Boundary Representation geometry (BREP) into CSG-based Python scripts. Second, we create annotations in natural language using GPT-4. The resulting dataset is used to fine-tune a code-generation LLM. The fine-tuned LLM can complete geometries based on positional input and natural language in a plausible way, demonstrating geometric understanding.

arXiv information

Authors	Maximilian Mews, Ansar Aynetdinov, Vivian Schiller, Peter Eisert, Alan Akbik
Published	2025-05-06 14:25:00+00:00
arXiv site	arxiv_id(pdf)

Categories: cs.GR, cs.LG

Decision Making under Model Misspecification: DRO with Robust Bayesian Ambiguity Sets

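Posted: May 7, 2025 | Author: jarxiv

Summary

Distributionally Robust Optimisation (DRO) protects risk-averse decision-makers by considering the worst-case risk within an ambiguity set of distributions based on the empirical distribution or a model.
To further guard against finite, noisy data, model-based approaches admit Bayesian formulations that propagate uncertainty from the posterior to the decision-making problem.
However, when the model is misspecified, the decision-maker must stretch the ambiguity set to contain the data-generating process (DGP), which leads to overly conservative decisions.
We address this challenge by introducing DRO with Bayesian Ambiguity Sets that are robust to model misspecification (DRO-RoBAS).
These are Maximum Mean Discrepancy ambiguity sets centred at a robust posterior predictive distribution that incorporates beliefs about the DGP.
We show that the resulting optimisation problem admits a dual formulation in the Reproducing Kernel Hilbert Space, and we give probabilistic guarantees on the tolerance level of the ambiguity set.
Our method outperforms other Bayesian and empirical DRO approaches in out-of-sample performance on the Newsvendor and Portfolio problems under various cases of model misspecification.

Abstract (original)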

Distributionally Robust Optimisation (DRO) protects risk-averse decision-makers by considering the worst-case risk within an ambiguity set of distributions based on the empirical distribution or a model. To further guard against finite, noisy data, model-based approaches admit Bayesian formulations that propagate uncertainty from the posterior to the decision-making problem. However, when the model is misspecified, the decision maker must stretch the ambiguity set to contain the data-generating process (DGP), leading to overly conservative decisions. We address this challenge by introducing DRO with Robust, to model misspecification, Bayesian Ambiguity Sets (DRO-RoBAS). These are Maximum Mean Discrepancy ambiguity sets centred at a robust posterior predictive distribution that incorporates beliefs about the DGP. We show that the resulting optimisation problem obtains a dual formulation in the Reproducing Kernel Hilbert Space and we give probabilistic guarantees on the tolerance level of the ambiguity set. Our method outperforms other Bayesian and empirical DRO approaches in out-of-sample performance on the Newsvendor and Portfolio problems with various cases of model misspecification.
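
For readers unfamiliar with Maximum Mean Discrepancy (MMD), the sketch below estimates the squared MMD between two one-dimensional samples with a Gaussian kernel and checks whether a candidate sample lies within a given MMD radius of a centre sample. This is a generic empirical (biased, V-statistic) estimator; the function names, the bandwidth, the radius, and the toy samples are all assumptions for illustration, and the code does not reproduce the paper's robust posterior predictive or its dual formulation.

```python
import math
from typing import Sequence


def gaussian_kernel(x: float, y: float, bandwidth: float = 1.0) -> float:
    """Gaussian (RBF) kernel k(x, y) = exp(-(x - y)^2 / (2 * bandwidth^2))."""
    return math.exp(-((x - y) ** 2) / (2.0 * bandwidth ** 2))


def mmd_squared(xs: Sequence[float], ys: Sequence[float],
                bandwidth: float = 1.0) -> float:
    """Biased estimate of MMD^2: mean k(x,x') + mean k(y,y') - 2 * mean k(x,y)."""
    def mean_kernel(a: Sequence[float], b: Sequence[float]) -> float:
        total = sum(gaussian_kernel(u, v, bandwidth) for u in a for v in b)
        return total / (len(a) * len(b))

    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)


def in_ambiguity_set(candidate: Sequence[float], centre: Sequence[float],
                     radius: float, bandwidth: float = 1.0) -> bool:
    """True if the candidate sample is within MMD distance `radius` of the centre."""
    return mmd_squared(candidate, centre, bandwidth) <= radius ** 2


# Toy samples, invented for illustration.
centre = [0.0, 0.5, 1.0, 1.5]
near = [0.1, 0.6, 1.1, 1.6]   # slightly shifted copy of the centre
far = [3.0, 3.5, 4.0, 4.5]    # strongly shifted copy of the centre

print(in_ambiguity_set(near, centre, radius=0.5))
print(in_ambiguity_set(far, centre, radius=0.5))
```

A larger radius admits more distributions into the set, which is the "stretching" that the abstract says leads to overly conservative decisions under misspecification.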

arXiv information

Authors	Charita Dellaporta, Patrick O’Hara, Theodoros Damoulas
Published	2025-05-06 14:46:16+00:00
arXiv site	arxiv_id(pdf)

Categories: cs.LG, stat.ML

Physics-Informed Sylvester Normalizing Flows for Bayesian Inference in Magnetic Resonance Spectroscopy

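Posted: May 7, 2025 | Author: jarxiv

Summary

Magnetic resonance spectroscopy (MRS) is a non-invasive technique for measuring the metabolic composition of tissues, offering valuable insights into neurological disorders, tumor detection, and other metabolic dysfunctions.
However, accurate metabolite quantification is hindered by challenges such as spectral overlap, low signal-to-noise ratio, and various artifacts.
Traditional methods such as linear-combination modeling are susceptible to ambiguities and commonly provide only a theoretical lower bound on estimation accuracy, in the form of the Cramér-Rao bound.
This work introduces a Bayesian inference framework that uses Sylvester normalizing flows (SNFs) to approximate posterior distributions over metabolite concentrations, improving quantification reliability.
A physics-based decoder incorporates prior knowledge of MRS signal formation, ensuring realistic distribution representations.
We validate the method on simulated 7T proton MRS data, demonstrating accurate metabolite quantification, well-calibrated uncertainties, and insights into parameter correlations and multi-modal distributions.

Abstract (original)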

Magnetic resonance spectroscopy (MRS) is a non-invasive technique to measure the metabolic composition of tissues, offering valuable insights into neurological disorders, tumor detection, and other metabolic dysfunctions. However, accurate metabolite quantification is hindered by challenges such as spectral overlap, low signal-to-noise ratio, and various artifacts. Traditional methods like linear-combination modeling are susceptible to ambiguities and commonly only provide a theoretical lower bound on estimation accuracy in the form of the Cramér-Rao bound. This work introduces a Bayesian inference framework using Sylvester normalizing flows (SNFs) to approximate posterior distributions over metabolite concentrations, enhancing quantification reliability. A physics-based decoder incorporates prior knowledge of MRS signal formation, ensuring realistic distribution representations. We validate the method on simulated 7T proton MRS data, demonstrating accurate metabolite quantification, well-calibrated uncertainties, and insights into parameter correlations and multi-modal distributions.

arXiv information

Authors	Julian P. Merkofer, Dennis M. J. van de Sande, Alex A. Bhogal, Ruud J. G. van Sloun
Published	2025-05-06 14:50:14+00:00
arXiv site	arxiv_id(pdf)

Categories: cs.LG, eess.SP, stat.ML

Anant-Net: Breaking the Curse of Dimensionality with Scalable and Interpretable Neural Surrogate for High-Dimensional PDEs

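Posted: May 7, 2025 | Author: jarxiv

Summary

High-dimensional partial differential equations (PDEs) arise in diverse scientific and engineering applications but remain computationally intractable due to the curse of dimensionality.
Traditional numerical methods struggle with the exponential growth in computational complexity, particularly on hypercubic domains, where the number of required collocation points increases rapidly with dimensionality.
Here we introduce Anant-Net, an efficient neural surrogate that overcomes this challenge and enables the solution of PDEs in high dimensions.
Unlike hyperspheres, whose internal volume diminishes as dimensionality increases, hypercubes retain or expand their volume (for unit or larger side length), which makes high-dimensional computations significantly more demanding.
Anant-Net efficiently incorporates high-dimensional boundary conditions and minimizes the PDE residual at high-dimensional collocation points.
To enhance interpretability, we integrate Kolmogorov-Arnold networks into the Anant-Net architecture.
We benchmark Anant-Net on several linear and nonlinear high-dimensional equations, including the Poisson, Sine-Gordon, and Allen-Cahn equations, and demonstrate high accuracy and robustness across test points sampled randomly from the high-dimensional space.
Importantly, Anant-Net achieves these results with remarkable efficiency, solving 300-dimensional problems on a single GPU within a few hours.
We also compare Anant-Net with other state-of-the-art methods in terms of accuracy and runtime.
Our findings establish Anant-Net as an accurate, interpretable, and scalable framework for efficiently solving high-dimensional PDEs.

Abstract (original)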

High-dimensional partial differential equations (PDEs) arise in diverse scientific and engineering applications but remain computationally intractable due to the curse of dimensionality. Traditional numerical methods struggle with the exponential growth in computational complexity, particularly on hypercubic domains, where the number of required collocation points increases rapidly with dimensionality. Here, we introduce Anant-Net, an efficient neural surrogate that overcomes this challenge, enabling the solution of PDEs in high dimensions. Unlike hyperspheres, where the internal volume diminishes as dimensionality increases, hypercubes retain or expand their volume (for unit or larger length), making high-dimensional computations significantly more demanding. Anant-Net efficiently incorporates high-dimensional boundary conditions and minimizes the PDE residual at high-dimensional collocation points. To enhance interpretability, we integrate Kolmogorov-Arnold networks into the Anant-Net architecture. We benchmark Anant-Net’s performance on several linear and nonlinear high-dimensional equations, including the Poisson, Sine-Gordon, and Allen-Cahn equations, demonstrating high accuracy and robustness across randomly sampled test points from high-dimensional space. Importantly, Anant-Net achieves these results with remarkable efficiency, solving 300-dimensional problems on a single GPU within a few hours. We also compare Anant-Net’s results for accuracy and runtime with other state-of-the-art methods. Our findings establish Anant-Net as an accurate, interpretable, and scalable framework for efficiently solving high-dimensional PDEs.
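
The contrast between hyperspheres and hypercubes can be checked numerically. The Python sketch below uses the standard closed form for the volume of a d-dimensional ball of radius r, V_d(r) = π^(d/2) r^d / Γ(d/2 + 1), evaluated in log space to avoid overflow, and compares it with the volume of a hypercube of a given side length. The function names and the chosen dimensions are assumptions for illustration and are not taken from the Anant-Net paper.

```python
import math


def ball_volume(dim: int, radius: float = 1.0) -> float:
    """Volume of a dim-dimensional ball: pi^(dim/2) * r^dim / Gamma(dim/2 + 1)."""
    log_volume = (dim / 2.0) * math.log(math.pi) \
        + dim * math.log(radius) \
        - math.lgamma(dim / 2.0 + 1.0)
    return math.exp(log_volume)


def cube_volume(dim: int, side: float = 1.0) -> float:
    """Volume of a dim-dimensional hypercube with the given side length."""
    return side ** dim


# The unit ball's volume shrinks toward zero as the dimension grows,
# while the unit cube keeps volume 1 and larger cubes grow exponentially.
for dim in (2, 3, 10, 100, 300):
    print(dim, ball_volume(dim), cube_volume(dim, 1.0), cube_volume(dim, 2.0))
```

Because the cube's volume does not shrink, a fixed density of collocation points requires a number of points that grows exponentially with the dimension, which is the difficulty the abstract refers to.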

arXiv information

Authors	Sidharth S. Menon, Ameya D. Jagtap
Published	2025-05-06 14:56:43+00:00
arXiv site	arxiv_id(pdf)

Categories: cs.LG
