jarxiv | Japanese arxiv | Page 758

Repurposing the scientific literature with vision-language models

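Posted: April 28, 2025 | Author: jarxiv

Summary

Leading vision-language models (VLMs) are trained on general Internet content and overlook the rich, domain-specific knowledge in scientific journals.
Training on specialty literature could yield high-performance, task-specific tools, allowing generative AI to match generalist models in specialty publishing, educational, and clinical tasks.
We created NeuroPubs, a multimodal dataset of 23,000 Neurosurgery Publications articles (134M words, 78K image-caption pairs).
Using NeuroPubs, VLMs generated publication-ready graphical abstracts (70% of 100 abstracts) and board-style questions indistinguishable from human-written ones (54% of 89,587 questions).
We used these questions to train CNS-Obsidian, a 34B-parameter VLM.
In a blinded, randomized controlled trial, our model demonstrated non-inferiority to the then state-of-the-art GPT-4o in neurosurgical differential diagnosis (clinical utility: 40.62% upvotes vs. 57.89%, p = 0.1150; accuracy: 59.38% vs. 65.79%, p = 0.3797).
Our pilot study shows that training generative AI models on specialty journal content, without large-scale Internet data, yields high-performance academic and clinical tools and enables domain-tailored AI across diverse fields.

Summary (original)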

Leading vision-language models (VLMs) are trained on general Internet content, overlooking scientific journals’ rich, domain-specific knowledge. Training on specialty-specific literature could yield high-performance, task-specific tools, enabling generative AI to match generalist models in specialty publishing, educational, and clinical tasks. We created NeuroPubs, a multimodal dataset of 23,000 Neurosurgery Publications articles (134M words, 78K image-caption pairs). Using NeuroPubs, VLMs generated publication-ready graphical abstracts (70% of 100 abstracts) and board-style questions indistinguishable from human-written ones (54% of 89,587 questions). We used these questions to train CNS-Obsidian, a 34B-parameter VLM. In a blinded, randomized controlled trial, our model demonstrated non-inferiority to then state-of-the-art GPT-4o in neurosurgical differential diagnosis (clinical utility, 40.62% upvotes vs. 57.89%, p=0.1150; accuracy, 59.38% vs. 65.79%, p=0.3797). Our pilot study demonstrates how training generative AI models on specialty-specific journal content – without large-scale internet data – results in high-performance academic and clinical tools, enabling domain-tailored AI across diverse fields.

arXiv information

Authors	Anton Alyakin, Jaden Stryker, Daniel Alexander Alber, Karl L. Sangwon, Jin Vivian Lee, Brandon Duderstadt, Akshay Save, David Kurland, Spencer Frome, Shrutika Singh, Jeff Zhang, Eunice Yang, Ki Yun Park, Cordelia Orillac, Aly A. Valliani, Sean Neifert, Albert Liu, Aneek Patel, Christopher Livia, Darryl Lau, Ilya Laufer, Peter A. Rozman, Eveline Teresa Hidalgo, Howard Riina, Rui Feng, Todd Hollon, Yindalon Aphinyanaphongs, John G. Golfinos, Laura Snyder, Eric Leuthardt, Douglas Kondziolka, Eric Karl Oermann
Published	2025-04-25 13:29:53+00:00
arXiv site	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google

Categories: cs.AI, cs.CL, cs.HC

Comparing Uncertainty Measurement and Mitigation Methods for Large Language Models: A Systematic Review

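Posted: April 28, 2025 | Author: jarxiv

Summary

Large language models (LLMs) have been transformative across many domains.
However, hallucination, that is, confidently outputting incorrect information, remains one of the leading challenges for LLMs.
This raises the question of how to accurately assess and quantify the uncertainty of LLMs.
An extensive literature on traditional models has explored uncertainty quantification (UQ) to measure uncertainty and has employed calibration techniques to address the misalignment between uncertainty and accuracy.
Although some of these methods have been adapted for LLMs, the literature lacks an in-depth analysis of their effectiveness and offers no comprehensive benchmark that would enable insightful comparison among existing solutions.
In this work, we fill this gap through a systematic survey of representative prior work on UQ and calibration for LLMs, and we introduce a rigorous benchmark.
Using two widely used reliability datasets, we empirically evaluate six related methods, and the results support the key findings of our review.
Finally, we give an outlook on key future directions and outline open challenges.
To the best of our knowledge, this survey is the first dedicated study to review calibration methods and relevant metrics for LLMs.

Summary (original)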

Large Language Models (LLMs) have been transformative across many domains. However, hallucination — confidently outputting incorrect information — remains one of the leading challenges for LLMs. This raises the question of how to accurately assess and quantify the uncertainty of LLMs. Extensive literature on traditional models has explored Uncertainty Quantification (UQ) to measure uncertainty and employed calibration techniques to address the misalignment between uncertainty and accuracy. While some of these methods have been adapted for LLMs, the literature lacks an in-depth analysis of their effectiveness and does not offer a comprehensive benchmark to enable insightful comparison among existing solutions. In this work, we fill this gap via a systematic survey of representative prior works on UQ and calibration for LLMs and introduce a rigorous benchmark. Using two widely used reliability datasets, we empirically evaluate six related methods, which justify the significant findings of our review. Finally, we provide outlooks for key future directions and outline open challenges. To the best of our knowledge, this survey is the first dedicated study to review the calibration methods and relevant metrics for LLMs.
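The "misalignment between uncertainty and accuracy" that calibration addresses can be made concrete with a small example. The abstract does not name the metrics it benchmarks, so the sketch below uses expected calibration error (ECE) with equal-width bins purely as a familiar illustration. The function name, the bin count, and the sample data are assumptions, not taken from the paper.

```python
from typing import List, Sequence, Tuple


def expected_calibration_error(
    confidences: Sequence[float],
    correct: Sequence[bool],
    n_bins: int = 10,
) -> float:
    """ECE: weighted average of |accuracy - mean confidence| over confidence bins."""
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    n = len(confidences)
    if n == 0:
        return 0.0

    # Assign each prediction to an equal-width bin over [0, 1].
    bins: List[List[Tuple[float, bool]]] = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)  # confidence 1.0 goes in the last bin
        bins[idx].append((conf, ok))

    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_conf = sum(conf for conf, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += (len(members) / n) * abs(accuracy - mean_conf)
    return ece


# Hypothetical data: the model claims 90% confidence but is right 3 times out of 4.
overconfident = expected_calibration_error([0.9, 0.9, 0.9, 0.9], [True, True, True, False])
```

A perfectly calibrated model would score 0 here. Larger values mean the stated confidence and the observed accuracy disagree more.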

arXiv information

Authors	Toghrul Abbasli, Kentaroh Toyoda, Yuan Wang, Leon Witt, Muhammad Asif Ali, Yukai Miao, Dan Li, Qingsong Wei
Published	2025-04-25 13:34:40+00:00
arXiv site	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google

Categories: cs.AI, cs.CL

The Rise of Small Language Models in Healthcare: A Comprehensive Survey

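Posted: April 28, 2025 | Author: jarxiv

Summary

Despite substantial progress in healthcare applications driven by large language models (LLMs), concerns around data privacy are growing and resources remain limited.
Small language models (SLMs) offer a scalable and clinically viable solution for efficient performance in resource-constrained environments for next-generation healthcare informatics.
Our comprehensive survey presents a taxonomic framework to identify and categorize them for healthcare professionals and informaticians.
The timeline of healthcare SLM contributions establishes a foundational framework for analyzing models across three dimensions: NLP tasks, stakeholder roles, and the continuum of care.
We present a taxonomic framework to identify the architectural foundations for building models from scratch; the adaptation of SLMs to clinical precision through prompting, instruction fine-tuning, and reasoning; and accessibility and sustainability through compression techniques.
Our primary objective is to offer healthcare professionals a comprehensive survey, introducing recent innovations in model optimization and equipping them with curated resources to support future research and development in the field.
Aiming to showcase the groundbreaking advancements in SLMs for healthcare, we compile experimental results across widely studied NLP tasks in healthcare to highlight the transformative potential of SLMs.
The updated repository is available on GitHub.

Summary (original)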

Despite substantial progress in healthcare applications driven by large language models (LLMs), growing concerns around data privacy, and limited resources; the small language models (SLMs) offer a scalable and clinically viable solution for efficient performance in resource-constrained environments for next-generation healthcare informatics. Our comprehensive survey presents a taxonomic framework to identify and categorize them for healthcare professionals and informaticians. The timeline of healthcare SLM contributions establishes a foundational framework for analyzing models across three dimensions: NLP tasks, stakeholder roles, and the continuum of care. We present a taxonomic framework to identify the architectural foundations for building models from scratch; adapting SLMs to clinical precision through prompting, instruction fine-tuning, and reasoning; and accessibility and sustainability through compression techniques. Our primary objective is to offer a comprehensive survey for healthcare professionals, introducing recent innovations in model optimization and equipping them with curated resources to support future research and development in the field. Aiming to showcase the groundbreaking advancements in SLMs for healthcare, we present a comprehensive compilation of experimental results across widely studied NLP tasks in healthcare to highlight the transformative potential of SLMs in healthcare. The updated repository is available at Github

arXiv information

Authors	Muskan Garg, Shaina Raza, Shebuti Rayana, Xingyi Liu, Sunghwan Sohn
Published	2025-04-25 13:42:19+00:00
arXiv site	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google

Categories: cs.AI, cs.CL

Testing Individual Fairness in Graph Neural Networks

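Posted: April 28, 2025 | Author: jarxiv

Summary

Bias in artificial intelligence (AI) models can lead to automated decision-making processes that discriminate against groups and/or individuals based on sensitive properties such as gender and race.
Although many studies address diagnosing and mitigating bias in various AI models, there is little research on individual fairness in graph neural networks (GNNs).
Unlike traditional models, which treat data features independently and overlook their interrelationships, GNNs are designed to capture graph-based structure in which nodes are interconnected.
This relational approach lets GNNs model complex dependencies, but it also means that bias can propagate through these connections, complicating the detection and mitigation of individual fairness violations.
This PhD project aims to develop a testing framework to assess and ensure individual fairness in GNNs.
It first systematically reviews the literature on individual fairness, categorizing existing approaches to defining, measuring, testing, and mitigating model bias, and creating a taxonomy of individual fairness.
Next, the project will develop a framework for testing and ensuring fairness in GNNs by adapting and extending current fairness testing and mitigation techniques.
The framework will be evaluated through industrial case studies focusing on graph-based large language models.

Summary (original)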

The biases in artificial intelligence (AI) models can lead to automated decision-making processes that discriminate against groups and/or individuals based on sensitive properties such as gender and race. While there are many studies on diagnosing and mitigating biases in various AI models, there is little research on individual fairness in Graph Neural Networks (GNNs). Unlike traditional models, which treat data features independently and overlook their inter-relationships, GNNs are designed to capture graph-based structure where nodes are interconnected. This relational approach enables GNNs to model complex dependencies, but it also means that biases can propagate through these connections, complicating the detection and mitigation of individual fairness violations. This PhD project aims to develop a testing framework to assess and ensure individual fairness in GNNs. It first systematically reviews the literature on individual fairness, categorizing existing approaches to define, measure, test, and mitigate model biases, creating a taxonomy of individual fairness. Next, the project will develop a framework for testing and ensuring fairness in GNNs by adapting and extending current fairness testing and mitigation techniques. The framework will be evaluated through industrial case studies, focusing on graph-based large language models.

arXiv information

Authors	Roya Nasiri
Published	2025-04-25 13:45:24+00:00
arXiv site	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google

Categories: cs.AI, cs.LG

CR-LSO: Convex Neural Architecture Optimization in the Latent Space of Graph Variational Autoencoder with Input Convex Neural Networks

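Posted: April 28, 2025 | Author: jarxiv

Summary

In neural architecture search (NAS) methods based on latent space optimization (LSO), a deep generative model is trained to embed discrete neural architectures into a continuous latent space.
Various optimization algorithms that operate in the continuous space can then be used to search for neural architectures.
However, optimizing the latent variables is challenging for gradient-based LSO because the mapping from the latent space to architecture performance is generally non-convex.
To tackle this problem, this paper develops a convexity regularized latent space optimization (CR-LSO) method, which regularizes the learning process of the latent space in order to obtain a convex architecture performance mapping.
Specifically, CR-LSO trains a graph variational autoencoder (G-VAE) to learn continuous representations of discrete architectures.
At the same time, the learning process of the latent space is regularized by the guaranteed convexity of input convex neural networks (ICNNs).
In this way, the G-VAE is forced to learn a convex mapping from the architecture representation to the architecture performance.
CR-LSO then approximates the performance mapping using the ICNN and leverages the estimated gradient to optimize neural architecture representations.
Experimental results on three popular NAS benchmarks show that CR-LSO achieves competitive evaluation results in terms of both computational complexity and architecture performance.

Summary (original)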

In neural architecture search (NAS) methods based on latent space optimization (LSO), a deep generative model is trained to embed discrete neural architectures into a continuous latent space. In this case, different optimization algorithms that operate in the continuous space can be implemented to search neural architectures. However, the optimization of latent variables is challenging for gradient-based LSO since the mapping from the latent space to the architecture performance is generally non-convex. To tackle this problem, this paper develops a convexity regularized latent space optimization (CR-LSO) method, which aims to regularize the learning process of latent space in order to obtain a convex architecture performance mapping. Specifically, CR-LSO trains a graph variational autoencoder (G-VAE) to learn the continuous representations of discrete architectures. Simultaneously, the learning process of latent space is regularized by the guaranteed convexity of input convex neural networks (ICNNs). In this way, the G-VAE is forced to learn a convex mapping from the architecture representation to the architecture performance. Hereafter, the CR-LSO approximates the performance mapping using the ICNN and leverages the estimated gradient to optimize neural architecture representations. Experimental results on three popular NAS benchmarks show that CR-LSO achieves competitive evaluation results in terms of both computational complexity and architecture performance.
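The "guaranteed convexity" of an ICNN comes from how its weights are constrained. The following is a minimal sketch of that idea, not the CR-LSO implementation. It assumes a two-hidden-layer network with ReLU activations in which every weight acting on a previous hidden layer is non-negative, while direct connections from the input are unconstrained. The class name `TinyICNN`, the layer sizes, and the random initialization are hypothetical. The input `x` stands in for a latent architecture representation.

```python
import random
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def matvec(W: Matrix, x: Vector) -> Vector:
    """Matrix-vector product."""
    return [sum(w * a for w, a in zip(row, x)) for row in W]


def add(*vectors: Vector) -> Vector:
    """Element-wise sum of equally sized vectors."""
    return [sum(parts) for parts in zip(*vectors)]


def relu(v: Vector) -> Vector:
    """ReLU is convex and non-decreasing, which preserves convexity."""
    return [max(0.0, a) for a in v]


class TinyICNN:
    """Scalar-valued network that is convex in its input by construction."""

    def __init__(self, in_dim: int, hidden: int, seed: int = 0) -> None:
        rng = random.Random(seed)

        def rand(rows: int, cols: int, nonneg: bool = False) -> Matrix:
            low = 0.0 if nonneg else -1.0
            return [[rng.uniform(low, 1.0) for _ in range(cols)] for _ in range(rows)]

        self.W0x = rand(hidden, in_dim)               # input weights: unconstrained
        self.b0 = rand(1, hidden)[0]
        self.W1z = rand(hidden, hidden, nonneg=True)  # hidden-to-hidden: non-negative
        self.W1x = rand(hidden, in_dim)               # skip connection from the input
        self.b1 = rand(1, hidden)[0]
        self.W2z = rand(1, hidden, nonneg=True)       # hidden-to-output: non-negative
        self.W2x = rand(1, in_dim)
        self.b2 = rand(1, 1)[0]

    def __call__(self, x: Vector) -> float:
        z1 = relu(add(matvec(self.W0x, x), self.b0))
        z2 = relu(add(matvec(self.W1z, z1), matvec(self.W1x, x), self.b1))
        return add(matvec(self.W2z, z2), matvec(self.W2x, x), self.b2)[0]


predictor = TinyICNN(in_dim=3, hidden=4, seed=1)
score = predictor([0.1, -0.3, 0.7])
```

Because non-negative combinations of convex functions are convex, and a convex non-decreasing activation preserves convexity, the output is convex in `x` for any choice of the unconstrained weights. This is the property that makes gradient-based search over the input well behaved.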

arXiv information

Authors	Xuan Rao, Bo Zhao, Derong Liu
Published	2025-04-25 14:04:12+00:00
arXiv site	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google

Categories: cs.AI, cs.LG

Pushing the boundary on Natural Language Inference

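Posted: April 28, 2025 | Author: jarxiv

Summary

Natural language inference (NLI) is a central task in natural language understanding, with applications in fact-checking, question answering, and information retrieval.
Despite its importance, current NLI systems rely heavily on supervised learning with datasets that often contain annotation artifacts and biases, which limits generalization and real-world applicability.
In this work, we apply a reinforcement learning-based approach using Group Relative Policy Optimization (GRPO) for Chain-of-Thought (CoT) learning in NLI, eliminating the need for labeled rationales and enabling this type of training on more challenging datasets such as ANLI.
We fine-tune 7B, 14B, and 32B language models using parameter-efficient techniques (LoRA and QLoRA), demonstrating strong performance across standard and adversarial NLI benchmarks.
Our 32B AWQ-quantized model surpasses state-of-the-art results on 7 out of 11 adversarial sets, or on all of them when our replication is considered, within a 22GB memory footprint, showing that robust reasoning can be retained under aggressive quantization.
This work provides a scalable and practical framework for building robust NLI systems without sacrificing inference quality.

Summary (original)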

Natural Language Inference (NLI) is a central task in natural language understanding with applications in fact-checking, question answering, and information retrieval. Despite its importance, current NLI systems heavily rely on supervised learning with datasets that often contain annotation artifacts and biases, limiting generalization and real-world applicability. In this work, we apply a reinforcement learning-based approach using Group Relative Policy Optimization (GRPO) for Chain-of-Thought (CoT) learning in NLI, eliminating the need for labeled rationales and enabling this type of training on more challenging datasets such as ANLI. We fine-tune 7B, 14B, and 32B language models using parameter-efficient techniques (LoRA and QLoRA), demonstrating strong performance across standard and adversarial NLI benchmarks. Our 32B AWQ-quantized model surpasses state-of-the-art results on 7 out of 11 adversarial sets – or on all of them considering our replication – within a 22GB memory footprint, showing that robust reasoning can be retained under aggressive quantization. This work provides a scalable and practical framework for building robust NLI systems without sacrificing inference quality.
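GRPO can do without labeled rationales because each sampled answer is scored only relative to the other answers drawn for the same prompt. The sketch below illustrates that group-relative normalization under stated assumptions: a hypothetical exact-match reward on the final NLI label, and the population standard deviation with a small epsilon for stability. The paper's actual reward design and training loop are not described in the abstract and are not reproduced here.

```python
from statistics import mean, pstdev
from typing import List, Sequence


def label_reward(predicted: str, gold: str) -> float:
    """Hypothetical reward: 1.0 if the final label matches the gold label, else 0.0."""
    return 1.0 if predicted.strip().lower() == gold.strip().lower() else 0.0


def group_relative_advantages(rewards: Sequence[float], eps: float = 1e-8) -> List[float]:
    """Normalize each reward against its own sampling group: (r - mean) / (std + eps)."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # assumption: population std; implementations may differ
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four sampled chain-of-thought completions for one premise/hypothesis pair.
sampled_labels = ["entailment", "neutral", "contradiction", "Entailment"]
rewards = [label_reward(label, "entailment") for label in sampled_labels]
advantages = group_relative_advantages(rewards)
```

Completions that beat their group average receive a positive advantage and the rest a negative one. If every completion in a group earns the same reward, all advantages are zero and that group contributes no learning signal.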

arXiv information

Authors	Pablo Miralles-González, Javier Huertas-Tato, Alejandro Martín, David Camacho
Published	2025-04-25 14:20:57+00:00
arXiv site	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google

Categories: cs.AI, cs.CL

Spatial Reasoner: A 3D Inference Pipeline for XR Applications

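Posted: April 28, 2025 | Author: jarxiv

Summary

Modern extended reality (XR) systems provide rich analysis of image data and fusion of sensor input, and they call for AR/VR applications that can reason about 3D scenes in a semantic manner.
We present a spatial reasoning framework that bridges geometric facts with symbolic predicates and relations to handle key tasks such as determining how 3D objects are arranged relative to each other ("on", "behind", "near", etc.).
Its foundation relies on oriented 3D bounding box representations, enhanced by a comprehensive set of spatial predicates ranging from topology and connectivity to directionality and orientation, expressed in a formalism related to natural language.
The derived predicates form a spatial knowledge graph and, combined with a pipeline-based inference model, enable spatial queries and dynamic rule evaluation.
Implementations for client- and server-side processing demonstrate the framework's capability to efficiently translate geometric data into actionable knowledge, ensuring scalable and technology-independent spatial reasoning in complex 3D environments.
The Spatial Reasoner framework fosters the creation of spatial ontologies and integrates seamlessly with, and thereby enriches, machine learning, natural language processing, and rule systems in XR applications.

Summary (original)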

Modern extended reality XR systems provide rich analysis of image data and fusion of sensor input and demand AR/VR applications that can reason about 3D scenes in a semantic manner. We present a spatial reasoning framework that bridges geometric facts with symbolic predicates and relations to handle key tasks such as determining how 3D objects are arranged among each other (‘on’, ‘behind’, ‘near’, etc.). Its foundation relies on oriented 3D bounding box representations, enhanced by a comprehensive set of spatial predicates, ranging from topology and connectivity to directionality and orientation, expressed in a formalism related to natural language. The derived predicates form a spatial knowledge graph and, in combination with a pipeline-based inference model, enable spatial queries and dynamic rule evaluation. Implementations for client- and server-side processing demonstrate the framework’s capability to efficiently translate geometric data into actionable knowledge, ensuring scalable and technology-independent spatial reasoning in complex 3D environments. The Spatial Reasoner framework is fostering the creation of spatial ontologies, and seamlessly integrates with and therefore enriches machine learning, natural language processing, and rule systems in XR applications.
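A small example shows how geometric facts can become symbolic predicates and then knowledge-graph triples. It is a simplified sketch, not the Spatial Reasoner API. It assumes axis-aligned boxes instead of the oriented boxes the abstract describes, a y-up coordinate system, and hypothetical thresholds (`max_gap`, `tol`). All class and function names are invented for illustration.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple

Vec3 = Tuple[float, float, float]
Triple = Tuple[str, str, str]


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box (simplification of an oriented box)."""
    name: str
    center: Vec3
    size: Vec3  # extents along x, y (up), z

    def lo(self, axis: int) -> float:
        return self.center[axis] - self.size[axis] / 2

    def hi(self, axis: int) -> float:
        return self.center[axis] + self.size[axis] / 2


def gap(a: Box, b: Box) -> float:
    """Shortest distance between two boxes; 0.0 if they touch or overlap."""
    per_axis = [max(0.0, a.lo(i) - b.hi(i), b.lo(i) - a.hi(i)) for i in range(3)]
    return math.sqrt(sum(g * g for g in per_axis))


def near(a: Box, b: Box, max_gap: float = 0.5) -> bool:
    return gap(a, b) <= max_gap


def on(a: Box, b: Box, tol: float = 0.01) -> bool:
    """a rests on b: footprints overlap in x/z and a's bottom meets b's top."""
    footprints_overlap = all(a.lo(i) < b.hi(i) and b.lo(i) < a.hi(i) for i in (0, 2))
    return footprints_overlap and abs(a.lo(1) - b.hi(1)) <= tol


def spatial_facts(boxes: List[Box]) -> List[Triple]:
    """Derive (subject, predicate, object) triples for every ordered pair."""
    facts: List[Triple] = []
    for a in boxes:
        for b in boxes:
            if a is b:
                continue
            if on(a, b):
                facts.append((a.name, "on", b.name))
            if near(a, b):
                facts.append((a.name, "near", b.name))
    return facts


table = Box("table", (0.0, 0.4, 0.0), (1.0, 0.8, 1.0))
cup = Box("cup", (0.2, 0.85, 0.1), (0.1, 0.1, 0.1))
lamp = Box("lamp", (3.0, 0.5, 0.0), (0.2, 1.0, 0.2))
facts = spatial_facts([table, cup, lamp])
```

The resulting triples can be read as edges of a small knowledge graph. A query such as "what is on the table?" then becomes a filter over `facts`.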

arXiv information

Authors	Steven Häsler, Philipp Ackermann
Published	2025-04-25 14:27:27+00:00
arXiv site	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google

Categories: cs.AI, cs.GR, cs.HC, cs.SE, extended reality, knowledge representation, spatial computing, spatial reasoning

Bridge the Domains: Large Language Models Enhanced Cross-domain Sequential Recommendation

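Posted: April 28, 2025 | Author: jarxiv

Summary

Cross-domain Sequential Recommendation (CDSR) aims to extract preferences from a user's historical interactions across various domains.
Despite some progress in CDSR, two problems stand in the way of further advances: the overlap dilemma and transition complexity.
The former means that existing CDSR methods rely heavily on users who have interactions in all domains to learn cross-domain item relationships, which compromises practicality.
The latter refers to the difficulty of learning complex transition patterns from mixed behavior sequences.
With their powerful representation and reasoning abilities, large language models (LLMs) are promising for addressing these two problems by bridging items and capturing user preferences from a semantic view.
Therefore, we propose an LLMs Enhanced Cross-domain Sequential Recommendation model (LLM4CDSR).
To obtain semantic item relationships, we first propose an LLM-based unified representation module to represent items.
Then, a trainable adapter with contrastive regularization is designed to adapt to the CDSR task.
In addition, a hierarchical LLMs profiling module is designed to summarize users' cross-domain preferences.
Finally, these two modules are integrated into the proposed tri-thread framework to derive recommendations.
We conducted extensive experiments on three public cross-domain datasets, validating the effectiveness of LLM4CDSR.
We have released the code online.

Summary (original)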

Cross-domain Sequential Recommendation (CDSR) aims to extract the preference from the user’s historical interactions across various domains. Despite some progress in CDSR, two problems set the barrier for further advancements, i.e., overlap dilemma and transition complexity. The former means existing CDSR methods severely rely on users who own interactions on all domains to learn cross-domain item relationships, compromising the practicability. The latter refers to the difficulties in learning the complex transition patterns from the mixed behavior sequences. With powerful representation and reasoning abilities, Large Language Models (LLMs) are promising to address these two problems by bridging the items and capturing the user’s preferences from a semantic view. Therefore, we propose an LLMs Enhanced Cross-domain Sequential Recommendation model (LLM4CDSR). To obtain the semantic item relationships, we first propose an LLM-based unified representation module to represent items. Then, a trainable adapter with contrastive regularization is designed to adapt the CDSR task. Besides, a hierarchical LLMs profiling module is designed to summarize user cross-domain preferences. Finally, these two modules are integrated into the proposed tri-thread framework to derive recommendations. We have conducted extensive experiments on three public cross-domain datasets, validating the effectiveness of LLM4CDSR. We have released the code online.

arXiv information

Authors	Qidong Liu, Xiangyu Zhao, Yejing Wang, Zijian Zhang, Howard Zhong, Chong Chen, Xiang Li, Wei Huang, Feng Tian
Published	2025-04-25 14:30:25+00:00
arXiv site	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google

Categories: cs.AI, cs.IR

Decoding complexity: how machine learning is redefining scientific discovery

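Posted: April 28, 2025 | Author: jarxiv

Summary

As modern scientific instruments generate vast amounts of data and the volume of information in the scientific literature continues to grow, machine learning (ML) has become an essential tool for organizing, analyzing, and interpreting these complex datasets.
This paper explores the transformative role of ML in accelerating breakthroughs across a range of scientific disciplines.
By presenting key examples such as brain mapping and exoplanet detection, we show how ML is reshaping scientific research.
We also explore scenarios in which different levels of knowledge about the underlying phenomenon are available, identifying strategies to overcome limitations and unlock the full potential of ML.
Despite its advances, the growing reliance on ML poses challenges for research applications and for the rigorous validation of discoveries.
We argue that, even with these challenges, ML is poised to disrupt traditional methodologies and advance the boundaries of knowledge by enabling researchers to tackle increasingly complex problems.
Thus, the scientific community can move beyond the traditional oversimplifications that were once necessary and embrace the full complexity of natural systems, ultimately paving the way for interdisciplinary breakthroughs and innovative solutions to humanity's most pressing challenges.

Summary (original)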

As modern scientific instruments generate vast amounts of data and the volume of information in the scientific literature continues to grow, machine learning (ML) has become an essential tool for organising, analysing, and interpreting these complex datasets. This paper explores the transformative role of ML in accelerating breakthroughs across a range of scientific disciplines. By presenting key examples — such as brain mapping and exoplanet detection — we demonstrate how ML is reshaping scientific research. We also explore different scenarios where different levels of knowledge of the underlying phenomenon are available, identifying strategies to overcome limitations and unlock the full potential of ML. Despite its advances, the growing reliance on ML poses challenges for research applications and rigorous validation of discoveries. We argue that even with these challenges, ML is poised to disrupt traditional methodologies and advance the boundaries of knowledge by enabling researchers to tackle increasingly complex problems. Thus, the scientific community can move beyond the necessary traditional oversimplifications to embrace the full complexity of natural systems, ultimately paving the way for interdisciplinary breakthroughs and innovative solutions to humanity’s most pressing challenges.

arXiv information

Authors	Ricardo Vinuesa, Paola Cinnella, Jean Rabault, Hossein Azizpour, Stefan Bauer, Bingni W. Brunton, Arne Elofsson, Elias Jarlebring, Hedvig Kjellstrom, Stefano Markidis, David Marlevi, Javier Garcia-Martinez, Steven L. Brunton
Published	2025-04-25 14:35:04+00:00
arXiv site	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google

Categories: cs.AI, cs.LG

Application of linear regression and quasi-Newton methods to the deep reinforcement learning in continuous action cases

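Posted: April 28, 2025 | Author: jarxiv

Summary

The linear regression (LR) method has the advantage that optimal parameters can be calculated relatively easily, although its representation capability is more limited than that of deep learning techniques.
To improve deep reinforcement learning, Levine et al. proposed the Least Squares Deep Q Network (LS-DQN) method, which combines the Deep Q Network (DQN) with the LR method.
However, the LS-DQN method assumes that actions are discrete.
In this study, we propose the Double Least Squares Deep Deterministic Policy Gradient (DLS-DDPG) method to address this limitation.
This method combines the LR method with the Deep Deterministic Policy Gradient (DDPG) technique, one of the representative deep reinforcement learning algorithms for continuous action cases.
For the LR update of the critic network, DLS-DDPG uses an algorithm similar to Fitted Q iteration, the method adopted by LS-DQN.
In addition, we calculated the optimal action using the quasi-Newton method and used it both as the agent's action and as the training data for the LR update of the actor network.
Numerical experiments conducted in MuJoCo environments showed that the proposed method improved performance in at least some tasks, although difficulties remain, such as the inability to make the regularization terms small.

Summary (original)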

The linear regression (LR) method offers the advantage that optimal parameters can be calculated relatively easily, although its representation capability is limited than that of the deep learning technique. To improve deep reinforcement learning, the Least Squares Deep Q Network (LS-DQN) method was proposed by Levine et al., which combines Deep Q Network (DQN) with LR method. However, the LS-DQN method assumes that the actions are discrete. In this study, we propose the Double Least Squares Deep Deterministic Policy Gradient (DLS-DDPG) method to address this limitation. This method combines the LR method with the Deep Deterministic Policy Gradient (DDPG) technique, one of the representative deep reinforcement learning algorithms for continuous action cases. For the LR update of the critic network, DLS-DDPG uses an algorithm similar to the Fitted Q iteration, the method which LS-DQN adopted. In addition, we calculated the optimal action using the quasi-Newton method and used it as both the agent’s action and the training data for the LR update of the actor network. Numerical experiments conducted in MuJoCo environments showed that the proposed method improved performance at least in some tasks, although there are difficulties such as the inability to make the regularization terms small.
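The claim that LR parameters "can be calculated relatively easily" refers to the closed-form solution of a regularized least-squares problem. The sketch below shows only that generic step, fitting the weights of a linear output layer to regression targets with a ridge penalty, w = (ΦᵀΦ + λI)⁻¹ Φᵀy. It is not the DLS-DDPG algorithm. The feature matrix, targets, and function names are hypothetical, and the linear system is solved with plain Gaussian elimination to keep the example dependency-free.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def solve(A: Matrix, b: Vector) -> Vector:
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular; try a larger regularization term")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x


def ridge_fit(features: Matrix, targets: Vector, lam: float) -> Vector:
    """Closed-form ridge regression: solve (Phi^T Phi + lam I) w = Phi^T y."""
    d = len(features[0])
    A = [
        [sum(row[i] * row[j] for row in features) + (lam if i == j else 0.0) for j in range(d)]
        for i in range(d)
    ]
    b = [sum(row[i] * t for row, t in zip(features, targets)) for i in range(d)]
    return solve(A, b)


# Hypothetical last-layer features and regression targets (e.g. fitted Q-values).
phi = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
y = [2.0, 3.0, 5.0]
w_unregularized = ridge_fit(phi, y, lam=0.0)
w_regularized = ridge_fit(phi, y, lam=10.0)
```

A larger `lam` shrinks the weights toward zero and keeps the system well conditioned. The abstract notes that in the authors' experiments the regularization terms could not be made small.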

arXiv information

Authors	Hisato Komatsu
Published	2025-04-25 14:36:54+00:00
arXiv site	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google

Categories: cs.AI, cs.LG
