Bridging Legal Knowledge and AI: Retrieval-Augmented Generation with Vector Stores, Knowledge Graphs, and Hierarchical Non-negative Matrix Factorization

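Summary

Agentic generative AI, powered by large language models (LLMs) combined with retrieval-augmented generation (RAG), knowledge graphs (KGs), and vector stores (VSs), is a transformative technology applicable to specialized domains such as legal systems, research, recommender systems, cybersecurity, and global security.
This technology excels at inferring relationships within vast unstructured or semi-structured datasets.
The legal domain consists of extensive, interrelated, semi-structured knowledge with complex relationships.
It comprises constitutions, statutes, regulations, and case law.
Extracting insights and navigating the intricate network of legal documents and their relationships is crucial for effective legal research.
Here, we introduce a generative AI system that integrates RAG, a VS, and a KG constructed via non-negative matrix factorization (NMF) to enhance legal information retrieval and AI reasoning and to minimize hallucinations.
In the legal system, these technologies enable AI agents to identify and analyze complex connections among cases, statutes, and legal precedents, uncovering hidden relationships and predicting legal trends; these are challenging tasks that are essential for ensuring justice and improving operational efficiency.
Our system uses web scraping to systematically collect legal texts, such as statutes, constitutional provisions, and case law, from publicly accessible platforms such as Justia.
By leveraging advanced semantic representations, hierarchical relationships, and latent topic discovery, it bridges the gap between traditional keyword-based search and contextual understanding.
The framework supports clustering, summarization, and cross-referencing of legal documents for scalable, interpretable, and accurate retrieval over semi-structured data, while advancing computational law and AI.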

Abstract (original)

Agentic Generative AI, powered by Large Language Models (LLMs) with Retrieval-Augmented Generation (RAG), Knowledge Graphs (KGs), and Vector Stores (VSs), represents a transformative technology applicable to specialized domains such as legal systems, research, recommender systems, cybersecurity, and global security, including proliferation research. This technology excels at inferring relationships within vast unstructured or semi-structured datasets. The legal domain here comprises complex data characterized by extensive, interrelated, and semi-structured knowledge systems with complex relations. It comprises constitutions, statutes, regulations, and case law. Extracting insights and navigating the intricate networks of legal documents and their relations is crucial for effective legal research. Here, we introduce a generative AI system that integrates RAG, VS, and KG, constructed via Non-Negative Matrix Factorization (NMF), to enhance legal information retrieval and AI reasoning and minimize hallucinations. In the legal system, these technologies empower AI agents to identify and analyze complex connections among cases, statutes, and legal precedents, uncovering hidden relationships and predicting legal trends, challenging tasks that are essential for ensuring justice and improving operational efficiency. Our system employs web scraping techniques to systematically collect legal texts, such as statutes, constitutional provisions, and case law, from publicly accessible platforms like Justia. It bridges the gap between traditional keyword-based searches and contextual understanding by leveraging advanced semantic representations, hierarchical relationships, and latent topic discovery. This framework supports legal document clustering, summarization, and cross-referencing for scalable, interpretable, and accurate retrieval over semi-structured data while advancing computational law and AI.
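
The abstract names non-negative matrix factorization as the tool for latent topic discovery. As a rough illustration only, the sketch below runs plain single-level NMF with Lee-Seung multiplicative updates on a toy term-count matrix in pure Python. The four toy documents, the `nmf` function, and its parameters are assumptions made for this example; it is not the authors' hierarchical pipeline, and it omits the RAG, vector store, and knowledge graph components.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def frobenius_error(x, w, h):
    """Frobenius norm of X - WH."""
    wh = matmul(w, h)
    return sum((xv - av) ** 2 for xr, ar in zip(x, wh) for xv, av in zip(xr, ar)) ** 0.5


def nmf(x, k, iters=200, seed=0, eps=1e-9):
    """Factor non-negative X (docs x terms) into W (docs x k) and H (k x terms).

    Hypothetical helper for illustration; uses multiplicative updates.
    """
    rng = random.Random(seed)
    n, m = len(x), len(x[0])
    w = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(k)]
    errors = [frobenius_error(x, w, h)]
    for _ in range(iters):
        # H <- H * (W^T X) / (W^T W H)
        wt = transpose(w)
        num, den = matmul(wt, x), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)] for i in range(k)]
        # W <- W * (X H^T) / (W H H^T)
        ht = transpose(h)
        num, den = matmul(x, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(k)] for i in range(n)]
        errors.append(frobenius_error(x, w, h))
    return w, h, errors


# Toy corpus (invented for this example): two tax-law and two appellate snippets.
DOCS = [
    "statute tax income statute",
    "tax income regulation",
    "court appeal precedent court",
    "appeal precedent ruling",
]
vocab = sorted({term for doc in DOCS for term in doc.split()})
X = [[doc.split().count(term) for term in vocab] for doc in DOCS]

W, H, errors = nmf(X, k=2)

# Assign each document to the topic with the largest weight in its row of W.
dominant = [max(range(2), key=lambda j: row[j]) for row in W]
print("dominant topic per document:", dominant)
print("reconstruction error: %.3f -> %.3f" % (errors[0], errors[-1]))
```

Each row of `W` holds one document's topic weights and each row of `H` holds one topic's term weights. A hierarchical variant would presumably re-factorize the documents grouped under each topic, though the abstract does not spell out that procedure.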

arXiv information

Authors	Ryan C. Barron, Maksim E. Eren, Olga M. Serafimova, Cynthia Matuszek, Boian S. Alexandrov
Published	2025-02-27 18:35:39+00:00
arXiv page	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google
