Safeguarding Decentralized Social Media: LLM Agents for Automating Community Rule Compliance

Summary

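Ensuring that content complies with community guidelines is essential for maintaining a healthy online social environment.
However, traditional human compliance checking struggles to scale, because the volume of user-generated content keeps growing while the number of moderators remains limited.
Recent advances in natural language understanding, as demonstrated by large language models, open up new opportunities for automated content compliance verification.
This work evaluates six AI agents built on open LLMs for automated rule compliance checking in decentralized social networks, a challenging environment because community scopes and rules are heterogeneous.
An analysis of more than 50,000 posts from hundreds of Mastodon servers shows that the AI agents effectively detect non-compliant content, grasp linguistic subtleties, and adapt to diverse community contexts.
Most agents also show high inter-rater reliability and are consistent in their score justifications and compliance suggestions.
A human evaluation by domain experts confirmed the agents' reliability and usefulness, making them promising tools for semi-automated or human-in-the-loop content moderation systems.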

Summary (original)

Ensuring content compliance with community guidelines is crucial for maintaining healthy online social environments. However, traditional human-based compliance checking struggles with scaling due to the increasing volume of user-generated content and a limited number of moderators. Recent advancements in Natural Language Understanding demonstrated by Large Language Models unlock new opportunities for automated content compliance verification. This work evaluates six AI-agents built on Open-LLMs for automated rule compliance checking in Decentralized Social Networks, a challenging environment due to heterogeneous community scopes and rules. Analyzing over 50,000 posts from hundreds of Mastodon servers, we find that AI-agents effectively detect non-compliant content, grasp linguistic subtleties, and adapt to diverse community contexts. Most agents also show high inter-rater reliability and consistency in score justification and suggestions for compliance. Human-based evaluation with domain experts confirmed the agents’ reliability and usefulness, rendering them promising tools for semi-automated or human-in-the-loop content moderation systems.

arXiv information

Authors	Lucio La Cava, Andrea Tagarelli
Published	2024-09-13 16:29:25+00:00
arXiv page	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
