RLHF-V: Towards Trustworthy MLLMs via Behavior Alignment from Fine-grained Correctional Human Feedback

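Summary

Multimodal large language models (MLLMs) have recently shown impressive capabilities in multimodal understanding, reasoning, and interaction.
However, existing MLLMs commonly suffer from serious hallucination problems: they generate text that is not factually grounded in the associated images.
This makes existing MLLMs untrustworthy and therefore impractical for real-world applications, especially high-stakes ones.
To address this challenge, we present RLHF-V, which improves MLLM trustworthiness through behavior alignment from fine-grained correctional human feedback.
Specifically, RLHF-V collects human preferences in the form of segment-level corrections of hallucinations and performs dense direct preference optimization over that feedback.
Comprehensive experiments on five benchmarks, under both automatic and human evaluation, show that RLHF-V yields substantially more trustworthy MLLM behavior with promising data and computation efficiency.
Notably, using 1.4k annotated data samples, RLHF-V reduces the hallucination rate of the base MLLM by 34.8%, outperforming the concurrent LLaVA-RLHF, which was trained on 10k annotated samples.
The final model achieves state-of-the-art trustworthiness among open-source MLLMs and is more robust than GPT-4V at preventing hallucinations that arise from over-generalization.
The code, model, and data are open-sourced at https://github.com/RLHF-V/RLHF-V.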
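The summary names "dense direct preference optimization" over segment-level corrections but gives no formula. The sketch below is one possible reading, not the authors' confirmed method: it combines the standard DPO loss with a sequence score that up-weights tokens inside human-corrected segments. The function names, the `corrected_mask` input, and the weight `gamma` (with its arbitrary default) are all hypothetical and introduced only for illustration.

```python
import math
from typing import Sequence


def weighted_sequence_logp(
    token_logps: Sequence[float],
    corrected_mask: Sequence[bool],
    gamma: float = 2.0,  # hypothetical weight for corrected tokens, not from the source
) -> float:
    """Weighted average of token log-probs; corrected tokens count gamma times."""
    if len(token_logps) != len(corrected_mask):
        raise ValueError("token_logps and corrected_mask must have equal length")
    if not token_logps:
        raise ValueError("empty sequence")
    weights = [gamma if is_corrected else 1.0 for is_corrected in corrected_mask]
    weighted_sum = sum(w * lp for w, lp in zip(weights, token_logps))
    return weighted_sum / sum(weights)


def dpo_loss(
    policy_chosen: float,
    policy_rejected: float,
    ref_chosen: float,
    ref_rejected: float,
    beta: float = 0.1,
) -> float:
    """Standard DPO loss, -log(sigmoid(margin)), for one preference pair."""
    margin = beta * ((policy_chosen - ref_chosen) - (policy_rejected - ref_rejected))
    # -log(sigmoid(m)) equals softplus(-m); computed in a numerically stable form.
    x = -margin
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


if __name__ == "__main__":
    # Toy pair: the corrected response differs from the original in its last token.
    mask = [False, False, True]
    chosen = weighted_sequence_logp([-0.5, -0.4, -0.3], mask)
    rejected = weighted_sequence_logp([-0.5, -0.4, -2.0], mask)
    # A reference model that scores both responses equally (assumed for the toy case).
    print(dpo_loss(chosen, rejected, ref_chosen=-1.0, ref_rejected=-1.0))
```

In this reading, each preference pair consists of the model's original response (rejected) and its human-corrected version (chosen). Because the two differ only inside the corrected segments, weighting those tokens more heavily concentrates the learning signal where the hallucination was fixed.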

Abstract (original)

Multimodal Large Language Models (MLLMs) have recently demonstrated impressive capabilities in multimodal understanding, reasoning, and interaction. However, existing MLLMs prevalently suffer from serious hallucination problems, generating text that is not factually grounded in associated images. The problem makes existing MLLMs untrustworthy and thus impractical in real-world (especially high-stakes) applications. To address the challenge, we present RLHF-V, which enhances MLLM trustworthiness via behavior alignment from fine-grained correctional human feedback. Specifically, RLHF-V collects human preference in the form of segment-level corrections on hallucinations, and performs dense direct preference optimization over the human feedback. Comprehensive experiments on five benchmarks in both automatic and human evaluation show that, RLHF-V can enable substantially more trustworthy MLLM behaviors with promising data and computation efficiency. Remarkably, using 1.4k annotated data samples, RLHF-V significantly reduces the hallucination rate of the base MLLM by 34.8%, outperforming the concurrent LLaVA-RLHF trained on 10k annotated data. The final model achieves state-of-the-art performance in trustworthiness among open-source MLLMs, and shows better robustness than GPT-4V in preventing hallucinations aroused from over-generalization. We open-source our code, model, and data at https://github.com/RLHF-V/RLHF-V.

arXiv information

Authors	Tianyu Yu, Yuan Yao, Haoye Zhang, Taiwen He, Yifeng Han, Ganqu Cui, Jinyi Hu, Zhiyuan Liu, Hai-Tao Zheng, Maosong Sun, Tat-Seng Chua
Published	2024-03-08 06:42:37+00:00
arXiv site	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google
