MolCA: Molecular Graph-Language Modeling with Cross-Modal Projector and Uni-Modal Adapter

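Summary

Language models (LMs) have demonstrated impressive molecular understanding on a variety of 1D text-based tasks.
However, they inherently lack 2D graph perception, an ability that human experts rely on to understand the topological structure of molecules.
To bridge this gap, we propose MolCA: Molecular Graph-Language Modeling with Cross-Modal Projector and Uni-Modal Adapter.
With MolCA, an LM (e.g., Galactica) can understand both text-based and graph-based molecular content through the cross-modal projector.
Specifically, the cross-modal projector is implemented as a Q-Former that connects the graph encoder's representation space to the LM's text space.
In addition, MolCA employs a uni-modal adapter (i.e., LoRA) to adapt the LM efficiently to downstream tasks.
Unlike previous work that couples an LM with a graph encoder through cross-modal contrastive learning, MolCA retains the LM's open-ended text generation ability and augments it with 2D graph information.
To demonstrate its effectiveness, we benchmark MolCA extensively on molecule captioning, IUPAC name prediction, and molecule-text retrieval, where it significantly outperforms the baselines.
The code and checkpoints are available at https://github.com/acharkq/MolCA.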

Abstract (original)

Language Models (LMs) have demonstrated impressive molecule understanding ability on various 1D text-related tasks. However, they inherently lack 2D graph perception – a critical ability of human professionals in comprehending molecules’ topological structures. To bridge this gap, we propose MolCA: Molecular Graph-Language Modeling with Cross-Modal Projector and Uni-Modal Adapter. MolCA enables an LM (e.g., Galactica) to understand both text- and graph-based molecular contents via the cross-modal projector. Specifically, the cross-modal projector is implemented as a Q-Former to connect a graph encoder’s representation space and an LM’s text space. Further, MolCA employs a uni-modal adapter (i.e., LoRA) for the LM’s efficient adaptation to downstream tasks. Unlike previous studies that couple an LM with a graph encoder via cross-modal contrastive learning, MolCA retains the LM’s ability of open-ended text generation and augments it with 2D graph information. To showcase its effectiveness, we extensively benchmark MolCA on tasks of molecule captioning, IUPAC name prediction, and molecule-text retrieval, on which MolCA significantly outperforms the baselines. Our codes and checkpoints can be found at https://github.com/acharkq/MolCA.
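The abstract names LoRA as the uni-modal adapter but does not describe its mechanics. As a rough illustration of the general LoRA idea only (not MolCA's actual code), the sketch below keeps a weight matrix frozen and adds a trainable low-rank update scaled by alpha / rank. The class name `LoRALinear`, the helper functions, and the sizes, rank, and alpha used here are all hypothetical choices made for this example.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def add(a, b):
    """Element-wise sum of two matrices of the same shape."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def scale(a, s):
    """Multiply every entry of a matrix by the scalar s."""
    return [[x * s for x in row] for row in a]


class LoRALinear:
    """Hypothetical illustration: frozen W plus a trainable low-rank update B @ A."""

    def __init__(self, weight, rank, alpha, seed=0):
        rng = random.Random(seed)
        d_out, d_in = len(weight), len(weight[0])
        self.weight = weight  # frozen weight, never updated during adaptation
        # A starts as small random values and B as zeros, so the update B @ A
        # is zero at initialization and the layer behaves like the frozen one.
        self.A = [[rng.gauss(0.0, 0.01) for _ in range(d_in)] for _ in range(rank)]
        self.B = [[0.0] * rank for _ in range(d_out)]
        self.scaling = alpha / rank

    def forward(self, x):
        """x is a d_in x n matrix whose columns are input vectors."""
        base = matmul(self.weight, x)
        update = matmul(self.B, matmul(self.A, x))
        return add(base, scale(update, self.scaling))

    def trainable_params(self):
        """Count only the entries of A and B; W stays frozen."""
        return len(self.A) * len(self.A[0]) + len(self.B) * len(self.B[0])


if __name__ == "__main__":
    rng = random.Random(1)
    W = [[rng.gauss(0.0, 1.0) for _ in range(8)] for _ in range(8)]
    layer = LoRALinear(W, rank=2, alpha=4)
    print("trainable:", layer.trainable_params(), "of", 8 * 8, "frozen entries")
```

With an 8 x 8 weight and rank 2, only the entries of `A` and `B` would be trained, which is the sense in which this kind of adapter is parameter-efficient. In practice the adapter is usually applied to selected projection matrices of a large LM rather than to a toy matrix like this one.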

arXiv information

Authors	Zhiyuan Liu, Sihang Li, Yanchen Luo, Hao Fei, Yixin Cao, Kenji Kawaguchi, Xiang Wang, Tat-Seng Chua
Published	2024-01-18 09:03:24+00:00

Source and services used

arxiv.jp, Google
