Distill the Image to Nowhere: Inversion Knowledge Distillation for Multimodal Machine Translation

Summary

Title: Distill the Image to Nowhere: Inversion Knowledge Distillation for Multimodal Machine Translation

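Summary: This paper addresses the requirement that images be present in multimodal machine translation (MMT). Conventional MMT needs aligned triples of [image, source text, target text], so an image has been mandatory. Because this requirement largely hinders the development of MMT, the paper proposes IKD-MMT, a new framework that uses knowledge distillation to generate multimodal features from the source text alone. Compared with conventional MMT, the method performs well even without images and achieves state-of-the-art results on the Multi30k benchmark.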

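Key points:

– The paper examines the need for images in multimodal machine translation.
– It proposes a new framework, IKD-MMT, which generates multimodal features from the source text.
– The method performs well even without images and achieves state-of-the-art results on the Multi30k benchmark.
– The source code and data are publicly available on GitHub.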

Abstract (original)

Past works on multimodal machine translation (MMT) elevate bilingual setup by incorporating additional aligned vision information. However, an image-must requirement of the multimodal dataset largely hinders MMT’s development, namely that it demands an aligned form of [image, source text, target text]. This limitation is generally troublesome during the inference phase especially when the aligned image is not provided as in the normal NMT setup. Thus, in this work, we introduce IKD-MMT, a novel MMT framework to support the image-free inference phase via an inversion knowledge distillation scheme. In particular, a multimodal feature generator is executed with a knowledge distillation module, which directly generates the multimodal feature from (only) source texts as the input. While there have been a few prior works entertaining the possibility to support image-free inference for machine translation, their performances have yet to rival the image-must translation. In our experiments, we identify our method as the first image-free approach to comprehensively rival or even surpass (almost) all image-must frameworks, and achieved the state-of-the-art result on the often-used Multi30k benchmark. Our code and data are available at: https://github.com/pengr/IKD-mmt/tree/master.
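The abstract describes a generator that produces a multimodal feature from the source text alone, trained by distillation so that images are not needed at inference. The toy sketch below illustrates only that general idea and is not the authors' implementation: the linear generator, the mean-squared-error distillation loss, the plain gradient-descent update, and all function names and vectors are assumptions made for illustration.

```python
import random


def generate_feature(weights, text_feature):
    """Hypothetical linear generator: text feature -> pseudo multimodal feature."""
    return [sum(w * x for w, x in zip(row, text_feature)) for row in weights]


def distillation_loss(student_feature, teacher_feature):
    """Mean squared error between the generated feature and the teacher feature.

    MSE is an assumed choice here; the abstract does not name the loss.
    """
    n = len(teacher_feature)
    return sum((s - t) ** 2 for s, t in zip(student_feature, teacher_feature)) / n


def sgd_step(weights, text_feature, teacher_feature, lr=0.1):
    """One gradient-descent step on the MSE loss for the linear generator."""
    student = generate_feature(weights, text_feature)
    n = len(teacher_feature)
    new_weights = []
    for row, s, t in zip(weights, student, teacher_feature):
        grad_out = 2.0 * (s - t) / n  # d(loss) / d(student output)
        new_weights.append([w - lr * grad_out * x for w, x in zip(row, text_feature)])
    return new_weights


def train(text_feature, teacher_feature, steps=200, lr=0.1, seed=0):
    """Fit the generator so its output matches a teacher (image-derived) feature."""
    rng = random.Random(seed)
    weights = [[rng.uniform(-0.1, 0.1) for _ in text_feature] for _ in teacher_feature]
    losses = []
    for _ in range(steps):
        student = generate_feature(weights, text_feature)
        losses.append(distillation_loss(student, teacher_feature))
        weights = sgd_step(weights, text_feature, teacher_feature, lr)
    return weights, losses


if __name__ == "__main__":
    text_feature = [1.0, 0.5, -0.5]   # made-up source-text encoding
    teacher_feature = [0.2, -0.3]     # made-up feature from an image-aware teacher
    weights, losses = train(text_feature, teacher_feature)
    # At inference time only the text feature is needed; no image is involved.
    print(generate_feature(weights, text_feature))
```

Once trained, the generator is called with the text feature only, which mirrors the image-free inference setting the abstract describes. A real system would use learned text encoders and far larger feature spaces.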

arXiv information

Authors	Ru Peng, Yawen Zeng, Junbo Zhao
Published	2023-04-21 09:40:09+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, OpenAI
