Long-context Language Models Are Not Good At ALL Retrieval Tasks Without Sufficient Steps




Long-context language models (LCLMs), characterized by their extensive context windows, are becoming popular. Although they are nearly perfect at standard long-context retrieval tasks, our evaluations demonstrate that they perform poorly on two basic cases, ‘multi-matching retrieval’ and ‘logic-based retrieval’, which are beyond LCLMs’ ability boundary. We find that both cases can be well addressed with a sufficient number of reasoning steps, guided by specific CoT prompts, indicating the necessity of combining long-context tasks with CoT methods for more advanced long-context handling. However, current CoT methods are too time-consuming when the context is very long, which means efficient long-context handling still has a long way to go.


Authors: Yijiong Yu, Ma Xiufa, Fang Jianwei, Zhi Xu, Su Guangyao, Wang Jiancheng, Yongfeng Huang, Zhixiao Qi, Wei Wang, Weifeng Liu, Ran Chen, Ji Pei
Published: 2025-02-06 11:56:00+00:00

Category: cs.CL