
arxiv:2505.17005

R1-Searcher++: Incentivizing the Dynamic Knowledge Acquisition of LLMs via Reinforcement Learning

Published on May 22
Submitted by Yingqian Min on May 28
Authors:
Huatong Song, Jinhao Jiang, Wenqing Tian, Zhipeng Chen, Yuhuan Wu, Jiahao Zhao, Yingqian Min, Wayne Xin Zhao, Lei Fang, Ji-Rong Wen
Abstract

R1-Searcher++, a novel framework, enhances LLMs by adaptively integrating internal and external knowledge through two-stage training, improving retrieval-augmented reasoning efficiency and performance.

AI-generated summary

Large Language Models (LLMs) are powerful but prone to hallucinations due to static knowledge. Retrieval-Augmented Generation (RAG) helps by injecting external information, but current methods are often costly, generalize poorly, or ignore the internal knowledge of the model. In this paper, we introduce R1-Searcher++, a novel framework designed to train LLMs to adaptively leverage both internal and external knowledge sources. R1-Searcher++ employs a two-stage training strategy: an initial SFT Cold-start phase for preliminary format learning, followed by RL for Dynamic Knowledge Acquisition. The RL stage uses outcome supervision to encourage exploration, incorporates a reward mechanism for internal knowledge utilization, and integrates a memorization mechanism to continuously assimilate retrieved information, thereby enriching the model's internal knowledge. By leveraging internal knowledge and an external search engine, the model continuously improves its capabilities, enabling efficient retrieval-augmented reasoning. Our experiments demonstrate that R1-Searcher++ outperforms previous RAG and reasoning methods and achieves efficient retrieval. The code is available at https://github.com/RUCAIBox/R1-Searcher-plus.
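The reward design described in the abstract can be made concrete with a small sketch. The snippet below is a minimal illustration and not the authors' implementation: it assumes a binary outcome reward for answer correctness plus a bonus, granted only on correct rollouts, that shrinks with the number of external search calls, which is one plausible reading of "a reward mechanism for internal knowledge utilization". The `bonus_scale` parameter is an illustrative assumption.

```python
# Minimal sketch of an outcome-supervised reward with an
# internal-knowledge bonus, loosely following the abstract's description.
# The exact reward shaping in R1-Searcher++ may differ; `bonus_scale`
# and its schedule here are illustrative assumptions.

def rollout_reward(answer_correct: bool, num_search_calls: int,
                   bonus_scale: float = 0.5) -> float:
    """Score one rollout: outcome supervision plus a bonus for
    answering correctly with fewer external retrievals."""
    outcome = 1.0 if answer_correct else 0.0
    # Reward internal-knowledge use only when the final answer is right,
    # so the model is never paid for skipping retrieval and guessing wrong.
    internal_bonus = bonus_scale / (1 + num_search_calls) if answer_correct else 0.0
    return outcome + internal_bonus


if __name__ == "__main__":
    # A correct answer with no retrieval outranks one that needed two searches.
    print(rollout_reward(True, 0))   # 1.5
    print(rollout_reward(True, 2))   # ~1.17
    print(rollout_reward(False, 0))  # 0.0
```

Conditioning the bonus on correctness is the key design point in this reading: the model is encouraged to rely on its parameters, but never at the expense of the outcome signal.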

Community

Paper author Paper submitter

A novel framework that enables LLMs to adaptively leverage both internal knowledge (pre-trained model knowledge) and external knowledge (retrieved information); a sketch of this behavior at inference time follows.
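As a rough sketch of what "adaptively leveraging both knowledge sources" can look like at inference time: the model keeps generating until it either produces a final answer from its own parameters or emits a search request, in which case retrieved documents are appended to the context and generation resumes. The tag names (`<search>`, `<answer>`, `<documents>`) and the `generate`/`retrieve` helpers below are hypothetical placeholders, not the paper's exact protocol.

```python
import re

def generate(prompt: str) -> str:
    """Placeholder for an LLM call; returns a canned answer here."""
    return "<answer>answered from internal knowledge</answer>"

def retrieve(query: str) -> str:
    """Placeholder for a search-engine call."""
    return f"documents about: {query}"

def answer_with_adaptive_retrieval(question: str, max_searches: int = 3) -> str:
    """Answer from internal knowledge when possible, searching externally
    only when the model emits a <search> tag."""
    context = question
    for _ in range(max_searches):
        output = generate(context)
        query = re.search(r"<search>(.*?)</search>", output, re.DOTALL)
        if query is None:
            # No retrieval requested: the model answered from memory.
            answer = re.search(r"<answer>(.*?)</answer>", output, re.DOTALL)
            return answer.group(1).strip() if answer else output.strip()
        # Append the retrieved evidence and let the model keep reasoning.
        docs = retrieve(query.group(1).strip())
        context += output + f"\n<documents>{docs}</documents>\n"
    return generate(context).strip()

if __name__ == "__main__":
    print(answer_with_adaptive_retrieval("Who proposed R1-Searcher++?"))
```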

This is an automated message from the Librarian Bot. I found the following papers similar to this paper.

The following papers were recommended by the Semantic Scholar API

Please give a thumbs up to this comment if you found it helpful!

If you want recommendations for any paper on Hugging Face, check out this Space

You can directly ask Librarian Bot for paper recommendations by tagging it in a comment: @librarian-bot recommend


Models citing this paper 0

No model linking this paper

Cite arxiv.org/abs/2505.17005 in a model README.md to link it from this page.
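For instance, a single mention of the arXiv ID in a model card's README.md is enough to create the link; the surrounding wording below is illustrative. The same applies to the dataset and Space READMEs noted in the sections that follow.

```markdown
This model was trained with R1-Searcher++
([arXiv:2505.17005](https://arxiv.org/abs/2505.17005)).
```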

Datasets citing this paper 0

No dataset linking this paper

Cite arxiv.org/abs/2505.17005 in a dataset README.md to link it from this page.

Spaces citing this paper 0

No Space linking this paper

Cite arxiv.org/abs/2505.17005 in a Space README.md to link it from this page.

Collections including this paper 0

No Collection including this paper

Add this paper to a collection to link it from this page.