Papers
arxiv:2505.22648

WebDancer: Towards Autonomous Information Seeking Agency

Published on May 28, 2025
· Submitted by Jialong Wu on May 29, 2025
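Authors: Jialong Wu, Baixuan Li, Runnan Fang, Wenbiao Yin, Liwen Zhang, Zhengwei Tao, Dingchu Zhang, Zekun Xi, Yong Jiang, Pengjun Xie, Fei Huang, Jingren Zhou
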
Abstract

The paper proposes a framework for building end-to-end information-seeking agents through a combination of data construction, trajectory sampling, supervised fine-tuning, and reinforcement learning, and demonstrates its effectiveness on information-seeking benchmarks.

AI-generated summary

Addressing intricate real-world problems requires in-depth information seeking and multi-step reasoning. Recent progress in agentic systems, exemplified by Deep Research, underscores the potential for autonomous multi-step research. In this work, we present a cohesive paradigm for building end-to-end information-seeking agents from a data-centric and training-stage perspective. Our approach consists of four key stages: (1) browsing data construction, (2) trajectory sampling, (3) supervised fine-tuning for an effective cold start, and (4) reinforcement learning for enhanced generalisation. We instantiate this framework in WebDancer, a web agent based on ReAct. Empirical evaluations on the challenging information-seeking benchmarks GAIA and WebWalkerQA demonstrate the strong performance of WebDancer and highlight the efficacy of our training paradigm. Further analysis of agent training provides valuable insights and actionable, systematic pathways for developing more capable agentic models. The code and demo will be released at https://github.com/Alibaba-NLP/WebAgent.
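
For readers unfamiliar with the ReAct pattern the abstract refers to, the following is a minimal sketch of a generic thought/action/observation loop. It is not WebDancer's implementation: the names `react_loop`, `toy_policy`, and `toy_search`, the `Trajectory` record, and the tiny in-memory corpus are all hypothetical stand-ins, and a real agent would replace the policy with a language model and the tool with actual web search or browsing.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class Trajectory:
    """Hypothetical record of one episode: the question, the steps taken, the answer."""
    question: str
    # Each step is (thought, action, argument, observation).
    steps: List[Tuple[str, str, str, str]] = field(default_factory=list)
    answer: str = ""


# A policy maps the trajectory so far to (thought, action, argument).
Policy = Callable[[Trajectory], Tuple[str, str, str]]


def react_loop(question: str, policy: Policy,
               tools: Dict[str, Callable[[str], str]],
               max_steps: int = 5) -> Trajectory:
    """Alternate reasoning and tool use until the policy answers or steps run out."""
    traj = Trajectory(question)
    for _ in range(max_steps):
        thought, action, arg = policy(traj)
        if action == "answer":
            traj.answer = arg
            break
        tool = tools.get(action)
        observation = tool(arg) if tool else f"unknown action: {action}"
        traj.steps.append((thought, action, arg, observation))
    return traj


def toy_search(query: str) -> str:
    """Stand-in for a search tool, backed by a one-entry dictionary."""
    corpus = {"capital of france": "Paris is the capital of France."}
    return corpus.get(query.lower(), "no results")


def toy_policy(traj: Trajectory) -> Tuple[str, str, str]:
    """Stand-in for a model: search once, then answer with the observation."""
    if not traj.steps:
        return ("I should look this up.", "search", traj.question)
    return ("The observation contains the answer.", "answer", traj.steps[-1][3])


result = react_loop("capital of France", toy_policy, {"search": toy_search})
print(result.answer)
```

Recording every (thought, action, argument, observation) tuple is a design choice of this sketch; it simply makes it easy to inspect what the loop did at each step.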

Community

Baixuan Li (paper author): 🕺Excellent work!

Lawrence Lai: Nice work with an insightful idea

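This is an automated message from the Librarian Bot. I found the following papers similar to this paper.

The following papers were recommended by the Semantic Scholar API:

* Scent of Knowledge: Optimizing Search-Enhanced Reasoning with Information Foraging (2025), https://huggingface.co/papers/2505.09316
* SimpleDeepSearcher: Deep Information Seeking via Web-Powered Reasoning Trajectory Synthesis (2025), https://huggingface.co/papers/2505.16834
* WebThinker: Empowering Large Reasoning Models with Deep Research Capability (2025), https://huggingface.co/papers/2504.21776
* DeepResearcher: Scaling Deep Research via Reinforcement Learning in Real-world Environments (2025), https://huggingface.co/papers/2504.03160
* R1-Searcher++: Incentivizing the Dynamic Knowledge Acquisition of LLMs via Reinforcement Learning (2025), https://huggingface.co/papers/2505.17005
* StepSearch: Igniting LLMs Search Ability via Step-Wise Proximal Policy Optimization (2025), https://huggingface.co/papers/2505.15107
* WebCoT: Enhancing Web Agent Reasoning by Reconstructing Chain-of-Thought in Reflection, Branching, and Rollback (2025), https://huggingface.co/papers/2505.20013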


Models citing this paper 1

Datasets citing this paper 0

No dataset linking this paper

Cite arxiv.org/abs/2505.22648 in a dataset README.md to link it from this page.

Spaces citing this paper 1

Collections including this paper 8
