
arxiv:2402.11550

LongAgent: Scaling Language Models to 128k Context through Multi-Agent Collaboration

Published on Feb 18, 2024
· Submitted by AK on Feb 20, 2024
Authors:
Jun Zhao, Can Zu, Hao Xu, Yi Lu, Wei He, Yiwen Ding, Tao Gui, Qi Zhang, Xuanjing Huang
Abstract

AI-generated summary

LongAgent, a multi-agent collaboration method, scales large language models to handle up to 128K tokens with improved accuracy and performance in long-text processing tasks compared to GPT-4.

Large language models (LLMs) have demonstrated impressive performance in understanding language and executing complex reasoning tasks. However, LLMs with long context windows have been notorious for their expensive training costs and high inference latency. Even the most advanced models such as GPT-4 and Claude 2 often make mistakes when processing inputs of over 100k tokens, a phenomenon also known as "lost in the middle". In this paper, we propose LongAgent, a method based on multi-agent collaboration, which scales LLMs (e.g., LLaMA) to a context of 128K and demonstrates potential superiority in long-text processing compared to GPT-4. In LongAgent, a leader is responsible for understanding user intent and directing team members to acquire information from documents. Due to members' hallucinations, it is non-trivial for a leader to obtain accurate information from the responses of dozens to hundreds of members. To address this, we develop an inter-member communication mechanism to resolve response conflicts caused by hallucinations through information sharing. Our experimental results indicate that LongAgent offers a promising alternative for long-text processing. The agent team instantiated with LLaMA-7B achieves significant improvements over GPT-4 in tasks such as 128k-long text retrieval and multi-hop question answering.
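The abstract describes the leader-member workflow only at a high level. Below is a minimal Python sketch of that pattern, assuming a generic chat_model call, naive fixed-size chunking, and majority voting for conflict resolution; all names and prompts are illustrative assumptions, not the authors' implementation or API.

# Hypothetical sketch of the LongAgent pattern (not the authors' code):
# a leader splits a long document across member agents, each member
# answers from its own chunk, and conflicting answers are resolved by
# letting members re-answer after seeing the shared evidence.

from collections import Counter

CHUNK_SIZE = 4096  # characters per member; an assumed value


def chat_model(prompt: str) -> str:
    """Stub for a short-context LLM call (e.g., LLaMA-7B); plug in a real client."""
    raise NotImplementedError


def split_into_chunks(document: str, size: int = CHUNK_SIZE) -> list[str]:
    # Naive fixed-size chunking; the paper does not specify this detail.
    return [document[i:i + size] for i in range(0, len(document), size)]


def member_answer(chunk: str, question: str) -> str:
    return chat_model(
        "Answer using ONLY this excerpt; reply 'not found' if the answer is absent.\n"
        f"Excerpt:\n{chunk}\n\nQuestion: {question}"
    )


def resolve_conflict(question: str, evidence: list[str], answers: list[str]) -> str:
    # Inter-member communication: conflicting members see each other's
    # evidence, re-answer, and a majority vote settles the result.
    shared = "\n---\n".join(evidence)
    revised = [
        chat_model(
            f"Members gave conflicting answers {answers} to '{question}'. "
            f"Re-answer using the shared evidence:\n{shared}"
        )
        for _ in answers
    ]
    return Counter(revised).most_common(1)[0][0]


def leader_ask(document: str, question: str) -> str:
    # The leader fans the question out to one member per chunk, then
    # aggregates, resolving disagreements that hallucinations may cause.
    chunks = split_into_chunks(document)
    answers = [member_answer(c, question) for c in chunks]
    found = [(c, a) for c, a in zip(chunks, answers) if a != "not found"]
    if not found:
        return "not found"
    distinct = {a for _, a in found}
    if len(distinct) == 1:
        return distinct.pop()  # members agree
    return resolve_conflict(question, [c for c, _ in found], [a for _, a in found])

The sketch captures the two core ideas from the abstract: each member answers only from its own chunk, and disagreeing members re-answer after seeing each other's evidence. The paper's actual leader performs multi-turn planning and intent understanding rather than this single map-and-vote pass.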

Community

This is an automated message from the Librarian Bot. I found the following papers similar to this paper.

The following papers were recommended by the Semantic Scholar API.

Please give a thumbs up to this comment if you found it helpful!

If you want recommendations for any paper on Hugging Face, check out this Space.

You can directly ask Librarian Bot for paper recommendations by tagging it in a comment: @librarian-bot recommend


Models citing this paper 0

No model linking this paper

Cite arxiv.org/abs/2402.11550 in a model README.md to link it from this page.

Datasets citing this paper 0

No dataset linking this paper

Cite arxiv.org/abs/2402.11550 in a dataset README.md to link it from this page.

Spaces citing this paper 0

No Space linking this paper

Cite arxiv.org/abs/2402.11550 in a Space README.md to link it from this page.

Collections including this paper 11
