
arxiv:2502.06772

ReasonFlux: Hierarchical LLM Reasoning via Scaling Thought Templates

Published on Feb 10, 2025 · Submitted by Ling Yang on Feb 11, 2025


Authors: Ling Yang, Zhaochen Yu, Bin Cui, Mengdi Wang

Abstract

AI-generated summary: Hierarchical reasoning with LLMs using scaled thought templates improves mathematical reasoning and outperforms existing models on benchmarks like MATH and AIME.

We show that hierarchical LLM reasoning via scaling thought templates can effectively optimize the reasoning search space and outperform the mathematical reasoning capabilities of powerful LLMs like OpenAI o1-preview and DeepSeek V3. We train our ReasonFlux-32B model with only 8 GPUs and introduce three innovations: (i) a structured and generic thought template library, containing around 500 high-level thought templates capable of generalizing to similar or relevant reasoning problems; (ii) hierarchical reinforcement learning on a sequence of thought templates instead of long CoTs, optimizing a base LLM to plan out an optimal template trajectory for gradually handling complex problems; (iii) a new inference scaling system that enables hierarchical LLM reasoning by adaptively scaling thought templates at inference time. With a template trajectory containing sequential thought templates, our ReasonFlux-32B advances math reasoning capabilities to state-of-the-art levels. Notably, on the MATH benchmark, it achieves an accuracy of 91.2% and surpasses o1-preview by 6.7%. On the USA Math Olympiad (AIME) benchmark, ReasonFlux-32B solves an average of 56.7% of problems, surpassing o1-preview and DeepSeek-V3 by 27% and 45%, respectively. Code: https://github.com/Gen-Verse/ReasonFlux
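To make the idea of a template library and a template trajectory concrete, here is a minimal toy sketch. It is not the authors' implementation: the `ThoughtTemplate` schema, the three example templates, and the tag-overlap retrieval are all hypothetical stand-ins. In the paper the trajectory is planned by an RL-trained LLM; here it is written by hand as a list of tag sets.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ThoughtTemplate:
    """One reusable high-level solution pattern (hypothetical schema)."""
    name: str
    tags: frozenset[str]
    steps: tuple[str, ...]


# Toy library; the abstract describes a library of around 500 templates.
LIBRARY = [
    ThoughtTemplate("substitution", frozenset({"algebra", "radical"}),
                    ("introduce a new variable", "rewrite the equation",
                     "solve and back-substitute")),
    ThoughtTemplate("vieta", frozenset({"algebra", "polynomial", "roots"}),
                    ("write sums and products of roots",
                     "express the target in them")),
    ThoughtTemplate("casework", frozenset({"combinatorics", "counting"}),
                    ("split into disjoint cases", "count each case",
                     "add the counts")),
]


def retrieve(query_tags: set[str], library=LIBRARY) -> ThoughtTemplate:
    """Pick the template with the highest Jaccard overlap with the query tags.

    Assumption: tag overlap stands in for whatever retrieval the real system uses.
    """
    def jaccard(template: ThoughtTemplate) -> float:
        union = template.tags | query_tags
        return len(template.tags & query_tags) / len(union) if union else 0.0

    return max(library, key=jaccard)


def instantiate(trajectory: list[set[str]]) -> list[str]:
    """Expand a planned trajectory (one tag set per stage) into concrete steps."""
    plan = []
    for stage, tags in enumerate(trajectory, start=1):
        template = retrieve(tags)
        plan.extend(f"[{stage}:{template.name}] {step}" for step in template.steps)
    return plan


# Hand-written trajectory standing in for the planner's output.
trajectory = [{"algebra", "radical"}, {"polynomial", "roots"}]
plan = instantiate(trajectory)
print("\n".join(plan))
```

The point of the sketch is the shape of the data: the planner works over a short sequence of template choices instead of a long chain of thought, and each choice expands into concrete steps only when it is instantiated.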