lynx   »   [go: up one dir, main page]

@guoday\n\t \n\n@ZhihongShao\n\t \n\n@HuajianXin\n\t \n\n@zqh11\n\t \n\n@Benjamin-eecs\n\t

\n","updatedAt":"2024-07-23T12:21:50.108Z","author":{"_id":"63a369d98c0c89dcae3b8329","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/63a369d98c0c89dcae3b8329/AiH2zjy1cnt9OADAAZMLD.jpeg","fullname":"Adina Yakefu","name":"AdinaY","type":"user","isPro":true,"isHf":true,"isHfAdmin":false,"isMod":false,"followerCount":1269}},"numEdits":0,"identifiedLanguage":{"language":"en","probability":0.8123368620872498},"editors":["AdinaY"],"editorAvatarUrls":["https://cdn-avatars.huggingface.co/v1/production/uploads/63a369d98c0c89dcae3b8329/AiH2zjy1cnt9OADAAZMLD.jpeg"],"reactions":[{"reaction":"👀","users":["sparsh35"],"count":1}],"isReport":false,"parentCommentId":"66509ba0a2f478299d767cb3"}}]},{"id":"6650b14ad6773125d08d2f22","author":{"_id":"6486638da4cf2081f20c40ec","avatarUrl":"/avatars/0bc16a7447cd71ac18828a678313bd83.svg","fullname":"Mike Young","name":"mikelabs","type":"user","isPro":false,"isHf":false,"isHfAdmin":false,"isMod":false,"followerCount":12},"createdAt":"2024-05-24T15:24:58.000Z","type":"comment","data":{"edited":false,"hidden":false,"latest":{"raw":"Plain english rewrite of the paper - feedback welcome: [https://www.aimodels.fyi/papers/arxiv/deepseek-prover-advancing-theorem-proving-llms-through](https://www.aimodels.fyi/papers/arxiv/deepseek-prover-advancing-theorem-proving-llms-through)","html":"

Plain english rewrite of the paper - feedback welcome: https://www.aimodels.fyi/papers/arxiv/deepseek-prover-advancing-theorem-proving-llms-through

\n","updatedAt":"2024-05-24T15:24:58.215Z","author":{"_id":"6486638da4cf2081f20c40ec","avatarUrl":"/avatars/0bc16a7447cd71ac18828a678313bd83.svg","fullname":"Mike Young","name":"mikelabs","type":"user","isPro":false,"isHf":false,"isHfAdmin":false,"isMod":false,"followerCount":12}},"numEdits":0,"identifiedLanguage":{"language":"en","probability":0.5317160487174988},"editors":["mikelabs"],"editorAvatarUrls":["/avatars/0bc16a7447cd71ac18828a678313bd83.svg"],"reactions":[],"isReport":false}},{"id":"6664c50ee2b2dd57586c047c","author":{"_id":"6186ddf6a7717cb375090c01","avatarUrl":"/avatars/716b6a7d1094c8036b2a8a7b9063e8aa.svg","fullname":"Julien BLANCHON","name":"blanchon","type":"user","isPro":true,"isHf":false,"isHfAdmin":false,"isMod":false,"followerCount":142},"createdAt":"2024-06-08T20:54:38.000Z","type":"comment","data":{"edited":false,"hidden":false,"latest":{"raw":"# Advancing Theorem Proving with AI and Synthetic Data\n\nhttps://cdn-uploads.huggingface.co/production/uploads/6186ddf6a7717cb375090c01/tCR7JSzpuqYnXMTzrOHjF.mp4 \n\n## Links 🔗:\n👉 Subscribe: https://www.youtube.com/@Arxflix\n👉 Twitter: https://x.com/arxflix\n👉 LMNT (Partner): https://lmnt.com/\n\n\nBy Arxflix\n![9t4iCUHx_400x400-1.jpg](https://cdn-uploads.huggingface.co/production/uploads/6186ddf6a7717cb375090c01/v4S5zBurs0ouGNwYj1GEd.jpeg)","html":"

Advancing Theorem Proving with AI and Synthetic Data

\n

\n\n

Links 🔗:

\n

👉 Subscribe: https://www.youtube.com/@Arxflix
👉 Twitter: https://x.com/arxflix
👉 LMNT (Partner): https://lmnt.com/

\n

By Arxflix
\"9t4iCUHx_400x400-1.jpg\"

\n","updatedAt":"2024-06-08T20:54:38.471Z","author":{"_id":"6186ddf6a7717cb375090c01","avatarUrl":"/avatars/716b6a7d1094c8036b2a8a7b9063e8aa.svg","fullname":"Julien BLANCHON","name":"blanchon","type":"user","isPro":true,"isHf":false,"isHfAdmin":false,"isMod":false,"followerCount":142}},"numEdits":0,"identifiedLanguage":{"language":"en","probability":0.5820630788803101},"editors":["blanchon"],"editorAvatarUrls":["/avatars/716b6a7d1094c8036b2a8a7b9063e8aa.svg"],"reactions":[],"isReport":false}},{"id":"66bb184812a7d3bfeeb5bc01","author":{"_id":"5f729da2f1e7ef6e919a1c27","avatarUrl":"/avatars/bdf48f0d3290fab54d7e9636a3d14158.svg","fullname":"Zhanming (Allan) Jie","name":"allanjie","type":"user","isPro":false,"isHf":false,"isHfAdmin":false,"isMod":false,"followerCount":6},"createdAt":"2024-08-13T08:24:40.000Z","type":"comment","data":{"edited":false,"hidden":false,"latest":{"raw":"waiting for the model!","html":"

waiting for the model!

\n","updatedAt":"2024-08-13T08:24:40.454Z","author":{"_id":"5f729da2f1e7ef6e919a1c27","avatarUrl":"/avatars/bdf48f0d3290fab54d7e9636a3d14158.svg","fullname":"Zhanming (Allan) Jie","name":"allanjie","type":"user","isPro":false,"isHf":false,"isHfAdmin":false,"isMod":false,"followerCount":6}},"numEdits":0,"identifiedLanguage":{"language":"en","probability":0.58359694480896},"editors":["allanjie"],"editorAvatarUrls":["/avatars/bdf48f0d3290fab54d7e9636a3d14158.svg"],"reactions":[],"isReport":false}}],"primaryEmailConfirmed":false,"paper":{"id":"2405.14333","authors":[{"_id":"66501a035a78417f01f909db","user":{"_id":"6532a060a78e70d19c669103","avatarUrl":"/avatars/3cc9309b0e31da0fb83f1c3ef87dbe9f.svg","isPro":false,"fullname":"HuajianXin","user":"HuajianXin","type":"user"},"name":"Huajian Xin","status":"admin_assigned","statusLastChangedAt":"2024-05-24T10:51:50.072Z","hidden":false},{"_id":"66501a035a78417f01f909dc","user":{"_id":"653df20eaa1f487614da4db1","avatarUrl":"/avatars/12b27ce2c59f53b7e464039deab36a5d.svg","isPro":false,"fullname":"Daya Guo","user":"guoday","type":"user"},"name":"Daya Guo","status":"admin_assigned","statusLastChangedAt":"2024-05-24T10:51:56.647Z","hidden":false},{"_id":"66501a035a78417f01f909dd","user":{"_id":"65db64f8b62d242ed8711701","avatarUrl":"/avatars/753e9f980eb6786c6b53b2f1becbf745.svg","isPro":false,"fullname":"Zhihong Shao","user":"ZhihongShao","type":"user"},"name":"Zhihong Shao","status":"admin_assigned","statusLastChangedAt":"2024-05-24T10:52:09.659Z","hidden":false},{"_id":"66501a035a78417f01f909de","user":{"_id":"62a8008aac97233f161e3ffe","avatarUrl":"/avatars/c5e759275c82823301780fb0504ef5d9.svg","isPro":false,"fullname":"Zhizhou Ren","user":"Stilwell","type":"user"},"name":"Zhizhou Ren","status":"admin_assigned","statusLastChangedAt":"2024-05-24T10:52:16.474Z","hidden":false},{"_id":"66501a035a78417f01f909df","user":{"_id":"63cd76b4374057a338e8e703","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/63cd76b4374057a338e8e703/i4Qk5-0aYx3oRhC8b50aJ.jpeg","isPro":false,"fullname":"zhuqihao","user":"zqh11","type":"user"},"name":"Qihao Zhu","status":"admin_assigned","statusLastChangedAt":"2024-05-24T10:52:27.016Z","hidden":false},{"_id":"66501a035a78417f01f909e0","user":{"_id":"635e3a76106f984574c36409","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/1667120725800-635e3a76106f984574c36409.png","isPro":false,"fullname":"Bo Liu","user":"Benjamin-eecs","type":"user"},"name":"Bo Liu","status":"claimed_verified","statusLastChangedAt":"2024-05-27T07:19:50.799Z","hidden":false},{"_id":"66501a035a78417f01f909e1","user":{"_id":"6398203609f12714ed1935c2","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/6398203609f12714ed1935c2/uXgl0LgKnFYjq1Wz39-a6.jpeg","isPro":false,"fullname":"Chong Ruan","user":"Chester111","type":"user"},"name":"Chong Ruan","status":"claimed_verified","statusLastChangedAt":"2024-08-17T05:38:40.409Z","hidden":false},{"_id":"66501a035a78417f01f909e2","name":"Wenda Li","hidden":false},{"_id":"66501a035a78417f01f909e3","name":"Xiaodan Liang","hidden":false}],"publishedAt":"2024-05-23T09:03:42.000Z","submittedOnDailyAt":"2024-05-24T03:09:32.191Z","title":"DeepSeek-Prover: Advancing Theorem Proving in LLMs through Large-Scale\n Synthetic Data","submittedOnDailyBy":{"_id":"60f1abe7544c2adfd699860c","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/1674929746905-60f1abe7544c2adfd699860c.jpeg","isPro":false,"fullname":"AK","user":"akhaliq","type":"user"},"summary":"Proof assistants like Lean have revolutionized mathematical proof\nverification, ensuring high accuracy and reliability. Although large language\nmodels (LLMs) show promise in mathematical reasoning, their advancement in\nformal theorem proving is hindered by a lack of training data. To address this\nissue, we introduce an approach to generate extensive Lean 4 proof data derived\nfrom high-school and undergraduate-level mathematical competition problems.\nThis approach involves translating natural language problems into formal\nstatements, filtering out low-quality statements, and generating proofs to\ncreate synthetic data. After fine-tuning the DeepSeekMath 7B model on this\nsynthetic dataset, which comprises 8 million formal statements with proofs, our\nmodel achieved whole-proof generation accuracies of 46.3% with 64 samples and\n52% cumulatively on the Lean 4 miniF2F test, surpassing the baseline GPT-4 at\n23.0% with 64 samples and a tree search reinforcement learning method at 41.0%.\nAdditionally, our model successfully proved 5 out of 148 problems in the Lean 4\nFormalized International Mathematical Olympiad (FIMO) benchmark, while GPT-4\nfailed to prove any. These results demonstrate the potential of leveraging\nlarge-scale synthetic data to enhance theorem-proving capabilities in LLMs.\nBoth the synthetic dataset and the model will be made available to facilitate\nfurther research in this promising field.","upvotes":41,"discussionId":"66501a045a78417f01f90a10","ai_summary":"Generating extensive Lean 4 proof data from mathematical competition problems improved DeepSeekMath 7B's theorem-proving capabilities over GPT-4 and other methods.","ai_keywords":["Lean 4","DeepSeekMath 7B","formal theorem proving","natural language problems","formal statements","synthetic data","whole-proof generation","Lean 4 miniF2F test","FIMO benchmark"]},"canReadDatabase":false,"canManagePapers":false,"canSubmit":false,"hasHfLevelAccess":false,"upvoted":false,"upvoters":[{"_id":"655ac762cb17ec19ef82719b","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/655ac762cb17ec19ef82719b/1kDncYrGLYS_2SR8cNdAL.png","isPro":false,"fullname":"Welcome to matlok","user":"matlok","type":"user"},{"_id":"620783f24e28382272337ba4","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/620783f24e28382272337ba4/zkUveQPNiDfYjgGhuFErj.jpeg","isPro":false,"fullname":"GuoLiangTang","user":"Tommy930","type":"user"},{"_id":"612246596d9ce900691744d2","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/612246596d9ce900691744d2/9DlHVQDqblKz7QPTA6nDa.jpeg","isPro":false,"fullname":"Edoardo Federici","user":"efederici","type":"user"},{"_id":"633a00248f27255b6b54ea5f","avatarUrl":"/avatars/8ad54c2d8a42093923cbdd6f15e0d7a7.svg","isPro":false,"fullname":"dfuhoiysOHSVFh82934gfjklb","user":"huba-buba","type":"user"},{"_id":"659180299e16fa7510840ac4","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/659180299e16fa7510840ac4/I66LcNr3i35ehuzw1vR2Q.png","isPro":false,"fullname":"Ji-Ha","user":"Ji-Ha","type":"user"},{"_id":"63a369d98c0c89dcae3b8329","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/63a369d98c0c89dcae3b8329/AiH2zjy1cnt9OADAAZMLD.jpeg","isPro":true,"fullname":"Adina Yakefu","user":"AdinaY","type":"user"},{"_id":"635964636a61954080850e1d","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/635964636a61954080850e1d/0bfExuDTrHTtm8c-40cDM.png","isPro":false,"fullname":"William Lamkin","user":"phanes","type":"user"},{"_id":"6039478ab3ecf716b1a5fd4d","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/6039478ab3ecf716b1a5fd4d/_Thy4E7taiSYBLKxEKJbT.jpeg","isPro":true,"fullname":"taesiri","user":"taesiri","type":"user"},{"_id":"60c8d264224e250fb0178f77","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/60c8d264224e250fb0178f77/i8fbkBVcoFeJRmkQ9kYAE.png","isPro":true,"fullname":"Adam Lee","user":"Abecid","type":"user"},{"_id":"6538119803519fddb4a17e10","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/6538119803519fddb4a17e10/ffJMkdx-rM7VvLTCM6ri_.jpeg","isPro":false,"fullname":"samusenps","user":"samusenps","type":"user"},{"_id":"6167229d6b5a2f75073ec599","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/6167229d6b5a2f75073ec599/8o_8YndZpdjoFmpYpnU48.jpeg","isPro":false,"fullname":"Matt Mistele","user":"mmistele","type":"user"},{"_id":"5f43448a79c1ba4c353d0d8f","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/5f43448a79c1ba4c353d0d8f/DiSygV3dn7A_OjmGVTrHD.jpeg","isPro":true,"fullname":"Sugato Ray","user":"sugatoray","type":"user"}],"acceptLanguages":["*"],"dailyPaperRank":1}">
Papers
arxiv:2405.14333

DeepSeek-Prover: Advancing Theorem Proving in LLMs through Large-Scale Synthetic Data

Published on May 23, 2024
· Submitted by AK on May 24, 2024
#1 Paper of the day
Authors:
Bo Liu ,
,

Abstract

Generating extensive Lean 4 proof data from mathematical competition problems improved DeepSeekMath 7B's theorem-proving capabilities over GPT-4 and other methods.

AI-generated summary

Proof assistants like Lean have revolutionized mathematical proof verification, ensuring high accuracy and reliability. Although large language models (LLMs) show promise in mathematical reasoning, their advancement in formal theorem proving is hindered by a lack of training data. To address this issue, we introduce an approach to generate extensive Lean 4 proof data derived from high-school and undergraduate-level mathematical competition problems. This approach involves translating natural language problems into formal statements, filtering out low-quality statements, and generating proofs to create synthetic data. After fine-tuning the DeepSeekMath 7B model on this synthetic dataset, which comprises 8 million formal statements with proofs, our model achieved whole-proof generation accuracies of 46.3% with 64 samples and 52% cumulatively on the Lean 4 miniF2F test, surpassing the baseline GPT-4 at 23.0% with 64 samples and a tree search reinforcement learning method at 41.0%. Additionally, our model successfully proved 5 out of 148 problems in the Lean 4 Formalized International Mathematical Olympiad (FIMO) benchmark, while GPT-4 failed to prove any. These results demonstrate the potential of leveraging large-scale synthetic data to enhance theorem-proving capabilities in LLMs. Both the synthetic dataset and the model will be made available to facilitate further research in this promising field.

Community

Any idea when the synthetic dataset and model will be up on the hub? 👀

·

still waiting it is written in the paper that they will opensource it

Advancing Theorem Proving with AI and Synthetic Data

Links 🔗:

👉 Subscribe: https://www.youtube.com/@Arxflix
👉 Twitter: https://x.com/arxflix
👉 LMNT (Partner): https://lmnt.com/

By Arxflix
9t4iCUHx_400x400-1.jpg

waiting for the model!

Sign up or log in to comment

Models citing this paper 1

Datasets citing this paper 1

Spaces citing this paper 3

Collections including this paper 20

Лучший частный хостинг