TinySAM: Pushing the Envelope for Efficient Segment Anything Model
Abstract
TinySAM reduces computational cost while maintaining strong zero-shot segmentation performance through knowledge distillation, quantization, and hierarchical segmentation strategies.
Recently, the segment anything model (SAM) has demonstrated powerful segmentation capability and has drawn great attention in the computer vision field. Numerous follow-up works have developed various applications based on the pretrained SAM and achieved impressive performance on downstream vision tasks. However, SAM consists of heavy architectures and requires massive computational capacity, which hinders its application on computation-constrained edge devices. To this end, in this paper we propose a framework to obtain a tiny segment anything model (TinySAM) while maintaining strong zero-shot performance. We first propose a full-stage knowledge distillation method with an online hard prompt sampling strategy to distill a lightweight student model. We also adapt post-training quantization to the promptable segmentation task to further reduce the computational cost. Moreover, a hierarchical segmenting-everything strategy is proposed to accelerate everything-mode inference by 2× with almost no performance degradation. With all these proposed methods, our TinySAM achieves an order-of-magnitude computational reduction and pushes the envelope for the efficient segment anything task. Extensive experiments on various zero-shot transfer tasks demonstrate the significantly advantageous performance of TinySAM against counterpart methods. Pre-trained models and code will be available at https://github.com/xinghaochen/TinySAM and https://gitee.com/mindspore/models/tree/master/research/cv/TinySAM.
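The "online hard prompt sampling" idea in the abstract can be illustrated with a minimal sketch: given teacher and student mask predictions for a pool of candidate prompts, keep only the prompts on which the student disagrees most with the teacher, so distillation concentrates on hard cases. All function names, shapes, and the MSE loss choice below are illustrative assumptions, not the authors' actual implementation.

```python
import numpy as np

def distill_loss(teacher_logits, student_logits):
    """Per-prompt mean squared error between teacher and student mask logits.

    Shapes are assumed to be (num_prompts, H, W); returns one scalar per prompt.
    """
    return ((teacher_logits - student_logits) ** 2).mean(axis=(1, 2))

def hard_prompt_sample(teacher_logits, student_logits, keep_ratio=0.5):
    """Return indices of the hardest prompts (largest student-teacher gap)."""
    losses = distill_loss(teacher_logits, student_logits)
    k = max(1, int(len(losses) * keep_ratio))
    # argsort is ascending, so the last k indices are the largest losses.
    return np.argsort(losses)[-k:]

# Toy demonstration: 8 prompts over a 16x16 mask grid.
rng = np.random.default_rng(0)
teacher = rng.normal(size=(8, 16, 16))
student = teacher + rng.normal(scale=0.1, size=(8, 16, 16))
student[3] += 2.0  # make prompt 3 artificially hard
hard = hard_prompt_sample(teacher, student, keep_ratio=0.25)
print(hard)  # prompt 3 should appear among the hardest
```

In an actual distillation loop, a step like this would run at each iteration so the retained prompt set adapts online as the student improves.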
Community
This is an automated message from the Librarian Bot. I found the following papers similar to this paper.
The following papers were recommended by the Semantic Scholar API
- RepViT-SAM: Towards Real-Time Segmenting Anything (2023)
- EfficientSAM: Leveraged Masked Image Pretraining for Efficient Segment Anything (2023)
- EdgeSAM: Prompt-In-the-Loop Distillation for On-Device Deployment of SAM (2023)
- MobileSAMv2: Faster Segment Anything to Everything (2023)
- Stable Segment Anything Model (2023)
Please give a thumbs up to this comment if you found it helpful!
If you want recommendations for any paper on Hugging Face, check out this Space
Models citing this paper 1
Datasets citing this paper 0
No datasets link this paper