Author reply (Wanghan Xu, CoCoOne) to the Community question on low-resolution generation:

Thank you for your interest in our work. This paper focuses on generating high-resolution images, and our experiments primarily cover resolutions of 256 and above. Our specialized decoder is designed for this purpose; for lower resolutions (e.g., 128), the original VAE decoder is already a mature and effective solution, so our decoder isn't necessary. We very much appreciate your point that arbitrary-scale generation should include low-resolution images. Our model also supports low-resolution generation, because high-resolution outputs can always be downsampled to create excellent low-resolution versions.
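The downsampling route mentioned in the reply can be sketched in a few lines. This is only an illustration: the reply does not name a resampling method, so the box filter (block averaging) used here is an assumption, and `box_downsample` is a hypothetical helper rather than anything from the InfGen code.

```python
def box_downsample(image, factor):
    """Average non-overlapping factor x factor blocks of a 2D grayscale image.

    `image` is a list of equal-length rows of numbers. Box filtering is an
    assumed choice for illustration; the source does not specify a method.
    """
    height, width = len(image), len(image[0])
    if height % factor or width % factor:
        raise ValueError("image sides must be divisible by factor")
    out = []
    for top in range(0, height, factor):
        row = []
        for left in range(0, width, factor):
            # Collect every pixel in the current block and take the mean.
            block = [image[y][x]
                     for y in range(top, top + factor)
                     for x in range(left, left + factor)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


# A 4x4 ramp image reduced to 2x2.
high_res = [[float(4 * y + x) for x in range(4)] for y in range(4)]
low_res = box_downsample(high_res, 2)
print(low_res)  # [[2.5, 4.5], [10.5, 12.5]]
```

Averaging softens edges, so for pixel art, where hard edges matter, a nearest-neighbour reduction may be closer to what the original question had in mind.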

Papers
arxiv:2509.10441

InfGen: A Resolution-Agnostic Paradigm for Scalable Image Synthesis

Published on Sep 12, 2025
· Submitted by taesiri on Sep 15, 2025
#1 Paper of the day
Authors: Tao Han, Wanghan Xu, Junchao Gong, Xiaoyu Yue, Song Guo, Luping Zhou, Lei Bai

Abstract

InfGen, a one-step generator replacing the VAE decoder, enables arbitrary high-resolution image generation from a fixed-size latent, significantly reducing computational complexity and generation time.

AI-generated summary

Arbitrary resolution image generation provides a consistent visual experience across devices, having extensive applications for producers and consumers. Current diffusion models increase computational demand quadratically with resolution, causing 4K image generation delays over 100 seconds. To solve this, we explore the second generation upon the latent diffusion models, where the fixed latent generated by diffusion models is regarded as the content representation and we propose to decode arbitrary resolution images with a compact generated latent using a one-step generator. Thus, we present the InfGen, replacing the VAE decoder with the new generator, for generating images at any resolution from a fixed-size latent without retraining the diffusion models, which simplifies the process, reducing computational complexity and can be applied to any model using the same latent space. Experiments show InfGen is capable of improving many models into the arbitrary high-resolution era while cutting 4K image generation time to under 10 seconds.
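The scaling claim in the abstract can be made concrete with some simple arithmetic. The sketch below counts latent positions for square images, reading "resolution" as side length. The VAE downsampling factor of 8, the 512-pixel baseline, and treating 4K as 4096 pixels per side are all assumptions for illustration, not values taken from the abstract. It shows only how the latent grows when a diffusion model generates directly at the target resolution; it is not a runtime measurement.

```python
def latent_positions(side_px, vae_factor=8):
    """Spatial positions in the latent of a side_px x side_px image.

    vae_factor=8 is an assumed downsampling factor, not a value from the paper.
    """
    if side_px % vae_factor:
        raise ValueError("side_px must be divisible by vae_factor")
    side = side_px // vae_factor
    return side * side


def relative_positions(side_px, base_px=512, vae_factor=8):
    """Latent size at side_px relative to an assumed base_px baseline."""
    return latent_positions(side_px, vae_factor) / latent_positions(base_px, vae_factor)


for side in (512, 1024, 2048, 4096):
    print(side, latent_positions(side), relative_positions(side))
# 512 4096 1.0
# 1024 16384 4.0
# 2048 65536 16.0
# 4096 262144 64.0
```

Under these assumptions, the latent for a 4096-pixel image has 64 times as many positions as the 512-pixel one. With a fixed-size latent, as the abstract describes for InfGen, the diffusion stage would stay at the baseline count whatever the output resolution.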

Community


This is an automated message from the Librarian Bot. I found the following papers similar to this paper.

The following papers were recommended by the Semantic Scholar API

Please give a thumbs up to this comment if you found it helpful!

If you want recommendations for any paper on Hugging Face, check out this Space

You can directly ask Librarian Bot for paper recommendations by tagging it in a comment: @librarian-bot recommend

Very interesting paper. I do wonder if this method can be used for native low-resolution image generation too, such as pixel art. The lower end of the 'reliable exploration' is 256, but I'm wondering if sub 256 was unexplored due to an assumption that low res images aren't desirable.
True arbitrary resolution should also generalize on the extreme low end, right?

deleted: This comment has been hidden (marked as Spam)


