[SD2.1] - Input shapes for unet model

#40
by lalith-mcw - opened
\"image.png\"

\n

Input nodes for SD2.1:
sample - [2, 4, 64, 64], timestep - [-1], and encoder_hidden_states - [2, 77, 768]

With these inputs the output was proper for SD1.4 models. I also tried using the DPMSolverMultistepScheduler for SD2.1, but the output is still the same.

I saw somewhere that the encoder_hidden_states blob shape was updated - what are the right dimensions to use?

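For reference, a minimal sketch of the shape difference between the two model families (these dimensions come from the published model configs, not from this thread): SD1.x uses CLIP ViT-L/14 with a 768-dim hidden state, while SD2.x uses OpenCLIP ViT-H with a 1024-dim hidden state, and the latent spatial size is the output resolution divided by the VAE scale factor of 8.

```python
# Sketch of expected UNet input shapes for SD1.x vs SD2.x.
# Batch of 2 = unconditional + conditional prompt embeddings
# (classifier-free guidance); 77 = CLIP token sequence length.
BATCH, LATENT_CH, TOKENS = 2, 4, 77

# encoder_hidden_states: last hidden dim follows the text encoder.
sd1_encoder_hidden_states = (BATCH, TOKENS, 768)   # SD1.4 / SD1.5 (CLIP ViT-L/14)
sd2_encoder_hidden_states = (BATCH, TOKENS, 1024)  # SD2.0 / SD2.1 (OpenCLIP ViT-H)

# sample: spatial dims = output resolution // 8 (VAE scale factor).
sample_for_512 = (BATCH, LATENT_CH, 64, 64)  # 512x512 output (SD1.x, SD2.1-base)
sample_for_768 = (BATCH, LATENT_CH, 96, 96)  # 768x768 output (SD2.1 native)

print(sd2_encoder_hidden_states, sample_for_768)
```

So for SD2.1 the encoder_hidden_states blob should be [2, 77, 1024] rather than the [2, 77, 768] that works for SD1.4.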

Trying to run this via OpenVINO IR - inference currently produces a pixelated image.

Input nodes for SD2.1:
sample - [2, 4, 64, 64], timestep - [-1], and encoder_hidden_states - [2, 77, 1024]

Still, I get the inferenced image as 512x512, since the vae_decoder takes a latents input sized for a 512x512 output, and the result is pixelated. What shapes should be used for the above three nodes for proper inference?
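One thing worth checking (a sketch, assuming the standard diffusers VAE with a scale factor of 8): the latent spatial dims determine the output resolution, so a [2, 4, 64, 64] sample can only ever decode to 512x512. For SD2.1's native 768x768 resolution the latents need to be 96x96.

```python
# VAE scale-factor sketch: the decoder upsamples latents 8x per side,
# so latent spatial dims must equal the target resolution // 8.
VAE_SCALE = 8

def latent_shape(batch: int, height: int, width: int) -> tuple:
    """Latent tensor shape for a given output resolution (4 latent channels)."""
    return (batch, 4, height // VAE_SCALE, width // VAE_SCALE)

print(latent_shape(2, 512, 512))  # (2, 4, 64, 64)
print(latent_shape(2, 768, 768))  # (2, 4, 96, 96)
```

If the exported IR was frozen with 64x64 sample dims, re-exporting with 96x96 (or dynamic spatial dims) would be needed for 768x768 output.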
