Unable to convert ONNX model to INT4/FP16
Hi community,
I tried exporting the Gemma-7b model to an ONNX file with fp16 precision using the following command:
optimum-cli export onnx --dtype fp16 --device xpu --model google/gemma-7b --framework pt --task text-generation-with-past ./gemma-7b
But it fails with an error.
Later, I tried the export without fp16 precision:
optimum-cli export onnx --model google/gemma-7b --framework pt --task text-generation-with-past ./gemma-7b
This succeeded and produced the ONNX file in INT64 precision.
Now, when I try to convert that file to FP16/INT4, it throws an error.
Any clues on what to do about this?
Hi community, is there any update on this?