Gemini Embedding: Generalizable Embeddings from Gemini
Published on Mar 10, 2025 · Submitted by AK on Mar 12, 2025
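Authors: Jinhyuk Lee, Feiyang Chen, Sahil Dua, Daniel Cer, Madhuri Shanbhogue, Iftekhar Naim, Gustavo Hernández Ábrego, Zhe Li, Kaifeng Chen, Henrique Schechter Vera, Xiaoqi Ren, Shanfeng Zhang, Daniel Salz, Michael Boratko, Jay Han, Blair Chen, Shuo Huang, Vikram Rao, Paul Suganthan, Feng Han, Andreas Doumanoglou, Nithi Gupta, Fedor Moiseev, Cathy Yip, Aashi Jain, Simon Baumgartner, Shahrokh Shahi, Frank Palma Gomez, Sandeep Mariserla, Min Choi, Parashar Shah, Sonam Goenka, Ke Chen, Ye Xia, Koert Chen, Sai Meher Karthik Duddu, Yichang Chen, Trevor Walker, Wenlei Zhou, Rakesh Ghiya, Zach Gleicher, Karan Gill, Zhe Dong, Mojtaba Seyedhosseini, Yunhsuan Sung, Raphael Hoffmann, Tom Duerig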
Abstract
Gemini Embedding, built on Google's Gemini large language model, generates high-quality multilingual and code embeddings that outperform prior models on benchmarks across a range of tasks.
In this report, we introduce Gemini Embedding, a state-of-the-art embedding
model leveraging the power of Gemini, Google's most capable large language
model. Capitalizing on Gemini's inherent multilingual and code understanding
capabilities, Gemini Embedding produces highly generalizable embeddings for
text spanning numerous languages and textual modalities. The representations
generated by Gemini Embedding can be precomputed and applied to a variety of
downstream tasks including classification, similarity, clustering, ranking, and
retrieval. Evaluated on the Massive Multilingual Text Embedding Benchmark
(MMTEB), which includes over one hundred tasks across 250+ languages, Gemini
Embedding substantially outperforms prior state-of-the-art models,
demonstrating considerable improvements in embedding quality. Achieving
state-of-the-art performance across MMTEB's multilingual, English, and code
benchmarks, our unified model demonstrates strong capabilities across a broad
selection of tasks and surpasses specialized domain-specific models.
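
The abstract notes that embeddings can be precomputed and then reused for tasks such as similarity, ranking, and retrieval. A minimal sketch of that pattern follows, assuming cosine similarity as the scoring function (the abstract does not specify one). The vectors are hypothetical 3-dimensional toy values, not output from Gemini Embedding, and the names `cosine_similarity`, `rank_documents`, and `DOC_EMBEDDINGS` are illustrative only.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_documents(query_vec, doc_vecs):
    """Return (doc_id, score) pairs sorted by descending similarity."""
    scored = [
        (doc_id, cosine_similarity(query_vec, vec))
        for doc_id, vec in doc_vecs.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical precomputed document embeddings (toy values, assumed for
# illustration). In practice these would be computed once by an embedding
# model and stored.
DOC_EMBEDDINGS = {
    "doc_python": [0.9, 0.1, 0.0],
    "doc_cooking": [0.0, 0.2, 0.9],
    "doc_rust": [0.7, 0.6, 0.1],
}

# Hypothetical embedding of a programming-related query.
query_embedding = [1.0, 0.2, 0.0]

ranking = rank_documents(query_embedding, DOC_EMBEDDINGS)
for doc_id, score in ranking:
    print(f"{doc_id}: {score:.3f}")
```

Because the document vectors are stored ahead of time, only the query needs to be embedded at request time; the same stored vectors could also feed a classifier or a clustering step.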