
Smart. When they do come, will the embedding vectors be OpenAI compatible? I assume this is quite hard to do.


Embeddings as an I/O schema are just text-in, a list of numbers out. There are very few embedding models which require enough preprocessing to warrant an abstraction. (A soft example is the new nomic-embed-text-v1, which requires adding prefix annotations: https://huggingface.co/nomic-ai/nomic-embed-text-v1 )
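To make the "text in, list of numbers out" point concrete, here is a minimal sketch of what the prefix requirement amounts to. `toy_embedder` and `with_prefix` are hypothetical names invented for illustration, and the toy function is only a stand-in for a real model. The `search_document: ` and `search_query: ` strings are the task prefixes described on the nomic-embed-text-v1 model card.

```python
from typing import Callable, List

# The whole I/O schema: a string goes in, a list of floats comes out.
Embedder = Callable[[str], List[float]]


def toy_embedder(text: str) -> List[float]:
    # Hypothetical stand-in for a real model; it carries no semantic meaning.
    return [float(len(text)), float(sum(map(ord, text)) % 97)]


def with_prefix(embed: Embedder, prefix: str) -> Embedder:
    # Wrap an embedder so every input gets the task prefix the model expects.
    def wrapped(text: str) -> List[float]:
        return embed(prefix + text)

    return wrapped


# One wrapper per task; the prefixes follow the nomic-embed-text-v1 model card.
embed_document = with_prefix(toy_embedder, "search_document: ")
embed_query = with_prefix(toy_embedder, "search_query: ")

doc_vector = embed_document("Embeddings are lists of numbers.")
query_vector = embed_query("What is an embedding?")
```

The preprocessing is a one-line string concatenation, which is why it hardly seems to need an abstraction of its own.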


Yes, of course (syntactically it is just float[] getEmbeddings(text)), but are the numbers close to what OpenAI would produce? I assume not.


This submission is only about the I/O schema: the embeddings themselves depend on the model, and since OpenAI's models are closed source, no one can reproduce them.

No direct embedding model can be cross-compatible. (Exception: contrastive learning models like CLIP.)


Probably not. Embedding vectors aren't compatible across different embedding models, and other tools presenting OAI-compatible APIs don't use OAI-compatible embedding models (e.g., oobabooga lets you configure different embedding models, but none of them produce vectors compatible with the OAI ones).
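A toy sketch of why an identical API shape does not make the vectors interchangeable. Everything here is hypothetical: `toy_model` just derives pseudo-random numbers from the model name and the text, so it captures no semantics at all. It only illustrates that two models place the same text at unrelated coordinates, and often in spaces of different dimension.

```python
import hashlib
import math
import random
from typing import Callable, List


def toy_model(name: str, dim: int) -> Callable[[str], List[float]]:
    # Hypothetical "model": a deterministic pseudo-random vector per (model, text).
    def embed(text: str) -> List[float]:
        digest = hashlib.sha256((name + "\0" + text).encode("utf-8")).digest()
        rng = random.Random(int.from_bytes(digest[:8], "big"))
        return [rng.gauss(0.0, 1.0) for _ in range(dim)]

    return embed


def cosine(a: List[float], b: List[float]) -> float:
    # Cosine similarity is only defined for vectors of the same dimension.
    if len(a) != len(b):
        raise ValueError("vectors come from spaces of different dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


model_a = toy_model("model-a", 8)
model_b = toy_model("model-b", 8)   # same shape, different coordinates
model_c = toy_model("model-c", 12)  # different dimension altogether

text = "the same input text"
same_model = cosine(model_a(text), model_a(text))   # comparable: same space
cross_model = cosine(model_a(text), model_b(text))  # a number, but meaningless
```

Comparing `model_a` output with `model_c` output raises `ValueError`, and the `model_a` versus `model_b` score computes fine but says nothing about the text. In practice this means an index built with one model has to be re-embedded before switching to another.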



