
What I want to build is a CLI program that outputs embeddings for arbitrary input. To do that, I want to run inference with an embeddings model, and I chose NV-Embed-v2. My framework of choice is Candle, but I also looked at Mistral-RS.

Basically, what I'm trying to reproduce is this code fragment: https://huggingface.co/nvidia/NV-Embed-v2 but in Rust with Candle.

I started from Candle's Mistral example, because NV-Embed's HF page says: Model Details / Base Decoder-only LLM: Mistral-7B-v0.1.

I replaced the model id in the original code with nvidia/NV-Embed-v2, and was able to download the weights from Hugging Face, but upon loading the config, I got this:

Error: missing field `vocab_size` at line 101 column 1
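
For reference, my loading code is roughly this (simplified from Candle's Mistral example, with the model id swapped in; the error above comes from deserializing config.json into candle_transformers' mistral::Config):

use anyhow::Result;
use candle_transformers::models::mistral::Config;
use hf_hub::api::sync::Api;

fn load_config() -> Result<Config> {
    // Download config.json for the model from the Hugging Face Hub.
    let api = Api::new()?;
    let repo = api.model("nvidia/NV-Embed-v2".to_string());
    let config_file = repo.get("config.json")?;

    // NV-Embed-v2's config.json apparently isn't a plain Mistral config,
    // so this deserialization fails with `missing field vocab_size`.
    let config: Config = serde_json::from_slice(&std::fs::read(config_file)?)?;
    Ok(config)
}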

Then I hardcoded the values from the JSON config downloaded from HF into a newly created candle_transformers::models::mistral::Config instance. After that, Mistral::new(&config, vb) fails with:

Error: cannot find tensor model.embed_tokens.weight
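
The model construction itself looks roughly like this (a sketch; I'm assuming the hardcoded values match Mistral-7B-v0.1, which is what Candle's Config::config_7b_v0_1 helper fills in, and that the safetensors shards are already downloaded):

use anyhow::Result;
use candle_core::{DType, Device};
use candle_nn::VarBuilder;
use candle_transformers::models::mistral::{Config, Model as Mistral};

fn load_model(weight_files: &[std::path::PathBuf]) -> Result<Mistral> {
    let device = Device::Cpu;

    // Hardcoded Mistral-7B-v0.1 values (NV-Embed-v2's base decoder).
    let config = Config::config_7b_v0_1(false);

    // Memory-map the downloaded safetensors shards.
    let vb = unsafe { VarBuilder::from_mmaped_safetensors(weight_files, DType::F32, &device)? };

    // This is the call that fails with `cannot find tensor model.embed_tokens.weight`.
    let model = Mistral::new(&config, vb)?;
    Ok(model)
}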

Is there a way around that? Maybe there are other Candle-based open-source projects that I could use as inspiration, or maybe this is a common mistake that can easily be diagnosed?

1 Answer


Candle is looking for model.embed_tokens.weight, whereas in the NV-Embed-v2 checkpoint the tensor is named embedding_model.embed_tokens.weight. You just have to change this line of mistral.rs in candle_transformers:

// from
let vb_m = vb.pp("model");
// to
let vb_m = vb.pp("embedding_model");
