Microsoft's Harrier-OSS-v1: Decoder Magic Reshapes Multilingual Embeddings
Picture embedding a 30-page PDF in one pass, across languages, without losing the plot. Microsoft's new Harrier models just made that real, replacing traditional BERT-style encoders with LLM-style decoders.
⚡ Key Takeaways
- A decoder-only shift away from BERT-style encoders, with 32k-token context windows that eliminate most chunking headaches.
- Instruction-tuned, so one model produces task-specific vectors; state-of-the-art on the Multilingual MTEB v2 benchmark.
- A scalable family (270M to 27B parameters) built via distillation, offering efficient options at every size.
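To make the decoder-only idea concrete: unlike BERT-style encoders, which typically mean-pool token states, decoder-based embedders commonly take the hidden state of the last non-padding token as the sequence embedding, and instruction-tuned variants prepend a task description to the query text. Here is a minimal sketch of that pooling step with toy arrays; the shapes and function names are illustrative assumptions, not Harrier's actual API.

```python
import numpy as np

def last_token_pool(hidden_states, attention_mask):
    # Decoder-only embedders commonly use the final non-padding
    # token's hidden state as the whole-sequence embedding.
    last_idx = attention_mask.sum(axis=1) - 1  # index of last real token
    return hidden_states[np.arange(hidden_states.shape[0]), last_idx]

def embed(hidden_states, attention_mask):
    # L2-normalize so cosine similarity reduces to a dot product.
    vecs = last_token_pool(hidden_states, attention_mask)
    return vecs / np.linalg.norm(vecs, axis=1, keepdims=True)

# Toy batch standing in for model output: 2 sequences, 4 tokens, 3-dim states.
# In practice these would come from the decoder's final layer; an
# instruction-tuned model would see input like
# "Instruct: retrieve relevant passages\nQuery: ..." (hypothetical prompt).
rng = np.random.default_rng(0)
h = rng.normal(size=(2, 4, 3))
mask = np.array([[1, 1, 1, 0],   # sequence 1 ends with a pad token
                 [1, 1, 1, 1]])
e = embed(h, mask)
print(e.shape)  # (2, 3)
```

The padding-aware index matters: with right-padded batches, naively taking the last position would embed a pad token for shorter sequences.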
Originally reported by MarkTechPost