Finetuning Multimodal Embeddings with Sentence Transformers: Real Gains or Just Another Benchmark Win?
I've seen a thousand 'breakthrough' model tweaks in 20 years, but this finetune of Qwen's multimodal embedder actually delivers: 0.947 NDCG on Visual Document Retrieval (VDR), smoking rivals four times its size. Still, who's cashing in?