Word2Vec Cracked: It Learns PCA on a Clever Co-Occurrence Matrix
Word2Vec doesn't conjure magic vectors. A new theory shows its training dynamics reduce to PCA on a co-occurrence matrix, learned one eigenvector at a time, finally opening up the black box.
⚡ Key Takeaways
- Word2Vec training reduces to online PCA on a co-occurrence matrix M*.
- Starting from a small initialization, training proceeds in discrete steps, each of which increments the rank of the learned representation.
- The learned features are the top eigenvectors of M*, which encode interpretable concepts such as celebrities or geography.
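The core claim can be illustrated in a few lines. The sketch below builds a toy symmetric co-occurrence matrix from a tiny corpus and takes its top eigenvectors as word features; this is an assumption-laden simplification (the theory's actual matrix M* and the online, rank-incrementing dynamics are more involved), but it shows the "embeddings as eigenvectors of co-occurrences" idea.

```python
# Toy sketch: word features from top eigenvectors of a co-occurrence matrix.
# NOTE: the corpus, windowing, and matrix here are illustrative assumptions,
# not the paper's exact construction of M*.
import numpy as np
from collections import Counter
from itertools import combinations

corpus = [
    "paris france city".split(),
    "london england city".split(),
    "actor film celebrity".split(),
    "singer song celebrity".split(),
]

vocab = sorted({w for sent in corpus for w in sent})
idx = {w: i for i, w in enumerate(vocab)}

# Symmetric within-sentence co-occurrence counts.
counts = Counter()
for sent in corpus:
    for a, b in combinations(sent, 2):
        counts[(idx[a], idx[b])] += 1
        counts[(idx[b], idx[a])] += 1

M = np.zeros((len(vocab), len(vocab)))
for (i, j), c in counts.items():
    M[i, j] = c

# M is symmetric, so its eigenvectors are the PCA directions (up to centering).
# Take the top-k eigenvectors as k-dimensional word features.
eigvals, eigvecs = np.linalg.eigh(M)   # eigenvalues in ascending order
k = 2
embeddings = eigvecs[:, -k:]           # row i = feature vector for word i

for w in ["paris", "london", "actor", "singer"]:
    print(w, np.round(embeddings[idx[w]], 3))
```

Each eigenvector here plays the role of one learned "feature" direction; in the theory, a full-scale model picks these up one at a time as training proceeds.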
Originally reported by Berkeley AI Research