BERT's Bidirectionality: Transformer Hype or Training Trick?
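BERT burst onto the NLP scene in 2018, pushing the GLUE benchmark score to 80.5%, a 7.7-point absolute improvement over the previous best. But its 'bidirectional' brag? Mostly a clever training hack, the masked language modeling objective, bolted onto old Transformer encoder bones.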