Transformers at Training vs Inference
