Update README.md
README.md CHANGED

@@ -113,10 +113,9 @@ This model is part of a collection of LayerNorm-free models. The table below pro
 
 ## Citation
 
-
-
-**BibTeX:**
+If you have found our work useful please cite as:
 
+```
 @misc{gpt2layernorm2025,
       author = {Baroni, Luca and Khara, Galvin and Schaeffer, Joachim and Subkhankulov, Marat and Heimersheim, Stefan},
       title = {Transformers Don't Need LayerNorm at Inference Time: Scaling LayerNorm Removal to GPT-2 XL and the Implications for Mechanistic Interpretability},
@@ -126,3 +125,4 @@ Title: *Transformers Don’t Need LayerNorm at Inference Time: Scaling LayerNorm
       primaryClass = {cs.LG},
       url = {https://arxiv.org/abs/2507.02559v1}
 }
+```