BEE-spoke-data/smol_llama-101M-GQA
Text Generation • Updated • 2.2k • 26
Small-scale pretraining experiments of mine.
Note: this is a mid-training checkpoint of what is now smol_llama-220M.