Yifei Li committed on
Commit
9950107
1 Parent(s): 2c1feb3

Create README.md

Files changed (1)
  1. README.md +4 -0
README.md ADDED
@@ -0,0 +1,4 @@
+ ### Model Description
+ GPT-J 6B is a transformer model based on EleutherAI's replication of the GPT-3 architecture. "GPT-J" refers to the class of models, while "6B" refers to the approximate number of parameters (6 billion) in this particular pre-trained model.
+
+ The original GPT-J-6B model was trained on TPUs, which are not readily available to most users. We therefore used a conversion script to convert the TPU version of GPT-J-6B into a GPU version that can be loaded and fine-tuned on GPUs.
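
As a rough sizing aid for loading the converted model on a GPU, the sketch below estimates the memory needed just to hold the weights. It assumes a round 6 billion parameters (read off the "6B" in the name; the exact count is not stated here) and standard data type widths. The names `BYTES_PER_PARAM`, `ASSUMED_PARAM_COUNT`, and `weight_memory_gib` are illustrative, not part of any library.

```python
# Bytes per parameter for common dtypes (standard widths).
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "bfloat16": 2, "int8": 1}

# Assumption: a round 6e9 parameters, taken from the "6B" in the model name.
ASSUMED_PARAM_COUNT = 6_000_000_000


def weight_memory_gib(dtype: str, n_params: int = ASSUMED_PARAM_COUNT) -> float:
    """Return the memory in GiB needed to store the weights alone."""
    return n_params * BYTES_PER_PARAM[dtype] / 2**30


if __name__ == "__main__":
    # Print one estimate per dtype, weights only.
    for dtype in BYTES_PER_PARAM:
        print(f"{dtype:>8}: {weight_memory_gib(dtype):.1f} GiB")
```

These figures cover the weights only. Activations, gradients, and optimizer state add to the total, so fine-tuning typically needs considerably more memory than loading for inference.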