Tags: NeMo · PyTorch · 53 languages · text generation · causal-lm
Commit a5f6bad (parent: e428a5b), committed by MaximumEntropy

Update README.md

Files changed (1): README.md (+1, -1)
README.md CHANGED

```diff
@@ -85,7 +85,7 @@ This model was trained on 1.1T tokens with [NeMo](https://docs.nvidia.com/deeple
 - Maximum sequence length of 4,096 compared to 2,048 in https://huggingface.co/nvidia/nemo-megatron-gpt-20B.
 - No dropout.
 - No bias terms in all linear layers.
-- United embedding and output layers.
+- Untied embedding and output layers.
 
 ## Getting started
```
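The fix corrects "United" to "Untied": the model's input embedding matrix and output (LM head) projection are separate parameters rather than shared. A minimal PyTorch sketch of the distinction, using a hypothetical `TinyLM` class (not the NeMo implementation):

```python
import torch
import torch.nn as nn

class TinyLM(nn.Module):
    # Hypothetical toy model illustrating tied vs. untied embedding
    # and output layers; not the actual NeMo Megatron architecture.
    def __init__(self, vocab_size=100, d_model=16, tie_weights=False):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        self.lm_head = nn.Linear(d_model, vocab_size, bias=False)
        if tie_weights:
            # Tied: the output projection reuses the embedding matrix
            # (both have shape [vocab_size, d_model]).
            self.lm_head.weight = self.embed.weight

    def forward(self, token_ids):
        # [batch, seq] -> [batch, seq, vocab_size] logits
        return self.lm_head(self.embed(token_ids))

tied = TinyLM(tie_weights=True)
untied = TinyLM(tie_weights=False)

# Tied model shares one parameter tensor; untied keeps two independent ones.
assert tied.lm_head.weight is tied.embed.weight
assert untied.lm_head.weight is not untied.embed.weight
```

Untying adds `vocab_size × d_model` parameters but lets the input and output representations be learned independently.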