---
license: apache-2.0
datasets:
- vietgpt/wikipedia_vi
language:
- vi
- en
pipeline_tag: text-generation
---
# open-llama-7b-vi

This is an OpenLLaMA model finetuned on Vietnamese-language text.
## Model architecture

The model architecture is the same as the original OpenLLaMA-7B model: 32 layers, 4,096-dimensional hidden states, and 32 attention heads.
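As a sanity check on the 7B label, the hyperparameters above can be turned into a back-of-the-envelope parameter count. The vocabulary size (32,000), SwiGLU FFN inner dimension (11,008), and untied embedding/output head are standard for LLaMA/OpenLLaMA 7B but are not stated in this card, so they are assumptions here:

```python
# Rough parameter count for a LLaMA-7B-style transformer.
# Assumed (not from this card): vocab 32,000; SwiGLU FFN inner dim 11,008;
# untied input embedding and LM head. Biases and norm weights are
# negligible at this scale and omitted.

vocab, d_model, n_layers, d_ffn = 32_000, 4_096, 32, 11_008

embed = vocab * d_model * 2             # input embedding + LM head
attn_per_layer = 4 * d_model * d_model  # Q, K, V, and output projections
ffn_per_layer = 3 * d_model * d_ffn     # gate, up, and down projections (SwiGLU)

total = embed + n_layers * (attn_per_layer + ffn_per_layer)
print(f"{total / 1e9:.2f}B parameters")  # prints "6.74B parameters"
```

The result (~6.7B) matches the commonly quoted size of LLaMA-7B, which is why the 12-layer/768-dimension figures sometimes pasted into cards like this one cannot describe a 7B model.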
## Training data

The model is trained on the Vietnamese edition of Wikipedia. The generated corpus files total 1.5 GB and contain approximately 1M sentences.
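For readers who want to reproduce such corpus statistics on their own text dumps, a minimal sketch follows. The sentence split on terminal punctuation is a naive heuristic of my own; the card's ~1M figure was presumably produced by the authors' actual preprocessing tooling, which is not described here:

```python
import re

def corpus_stats(text: str) -> tuple[int, int]:
    """Return (UTF-8 byte size, approximate sentence count) for a corpus string.

    Illustrative helper only: sentences are approximated by splitting on
    runs of terminal punctuation followed by whitespace.
    """
    n_bytes = len(text.encode("utf-8"))
    sentences = [s for s in re.split(r"[.!?]+\s+", text) if s.strip()]
    return n_bytes, len(sentences)

sample = "Hà Nội là thủ đô của Việt Nam. Thành phố có lịch sử hơn 1000 năm."
print(corpus_stats(sample))  # byte size is larger than len(sample) due to diacritics
```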