metadata
license: apache-2.0
datasets:
  - vietgpt/wikipedia_vi
language:
  - vi
  - en
pipeline_tag: text-generation

Concept of open-llama-7b-vi

This is an OpenLLama model fine-tuned on Vietnamese-language text.

Model architecture

The model architecture is the same as the original OpenLLama model: 12 layers, a hidden-state dimension of 768, and 12 attention heads.
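As a rough sanity check on architecture figures like these, the parameter count of a LLaMA-style decoder can be estimated from the layer count and hidden size. The sketch below is illustrative only: the function name `estimate_llama_params` is hypothetical, and the MLP intermediate sizes and the 32,000-token vocabulary are assumptions not stated in this card. It also assumes untied input and output embeddings, no bias terms, and standard multi-head attention.

```python
def estimate_llama_params(num_layers, hidden_size, intermediate_size, vocab_size):
    """Estimate the parameter count of a LLaMA-style decoder-only model.

    Assumes (not taken from this model card): untied input/output
    embeddings, no bias terms, RMSNorm with one weight vector per norm,
    and a gated MLP with three projection matrices.
    """
    attention = 4 * hidden_size * hidden_size      # q, k, v, o projections
    mlp = 3 * hidden_size * intermediate_size      # gate, up, down projections
    norms = 2 * hidden_size                        # two RMSNorm weights per layer
    per_layer = attention + mlp + norms

    embeddings = 2 * vocab_size * hidden_size      # token embeddings + LM head
    final_norm = hidden_size

    return num_layers * per_layer + embeddings + final_norm


if __name__ == "__main__":
    # Figures quoted above; intermediate size 3072 and vocab 32000 are assumed.
    quoted = estimate_llama_params(12, 768, 3072, 32000)
    # Typical 7B LLaMA-style shape (32 layers, hidden 4096, intermediate 11008).
    typical_7b = estimate_llama_params(32, 4096, 11008, 32000)
    print(f"quoted figures:    {quoted / 1e9:.2f}B parameters")
    print(f"typical 7B layout: {typical_7b / 1e9:.2f}B parameters")
```

Under these assumptions, the two estimates differ by more than an order of magnitude, so the `config.json` published with the checkpoint is the more reliable source for the actual layer count, hidden size, and number of attention heads.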

Training Data

The model was trained on the Vietnamese edition of Wikipedia (the vietgpt/wikipedia_vi dataset). The generated corpus files total 1.5 GB and contain approximately 1M sentences.