---
license: apache-2.0
datasets:
- vietgpt/wikipedia_vi
language:
- vi
- en
pipeline_tag: text-generation
---
# open-llama-7b-vi

This is an OpenLLaMA model finetuned on Vietnamese-language text.
## Model architecture

The model architecture is the same as the original OpenLLaMA-7B model: 32 layers, 4,096-dimensional hidden states, and 32 attention heads.
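As a sanity check on the 7B label, the hyperparameters above can be turned into a back-of-the-envelope parameter count. The vocabulary size (32,000), SwiGLU FFN inner dimension (11,008), and untied embedding/output head are standard for LLaMA/OpenLLaMA 7B but are not stated in this card, so they are assumptions here:

```python
# Rough parameter count for a LLaMA-7B-style transformer.
# Assumed (not from this card): vocab 32,000; SwiGLU FFN inner dim 11,008;
# untied input embedding and LM head. Biases and norm weights are
# negligible at this scale and omitted.

vocab, d_model, n_layers, d_ffn = 32_000, 4_096, 32, 11_008

embed = vocab * d_model * 2             # input embedding + LM head
attn_per_layer = 4 * d_model * d_model  # Q, K, V, and output projections
ffn_per_layer = 3 * d_model * d_ffn     # gate, up, and down projections (SwiGLU)

total = embed + n_layers * (attn_per_layer + ffn_per_layer)
print(f"{total / 1e9:.2f}B parameters")  # prints "6.74B parameters"
```

The result (~6.7B) matches the commonly quoted size of LLaMA-7B, which is why the 12-layer/768-dimension figures sometimes pasted into cards like this one cannot describe a 7B model.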
## Training data

The model is trained on the Vietnamese edition of Wikipedia. The generated corpus files total 1.5 GB and contain approximately 1M sentences.
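For readers who want to reproduce such corpus statistics on their own text dumps, a minimal sketch follows. The sentence split on terminal punctuation is a naive heuristic of my own; the card's ~1M figure was presumably produced by the authors' actual preprocessing tooling, which is not described here:

```python
import re

def corpus_stats(text: str) -> tuple[int, int]:
    """Return (UTF-8 byte size, approximate sentence count) for a corpus string.

    Illustrative helper only: sentences are approximated by splitting on
    runs of terminal punctuation followed by whitespace.
    """
    n_bytes = len(text.encode("utf-8"))
    sentences = [s for s in re.split(r"[.!?]+\s+", text) if s.strip()]
    return n_bytes, len(sentences)

sample = "Hà Nội là thủ đô của Việt Nam. Thành phố có lịch sử hơn 1000 năm."
print(corpus_stats(sample))  # byte size is larger than len(sample) due to diacritics
```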