# TAIDE-7B-Thai-Pretrain-LoRA

## Model Description
TAIDE-7B-Thai-Pretrain-LoRA is a machine translation model designed to translate between Traditional Chinese and Thai. The model has 7 billion parameters and is built upon the Trustworthy AI Dialogue Engine by Taiwan (TAIDE). It employs a two-stage fine-tuning process to enhance its proficiency in Thai and align its translation capabilities with the nuances of both Traditional Chinese and Thai languages.
## Training Methodology
Training follows the Advanced Language Model-based Translator (ALMA) strategy developed by Xu et al. (2024):
- Initial Pre-Training Stage:
  - Objective: Build a robust foundational understanding of the Thai language.
  - Method: Continued pre-training on a comprehensive dataset of one million Thai instances.
- Fine-Tuning Stage:
  - Objective: Align the model's translation capabilities with the specific nuances of both languages.
  - Method: Fine-tuning on a smaller set of high-quality Traditional Chinese-Thai parallel data.
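As the model name indicates, fine-tuning uses LoRA (Low-Rank Adaptation): the pretrained weight matrix W is frozen, and only a low-rank update ΔW = (α/r)·BA is trained. A minimal NumPy sketch of the idea (the dimensions, rank, and scaling factor below are illustrative, not the values used for this model):

```python
import numpy as np

# Illustrative dimensions; the real model's weight matrices are far larger.
d_out, d_in, r = 8, 8, 2   # r is the LoRA rank, with r << d_out, d_in
alpha = 4                  # LoRA scaling factor

rng = np.random.default_rng(0)
W = rng.normal(size=(d_out, d_in))      # frozen pretrained weight
A = rng.normal(size=(r, d_in)) * 0.01   # trainable down-projection
B = np.zeros((d_out, r))                # trainable up-projection, zero-initialized

# Effective weight after applying the LoRA update:
W_eff = W + (alpha / r) * B @ A

# Because B starts at zero, the adapted model is initially identical to the
# base model; training perturbs W only through the small A and B matrices.
assert np.allclose(W_eff, W)
```

Training 2·r·d parameters per layer instead of d² is what makes adapting a 7B-parameter base model tractable.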
## Quick Start

A quick example of how to use our model:
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("Wilailack/TAIDE-7B-Thai-Pretrain-LoRA", use_fast=True)
model = AutoModelForCausalLM.from_pretrained("Wilailack/TAIDE-7B-Thai-Pretrain-LoRA", torch_dtype=torch.bfloat16, device_map="auto")

# Some checkpoints do not define a pad token; fall back to EOS so padding works
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

# Add the source sentence into the prompt template
prompt = "Translate this from Chinese to Thai:\nChinese: 我最愛的就是你!\nThai:"
input_ids = tokenizer(prompt, return_tensors="pt", padding=True, max_length=200, truncation=True).input_ids.to(model.device)

# Translation
with torch.no_grad():
    generated_ids = model.generate(input_ids=input_ids, num_beams=5, max_new_tokens=200, do_sample=True, temperature=0.6, top_p=0.9)
outputs = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)
print(outputs)
```
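When translating many sentences, the prompt construction above can be factored into a small helper. The function below is a hypothetical convenience wrapper (not part of the released model) that reproduces the same template:

```python
def build_prompt(source_text: str, src_lang: str = "Chinese", tgt_lang: str = "Thai") -> str:
    """Build the translation prompt used in the quick-start example."""
    return f"Translate this from {src_lang} to {tgt_lang}:\n{src_lang}: {source_text}\n{tgt_lang}:"

# Matches the prompt string from the quick start above:
prompt = build_prompt("我最愛的就是你!")
print(prompt)
```

Keeping the template in one place ensures every input matches the format the model saw during fine-tuning, which matters for translation quality.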