metadata
library_name: transformers
tags:
- roman eng2nep
- translation
- transliteration
license: mit
datasets:
- syubraj/roman2nepali-transliteration
language:
- en
- ne
base_model:
- google-t5/t5-base
pipeline_tag: translation
new_version: syubraj/RomanEng2Nep-v2
This is the model card for a 🤗 Transformers model that has been pushed to the Hub. The card was generated automatically.
- Model type: T5-based sequence-to-sequence (encoder-decoder) model for transliteration
- Language(s) (NLP): Roman English (Romanized Nepali), Nepali
- License: MIT
- Finetuned from model: google-t5/t5-small
How to Get Started with the Model
Use the code below to get started with the model.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
# Load your fine-tuned model and tokenizer
model_name = 'syubraj/romaneng2nep'
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
# Set max sequence length
max_seq_len = 30
def translate(text):
    # Tokenize the input text, truncating to at most max_seq_len tokens
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=max_seq_len)
    # Generate translation
    translated = model.generate(**inputs)
    # Decode the translated tokens back to text
    translated_text = tokenizer.decode(translated[0], skip_special_tokens=True)
    return translated_text
# Example usage
source_text = "timilai kasto cha?" # Example Romanized Nepali text
translated_text = translate(source_text)
print(f"Translated Text: {translated_text}")
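Because the tokenizer call above truncates input to 30 tokens, longer text would be cut off silently. The following is a minimal sketch of one way to work around that, assuming that splitting on whitespace and transliterating small groups of words independently is acceptable for your use case. The helper names `split_into_chunks` and `transliterate_long`, and the default of 10 words per chunk, are hypothetical and not part of this model or the transformers library.

```python
from typing import Callable, List


def split_into_chunks(text: str, max_words: int = 10) -> List[str]:
    """Split text on whitespace into chunks of at most max_words words."""
    if max_words < 1:
        raise ValueError("max_words must be at least 1")
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def transliterate_long(
    text: str,
    translate_fn: Callable[[str], str],
    max_words: int = 10,
) -> str:
    """Apply translate_fn to each chunk and join the results with spaces.

    translate_fn is any function mapping a short Romanized string to its
    output string, for example the translate() function defined above.
    """
    chunks = split_into_chunks(text, max_words)
    return " ".join(translate_fn(chunk) for chunk in chunks)
```

With the model loaded as shown earlier, a call such as `transliterate_long(long_text, translate)` would send each chunk through `translate` separately. A word count is only a rough proxy for the token count, so a smaller `max_words` may be needed if chunks still exceed the 30-token limit.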
Training Details
from transformers import Seq2SeqTrainingArguments

# Hyperparameters used for fine-tuning
training_args = Seq2SeqTrainingArguments(
    output_dir="/kaggle/working/romaneng2nep/",
    eval_strategy="epoch",
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    weight_decay=0.01,
    save_total_limit=3,
    num_train_epochs=3,
    predict_with_generate=True,
    fp16=True,
)
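To get a feel for what these settings imply, the sketch below estimates the number of optimizer steps for a run with a per-device batch size of 16 and 3 epochs. It assumes a single device, no gradient accumulation, and that the last partial batch of each epoch is kept. The function name `optimizer_steps` and the dataset size used in the comment are hypothetical; the actual size of the training split is not stated in this card.

```python
import math


def optimizer_steps(
    num_examples: int,
    per_device_batch_size: int = 16,
    num_epochs: int = 3,
    num_devices: int = 1,
) -> int:
    """Estimate total optimizer steps, assuming no gradient accumulation."""
    effective_batch = per_device_batch_size * num_devices
    # The last, smaller batch of an epoch still counts as one step.
    steps_per_epoch = math.ceil(num_examples / effective_batch)
    return steps_per_epoch * num_epochs


# Hypothetical dataset of 1,000 training pairs (illustrative only)
total_steps = optimizer_steps(1000)
```

The same function can be called with the real number of training examples to estimate run length or to choose logging and checkpoint intervals.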