---
library_name: transformers
tags:
- roman eng2nep
- translation
- transliteration
license: mit
datasets:
- syubraj/roman2nepali-transliteration
language:
- en
- ne
base_model:
- google-t5/t5-base
pipeline_tag: translation
new_version: syubraj/RomanEng2Nep-v2
---
This is the model card for `syubraj/romaneng2nep`, a 🤗 Transformers model that has been pushed to the Hub. The card was generated automatically.
- **Model type:** [More Information Needed]
- **Language(s) (NLP):** Romanized Nepali (Latin script), Nepali
- **License:** MIT
- **Finetuned from model:** google-t5/t5-small
## How to Get Started with the Model
Use the code below to get started with the model.
```Python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Load the fine-tuned model and tokenizer
model_name = 'syubraj/romaneng2nep'
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

# Maximum input length in tokens; longer inputs are truncated
max_seq_len = 30

def translate(text):
    # Tokenize the input text, truncating it to max_seq_len tokens
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=max_seq_len)
    # Generate the output token IDs
    translated = model.generate(**inputs)
    # Decode the generated tokens back to text
    translated_text = tokenizer.decode(translated[0], skip_special_tokens=True)
    return translated_text

# Example usage
source_text = "timilai kasto cha?"  # Example Romanized Nepali text
translated_text = translate(source_text)
print(f"Translated Text: {translated_text}")
```
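Because the snippet above truncates input at `max_seq_len = 30` tokens, longer text may be cut off silently. One possible workaround, shown here only as a sketch, is to split the text into short word chunks and pass each chunk to `translate` separately. The helper names `split_into_chunks` and `translate_long` and the default of 8 words per chunk are assumptions made for illustration; word count is only a rough proxy for token count, and chunking may affect output quality.

```python
from typing import Callable, List


def split_into_chunks(text: str, max_words: int = 8) -> List[str]:
    """Split text on whitespace into chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def translate_long(text: str, translate_fn: Callable[[str], str], max_words: int = 8) -> str:
    """Run translate_fn on each chunk and join the results with spaces.

    translate_fn is any function mapping a string to a string,
    for example the translate() function defined above.
    """
    chunks = split_into_chunks(text, max_words)
    return " ".join(translate_fn(chunk) for chunk in chunks)
```

With the model loaded as above, this could be called as `translate_long(long_text, translate)`.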
## Training Details
```Python
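from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="/kaggle/working/romaneng2nep/",
    eval_strategy="epoch",           # evaluate at the end of every epoch
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    weight_decay=0.01,
    save_total_limit=3,              # keep at most 3 checkpoints on disk
    num_train_epochs=3,
    predict_with_generate=True,      # use generate() during evaluation
    fp16=True,                       # mixed-precision training
)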
```
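To get a feel for what these settings imply, the number of optimizer steps can be estimated from the batch size and epoch count. The sketch below assumes a single device, no gradient accumulation, and that the last partial batch is kept; the function name `total_optimizer_steps` and the dataset size of 100,000 examples are hypothetical, since the card does not state the actual training set size.

```python
import math


def total_optimizer_steps(num_examples: int,
                          per_device_batch_size: int = 16,
                          num_epochs: int = 3,
                          num_devices: int = 1) -> int:
    """Estimate total optimizer steps, assuming no gradient accumulation."""
    effective_batch = per_device_batch_size * num_devices
    # The last partial batch still counts as one step
    steps_per_epoch = math.ceil(num_examples / effective_batch)
    return steps_per_epoch * num_epochs


# Hypothetical dataset size, for illustration only
print(total_optimizer_steps(100_000))  # 6250 steps per epoch * 3 epochs = 18750
```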
### Training Data
[syubraj/roman2nepali-transliteration](https://huggingface.co/datasets/syubraj/roman2nepali-transliteration)