Model Description

This model, named traintogpb-ke-t5-base-aihub-koen-translation-integrated-10m-en-to-ko, is a machine translation model that translates English to Korean. It is fine-tuned from the KETI-AIR/ke-t5-base model using the aihub-koen-translation-integrated-base-10m dataset.

Model Architecture

The model uses the ke-t5-base architecture, which is based on the T5 (Text-to-Text Transfer Transformer) model.
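
As a quick check of the underlying architecture, the base checkpoint's configuration can be inspected. This is a minimal sketch against the public KETI-AIR/ke-t5-base repository; the printed fields are standard T5 configuration attributes.

from transformers import AutoConfig

# Inspect the T5 configuration of the base checkpoint
config = AutoConfig.from_pretrained("KETI-AIR/ke-t5-base")
print(config.model_type, config.d_model, config.num_layers)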

Training Data

The model was trained on the aihub-koen-translation-integrated-base-10m dataset, which is designed for English-to-Korean translation tasks.
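
The corpus can be loaded with the datasets library. The repository id below (under the traintogpb namespace) is an assumption for illustration; the actual Hub id, splits, and column names may differ.

from datasets import load_dataset

# Load the AIHub KO-EN integrated 10M-pair translation corpus (assumed Hub id)
dataset = load_dataset("traintogpb/aihub-koen-translation-integrated-base-10m")
print(dataset)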

Training Procedure

Training Parameters

The model was trained with the following parameters (a configuration sketch follows the list):

  • Learning Rate: 0.0005
  • Weight Decay: 0.01
  • Batch Size: 64 (training), 128 (evaluation)
  • Number of Epochs: 2
  • Save Steps: 500
  • Max Save Checkpoints: 2
  • Evaluation Strategy: At the end of each epoch
  • Logging Strategy: No logging
  • Use of FP16: No
  • Gradient Accumulation Steps: 2
  • Reporting: None
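
A minimal sketch of how these settings map onto Hugging Face Seq2SeqTrainingArguments is shown below; output_dir and predict_with_generate are illustrative assumptions not stated in the card.

from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="ke-t5-base-en-to-ko",   # hypothetical output directory
    learning_rate=5e-4,                 # 0.0005
    weight_decay=0.01,
    per_device_train_batch_size=64,
    per_device_eval_batch_size=128,
    num_train_epochs=2,
    save_steps=500,
    save_total_limit=2,                 # keep at most 2 checkpoints
    evaluation_strategy="epoch",        # evaluate at the end of each epoch
    logging_strategy="no",
    fp16=False,
    gradient_accumulation_steps=2,
    report_to="none",
    predict_with_generate=True,         # assumption: generate outputs for BLEU during eval
)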

Hardware

The training was performed on a single GPU system with an NVIDIA A100 (40GB).

Performance

The model achieved the following BLEU scores during training:

  • Epoch 1: 18.006119
  • Epoch 2: 18.838066
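
For reference, scores like these are typically computed with SacreBLEU. The snippet below is an illustrative sketch using the evaluate library, not the exact evaluation script behind the numbers above; the prediction and reference strings are hypothetical.

import evaluate

# SacreBLEU expects detokenized predictions and (possibly multiple) references
bleu = evaluate.load("sacrebleu")
predictions = ["이것은 샘플 텍스트입니다."]     # hypothetical model output
references = [["이것은 샘플 텍스트입니다."]]    # hypothetical gold Korean reference
print(bleu.compute(predictions=predictions, references=references)["score"])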

Usage

This model is suitable for applications that translate English text into Korean. Here is an example of how to use it with Hugging Face Transformers:

from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Load the fine-tuned model and its tokenizer from the Hugging Face Hub
model = AutoModelForSeq2SeqLM.from_pretrained("traintogpb-ke-t5-base-aihub-koen-translation-integrated-10m-en-to-ko")
tokenizer = AutoTokenizer.from_pretrained("traintogpb-ke-t5-base-aihub-koen-translation-integrated-10m-en-to-ko")

# Tokenize the English input and generate the Korean translation
inputs = tokenizer.encode("This is a sample text.", return_tensors="pt")
outputs = model.generate(inputs)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
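
Alternatively, the model can be wrapped in a text2text-generation pipeline. This is a minimal sketch that reuses the repository id from the example above; max_length=64 is an illustrative choice, not a value taken from the card.

from transformers import pipeline

# Build a text2text-generation pipeline around the fine-tuned checkpoint
translator = pipeline(
    "text2text-generation",
    model="traintogpb-ke-t5-base-aihub-koen-translation-integrated-10m-en-to-ko",
)

# max_length is an illustrative upper bound on the generated Korean output
print(translator("This is a sample text.", max_length=64))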

Model Size

The model has approximately 247M parameters, stored as FP32 tensors in safetensors format.

Evaluation results

  • BLEU on AIHub KO-EN Translation Integrated Base (10M): 18.838 (self-reported, epoch 2)
  • BLEU on AIHub KO-EN Translation Integrated Base (10M): 18.006 (self-reported, epoch 1)