Whisper Turbo SQ
This model is a fine-tuned version of openai/whisper-large-v3-turbo on the Albanian (sq) subset of the Common Voice 19.0 dataset. It achieves the following results on the evaluation set (a WER computation sketch follows the list):
- Loss: 0.0880
- WER: 7.0282
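WER here is the standard word error rate, reported as a percentage. A minimal sketch of how such a score is conventionally computed with the Hugging Face `evaluate` library; the exact evaluation script used for this model is not documented here, and the sample transcripts below are placeholders:

```python
import evaluate

# Load the standard WER metric (word-level edit distance).
wer_metric = evaluate.load("wer")

# Placeholder transcripts; in practice these come from running the model
# on the Common Voice evaluation split.
predictions = ["sot është një ditë e bukur"]
references = ["sot është një ditë shumë e bukur"]

# compute() returns a fraction; the card reports WER * 100.
wer = 100 * wer_metric.compute(predictions=predictions, references=references)
print(f"WER: {wer:.4f}")
```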
Model description
You can read more about the base model in the openai/whisper-large-v3-turbo model card.
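As a quick start, here is a minimal inference sketch using the `transformers` ASR pipeline, assuming this model is published as Kushtrim/whisper-large-v3-turbo-sq-test1 on the Hub; the audio path, device, and dtype choices are assumptions:

```python
import torch
from transformers import pipeline

# Load the fine-tuned checkpoint; use GPU + fp16 when available.
asr = pipeline(
    "automatic-speech-recognition",
    model="Kushtrim/whisper-large-v3-turbo-sq-test1",
    torch_dtype=torch.float16 if torch.cuda.is_available() else torch.float32,
    device=0 if torch.cuda.is_available() else -1,
)

# Transcribe a local file (path is a placeholder); force Albanian decoding.
result = asr("audio.mp3", generate_kwargs={"language": "albanian"})
print(result["text"])
```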
Performance and Limitations
The Whisper Large V3 Turbo SQ model improves on the pretrained checkpoint, reaching a WER of 7.03 on the evaluation set. The results are not yet optimal, though: the main challenge is the scarcity of high-quality Albanian speech data. This model is meant to show that broader community participation and voice donations can significantly improve performance; a larger and more diverse dataset is essential for top-tier results. You can contribute to this effort by visiting the Mozilla Common Voice website.
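For reference, a sketch of loading the Albanian portion of the training corpus with the `datasets` library. This assumes the `mozilla-foundation/common_voice_19_0` Hub repository and its `"sq"` config; the dataset is gated, so you must accept its terms and be authenticated:

```python
from datasets import load_dataset, Audio

# Albanian ("sq") subset of Common Voice 19.0; requires accepting the
# dataset's terms on the Hub and logging in (huggingface-cli login).
cv = load_dataset("mozilla-foundation/common_voice_19_0", "sq")

# Whisper's feature extractor expects 16 kHz audio.
cv = cv.cast_column("audio", Audio(sampling_rate=16_000))
print(cv)
```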
Training procedure
Training hyperparameters
The following hyperparameters were used during training (a configuration sketch follows the list):
- learning_rate: 1e-05
- train_batch_size: 16
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 500
- num_epochs: 4
- mixed_precision_training: Native AMP
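The list above maps onto a `Seq2SeqTrainingArguments` configuration roughly like the sketch below. The `output_dir`, the 250-step evaluation and save cadence, and `predict_with_generate` are assumptions (the cadence is inferred from the results table that follows); Adam's betas and epsilon match the Trainer defaults, so they are not set explicitly:

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="./whisper-large-v3-turbo-sq",  # assumed
    learning_rate=1e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=42,
    lr_scheduler_type="linear",
    warmup_steps=500,
    num_train_epochs=4,
    fp16=True,                   # Native AMP mixed precision
    eval_strategy="steps",       # assumed; the table logs eval every 250 steps
    eval_steps=250,
    save_steps=250,              # assumed to match the eval cadence
    predict_with_generate=True,  # decode with generate() so WER can be computed
)
```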
Training results
| Training Loss | Epoch  | Step | Validation Loss | WER     |
|:-------------:|:------:|:----:|:---------------:|:-------:|
| 0.3481        | 0.4237 | 250  | 0.3336          | 28.2712 |
| 0.2877        | 0.8475 | 500  | 0.2711          | 26.1243 |
| 0.1694        | 1.2712 | 750  | 0.2183          | 19.8079 |
| 0.1405        | 1.6949 | 1000 | 0.1722          | 15.8757 |
| 0.0623        | 2.1186 | 1250 | 0.1430          | 13.7514 |
| 0.0594        | 2.5424 | 1500 | 0.1238          | 12.4633 |
| 0.0379        | 2.9661 | 1750 | 0.1054          | 9.6949  |
| 0.0170        | 3.3898 | 2000 | 0.0968          | 8.3051  |
| 0.0133        | 3.8136 | 2250 | 0.0880          | 7.0282  |
Framework versions
- Transformers 4.45.1
- PyTorch 2.4.0+cu121
- Datasets 3.0.1
- Tokenizers 0.20.0