adriszmar/whisper-large-v3-turbo-es

Automatic Speech Recognition

Generated from Trainer

Inference Endpoints


adriszmar committed 19 days ago

Commit b32b704 • 1 parent: 461bee2

Update

Files changed (1)

README.md +4 -4

@@ -19,9 +19,9 @@ should probably proofread and complete it, then remove this comment. -->
 # Whisper Large V3 Turbo - Spanish
-This model is a fine-tuned version of [openai/whisper-large-v3-turbo](https://huggingface.co/openai/whisper-large-v3-turbo) on the Common Voice 17.0 dataset.
-The fine-tuning process reduced the Word Error Rate (WER) from 10.18% to 2.69%, emonstrating significant improvement in transcription accuracy for spanish audios.
+This model is a fine-tuned version of [openai/whisper-large-v3-turbo](https://huggingface.co/openai/whisper-large-v3-turbo) on the Common Voice 17.0 dataset - spanish subset.
+The fine-tuning process reduced the Word Error Rate (WER) from 6.91% to 5.34%, demonstrating significant improvement in transcription accuracy for spanish audios.
 ## Model description
@@ -35,8 +35,8 @@ More information needed
 The model was trained using the Common Voice 17.0 dataset - spanish subset (mozilla-foundation/common_voice_17_0). Both the base model, whisper-large-v3-turbo, and the fine-tuned model, whisper-large-v3-turbo-es, were evaluated using Word Error Rate (WER) on the test split of the same dataset. The results are as follows:
-- WER for whisper-large-v3-turbo (base): 10.18%
-- WER for whisper-large-v3-turbo-es (fine-tuned): 2.69%
+- WER for whisper-large-v3-turbo (base): 6.91%
+- WER for whisper-large-v3-turbo-es (fine-tuned): 5.34%
 This significant reduction in WER shows that fine-tuning the model for spanish audio led to improved transcription accuracy compared to the original base model.
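
For readers comparing the WER figures in this diff: WER is the word-level edit distance between a reference transcript and a model transcript, divided by the number of reference words. The sketch below is not the author's evaluation script (the commit does not include one). The `wer` and `relative_reduction` helpers and the Spanish sentences are hypothetical, chosen only to illustrate the metric and the relative change implied by the updated numbers.

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level Levenshtein distance / reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] = edit distance between the reference words processed so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # substitution, or match when cost == 0
            ))
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(base: float, tuned: float) -> float:
    """Fraction by which `tuned` is lower than `base`."""
    return (base - tuned) / base


if __name__ == "__main__":
    # Made-up example pair: one substituted word out of four.
    print(wer("el gato come pescado", "el gato comió pescado"))
    # Updated figures from the README: 6.91% (base) vs 5.34% (fine-tuned),
    # which is a relative reduction of about 22.7%.
    print(f"{relative_reduction(6.91, 5.34):.1%}")
```

Scores computed this way depend on how the text is normalized before comparison (casing, punctuation, numerals), so figures are only comparable when both models are scored with the same preprocessing.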