---
language:
- ko
license: apache-2.0
base_model: openai/whisper-base
tags:
- hf-asr-leaderboard
- generated_from_trainer
datasets:
- INo0121/low_quality_call_voice
model-index:
- name: Whisper Base for Korean Low Quality Call Voices
  results: []
---
<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->
# Whisper Base for Korean Low Quality Call Voices
This model is a fine-tuned version of [openai/whisper-base](https://huggingface.co/openai/whisper-base) on the Korean Low Quality Call Voices dataset.
It achieves the following results on the evaluation set at the final training step (8000):
- Loss: 0.4941
- CER: 30.7538
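The CER figure above is a character error rate. The sketch below shows how such a score is commonly computed, assuming a corpus-level Levenshtein distance over characters reported as a percentage. The helper names `edit_distance` and `cer_percent` are hypothetical, and this card does not state the text normalization (spacing, punctuation) or the exact metric implementation used during evaluation.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two strings, counted in characters."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # delete a reference character
                curr[j - 1] + 1,     # insert a hypothesis character
                prev[j - 1] + cost,  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


def cer_percent(references, hypotheses) -> float:
    """Total character edits divided by total reference characters, times 100."""
    edits = sum(edit_distance(r, h) for r, h in zip(references, hypotheses))
    chars = sum(len(r) for r in references)
    return 100.0 * edits / chars


# Illustrative strings only (not taken from the dataset):
# one wrong character out of five reference characters.
example = cer_percent(["안녕하세요"], ["안녕하세오"])
```

Under this definition, a CER of about 30 means roughly three character edits per ten reference characters.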
## Model description
This model was fine-tuned for project use.
Starting from OpenAI's Whisper-Base model, it was fine-tuned to improve transcription accuracy on Korean low-quality voice call data.
The training data is a subset of AI-HUB's "low-quality telephone network voice recognition data":
240,771.06 seconds of audio (about 5.296 seconds per file on average)
and 1,696,414 characters of transcript text in total.
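The figures above imply a few derived quantities that help size the dataset. The short calculation below is only a sketch: the clip count is an estimate, because the card gives a rounded average length rather than the number of files, and the variable names are chosen for illustration.

```python
# Values stated in this model card.
TOTAL_AUDIO_SECONDS = 240_771.06
MEAN_CLIP_SECONDS = 5.296      # rounded average, so derived counts are approximate
TOTAL_CHARACTERS = 1_696_414

# Derived estimates (not stated in the card).
approx_clips = round(TOTAL_AUDIO_SECONDS / MEAN_CLIP_SECONDS)
total_hours = TOTAL_AUDIO_SECONDS / 3600
chars_per_second = TOTAL_CHARACTERS / TOTAL_AUDIO_SECONDS
chars_per_clip = TOTAL_CHARACTERS / approx_clips

print(f"~{approx_clips:,} clips, {total_hours:.2f} hours of audio")
print(f"~{chars_per_second:.2f} characters per second, ~{chars_per_clip:.1f} per clip")
```

This works out to roughly 45,000 short clips and about 67 hours of audio, with around 7 transcript characters per second.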
## Intended uses & limitations
Both the base model and the dataset used for fine-tuning were used for learning purposes,
so this model may likewise be used for learning purposes only.
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 1e-05
- train_batch_size: 16
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 500
- training_steps: 8000
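To make the `linear` scheduler setting concrete, the sketch below reproduces the usual shape of a linear schedule with warmup using the values listed above: the learning rate rises linearly from 0 to 1e-05 over the first 500 steps, then decays linearly to 0 at step 8000. It is a plain-Python illustration with a hypothetical function name, not the training code used for this model.

```python
BASE_LR = 1e-5        # learning_rate
WARMUP_STEPS = 500    # lr_scheduler_warmup_steps
TOTAL_STEPS = 8000    # training_steps


def linear_schedule_lr(step: int) -> float:
    """Learning rate at a given optimizer step under linear warmup and linear decay."""
    if step < WARMUP_STEPS:
        # Warmup: scale linearly from 0 up to the base learning rate.
        return BASE_LR * step / WARMUP_STEPS
    # Decay: scale linearly from the base learning rate down to 0.
    remaining = max(0, TOTAL_STEPS - step)
    return BASE_LR * remaining / (TOTAL_STEPS - WARMUP_STEPS)


# Learning rate at each evaluation step reported in the results table.
for step in range(1000, TOTAL_STEPS + 1, 1000):
    print(step, f"{linear_schedule_lr(step):.2e}")
```

With these settings the rate has already dropped to about one third of its peak by step 5500, which is consistent with the small changes in validation loss late in training.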
### Training results
| Training Loss | Epoch | Step | Validation Loss | Cer |
|:-------------:|:-----:|:----:|:---------------:|:-------:|
| 0.6416 | 0.44 | 1000 | 0.6564 | 64.1489 |
| 0.5914 | 0.88 | 2000 | 0.5688 | 37.4957 |
| 0.435 | 1.32 | 3000 | 0.5349 | 32.6734 |
| 0.4056 | 1.76 | 4000 | 0.5124 | 30.9065 |
| 0.3368 | 2.2 | 5000 | 0.5057 | 32.6925 |
| 0.3107 | 2.64 | 6000 | 0.4979 | 32.8315 |
| 0.3016 | 3.08 | 7000 | 0.4947 | 29.3060 |
| 0.2979 | 3.52 | 8000 | 0.4941 | 30.7538 |
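Validation loss and CER do not reach their minimum at the same step in the table above. The sketch below copies the table rows into a list and picks the best row under each criterion. It only illustrates how to read the table and does not indicate which checkpoint was actually published.

```python
# (step, validation_loss, cer) rows copied from the training results table.
rows = [
    (1000, 0.6564, 64.1489),
    (2000, 0.5688, 37.4957),
    (3000, 0.5349, 32.6734),
    (4000, 0.5124, 30.9065),
    (5000, 0.5057, 32.6925),
    (6000, 0.4979, 32.8315),
    (7000, 0.4947, 29.3060),
    (8000, 0.4941, 30.7538),
]

best_by_loss = min(rows, key=lambda row: row[1])
best_by_cer = min(rows, key=lambda row: row[2])

print("lowest validation loss:", best_by_loss)  # step 8000
print("lowest CER:", best_by_cer)               # step 7000
```

The lowest validation loss is at step 8000, while the lowest CER (29.3060) is at step 7000, so the choice of checkpoint depends on which metric matters for the application.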
### Framework versions
- Transformers 4.34.0.dev0
- Pytorch 2.0.1+cu118
- Datasets 2.14.5
- Tokenizers 0.13.3