---
language:
- ckb
tags:
- generated_from_trainer
datasets:
- PawanKrd/asr-ckb
metrics:
- wer
model-index:
- name: ASR CKB
  results:
  - task:
      name: Automatic Speech Recognition
      type: automatic-speech-recognition
    dataset:
      name: PawanKrd/asr-ckb
      type: PawanKrd/asr-ckb
    metrics:
    - name: Wer
      type: wer
      value: 4.1303699778079555
---

# Automatic Speech Recognition - CKB

This model is trained on the [PawanKrd/asr-ckb](https://huggingface.co/datasets/PawanKrd/asr-ckb) dataset. This model is specifically for the Central Kurdish (Sorani) language.

## Model Performance

The model achieves the following performance on the evaluation set:

- **Loss**: 0.0048
- **Word Error Rate (WER)**: 4.1304

## Model Description

This Automatic Speech Recognition (ASR) model for Central Kurdish (Sorani) is designed to transcribe spoken Kurdish into written text. It leverages a deep learning architecture optimized for speech-to-text tasks. The model is built using the Transformers library and trained on a diverse set of Central Kurdish audio recordings.

## Intended Uses & Limitations

This model is intended for automatic transcription of Central Kurdish audio. It performs best on clear, high-quality audio recordings. Performance may degrade with noisy backgrounds, strong accents, or atypical pronunciations.

### Intended Uses

- Transcribing interviews and speeches in Central Kurdish.
- Creating subtitles for Kurdish videos.
- Assisting in the documentation and preservation of the Kurdish language.

### Limitations

- Performance may be suboptimal on audio with heavy background noise.
- Strong regional accents or non-standard pronunciations can impact accuracy.
- Not suitable for real-time transcription without further optimization.

## Training and Evaluation Data

The model was trained and evaluated using the [PawanKrd/asr-ckb](https://huggingface.co/datasets/PawanKrd/asr-ckb) dataset, which consists of diverse audio samples in Central Kurdish.
The training process was designed to optimize the model's recognition accuracy for this specific language.

## Training Procedure

### Hyperparameters

- **Learning Rate**: 1e-05
- **Train Batch Size**: 32
- **Eval Batch Size**: 16
- **Seed**: 42
- **Optimizer**: Adam (betas=(0.9, 0.999), epsilon=1e-08)
- **Learning Rate Scheduler**: Linear
- **Warmup Steps**: 500
- **Epochs**: 3

### Training Results

| Training Loss | Epoch  | Step  | Validation Loss | WER     |
|:-------------:|:------:|:-----:|:---------------:|:-------:|
| 0.0966        | 0.1927 | 1000  | 0.1457          | 29.30   |
| 0.0952        | 0.3854 | 2000  | 0.0988          | 22.26   |
| 0.0582        | 0.5780 | 3000  | 0.0741          | 17.51   |
| 0.0523        | 0.7707 | 4000  | 0.0532          | 15.14   |
| 0.0164        | 0.9634 | 5000  | 0.0412          | 14.19   |
| 0.0271        | 1.1561 | 6000  | 0.0519          | 15.68   |
| 0.0358        | 1.3487 | 7000  | 0.0407          | 11.18   |
| 0.0208        | 1.5414 | 8000  | 0.0327          | 9.94    |
| 0.031         | 1.7341 | 9000  | 0.0268          | 10.86   |
| 0.033         | 1.9268 | 10000 | 0.0191          | 7.70    |
| 0.0269        | 2.1195 | 11000 | 0.0138          | 6.48    |
| 0.025         | 2.3121 | 12000 | 0.0111          | 6.83    |
| 0.003         | 2.5048 | 13000 | 0.0086          | 5.78    |
| 0.0021        | 2.6975 | 14000 | 0.0065          | 4.66    |
| 0.0031        | 2.8902 | 15000 | 0.0048          | 4.13    |

### Framework Versions

- **Transformers**: 4.41.0.dev0
- **PyTorch**: 2.3.0+cu121
- **Datasets**: 2.19.1
- **Tokenizers**: 0.19.1

## Example Usage

To use this model for transcription, you can follow the example code below:

```python
from transformers import pipeline

# Load the fine-tuned model
asr_pipeline = pipeline(model="PawanKrd/asr-large-ckb")

# Transcribe audio file
audio_file = "audio.wav"
transcription = asr_pipeline(audio_file)

# Print the transcription
print(transcription["text"])
```

This code demonstrates how to load the model and use it to transcribe an audio file in Central Kurdish.
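
To check transcriptions against your own reference text, the Word Error Rate reported under Model Performance can be approximated with a small helper. The sketch below is not the evaluation script used for this model; it is a minimal illustration that assumes whitespace tokenization, no text normalization, and that the reported WER values are percentages. The function name `word_error_rate` is hypothetical.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER as a percentage: word-level edit distance / reference length * 100.

    Assumption: words are split on whitespace and compared verbatim
    (no lowercasing, punctuation stripping, or other normalization).
    """
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")

    # previous_row[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    previous_row = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        current_row = [i]
        for j, hyp_word in enumerate(hyp_words, start=1):
            substitution = previous_row[j - 1] + (ref_word != hyp_word)
            deletion = previous_row[j] + 1
            insertion = current_row[j - 1] + 1
            current_row.append(min(substitution, deletion, insertion))
        previous_row = current_row

    return 100.0 * previous_row[-1] / len(ref_words)


if __name__ == "__main__":
    # Placeholder tokens stand in for a real reference and model output.
    print(word_error_rate("w1 w2 w3 w4", "w1 wX w3 w4"))
```

In practice you would pass your ground-truth transcript as `reference` and `transcription["text"]` from the pipeline as `hypothesis`. Scores computed this way may differ from the reported figures if the original evaluation applied text normalization.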
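
The learning-rate settings listed under Hyperparameters (peak rate 1e-05, linear scheduler, 500 warmup steps) describe a rate that ramps up linearly and then decays linearly to zero. The sketch below illustrates that shape in plain Python. The total step count of 15570 is an assumption estimated from the Training Results table (step 15000 at epoch 2.8902, extrapolated to 3 epochs), and `linear_warmup_lr` is a hypothetical helper, not part of the training code.

```python
def linear_warmup_lr(
    step: int,
    peak_lr: float = 1e-5,      # "Learning Rate" from the hyperparameter list
    warmup_steps: int = 500,    # "Warmup Steps" from the hyperparameter list
    total_steps: int = 15570,   # ASSUMPTION: estimated from the results table
) -> float:
    """Learning rate at a given optimizer step under linear warmup and linear decay."""
    if step < warmup_steps:
        # Ramp up from 0 to peak_lr over the warmup phase.
        return peak_lr * step / max(1, warmup_steps)
    # Decay linearly from peak_lr at the end of warmup to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    for step in (0, 250, 500, 5000, 10000, 15000):
        print(step, linear_warmup_lr(step))
```

Under these assumptions the rate peaks at step 500 and is already small by the final logged checkpoint at step 15000, which is consistent with the gradual flattening of the validation loss in the last rows of the table.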