---
base_model:
- Ultralytics/YOLOv8
pipeline_tag: image-segmentation
license: agpl-3.0
---

## Text line detection from Finnish 19th century Court Records

The model is trained to detect text lines in digitized 19th-century court record documents.

The model has been trained using yolov8x-seg by Ultralytics as the base model.

## Intended uses & limitations

<img src='text_line_example.jpg' width='500'>

Most of the training data consists of handwritten documents, but the model also appears to generalize quite well to typeset text.

## Training data

The training dataset consisted of 4615 digitized and annotated 19th-century court record documents, and the validation dataset contained 574 annotated document images.

## Training procedure

This model was trained using 2 NVIDIA RTX A6000 GPUs with the following hyperparameters:

- image size: 640
- learning rate (lr0): 0.05
- train batch size: 32
- epochs: 100
- patience: 10 epochs
- optimizer: SGD
- scheduler: cosine learning rate scheduler (cos_lr=True)
- workers: 4

Default settings were used for the other training hyperparameters (see the [Ultralytics train settings documentation](https://docs.ultralytics.com/modes/train/#train-settings)).

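As a rough illustration of what a cosine scheduler does to the learning rate over the 100 epochs, the sketch below computes a plain cosine decay starting from `lr0=0.05`. The final learning rate factor `lrf=0.01` is an assumption taken from the Ultralytics defaults, warmup is ignored, and `cosine_lr` is a hypothetical helper written for this example rather than the library's own implementation.

```python
import math


def cosine_lr(epoch, total_epochs=100, lr0=0.05, lrf=0.01):
    """Cosine decay from lr0 down to lr0 * lrf (lrf=0.01 is an assumed default)."""
    # progress goes smoothly from 0 at the first epoch to 1 at the last one
    progress = (1 - math.cos(math.pi * epoch / total_epochs)) / 2
    return lr0 * (1 + progress * (lrf - 1))


# Print the approximate learning rate at a few points of the schedule
for epoch in (0, 25, 50, 75, 100):
    print(epoch, round(cosine_lr(epoch), 5))
```

Under these assumptions the learning rate falls slowly at first, fastest around the middle of training, and flattens out again near the end.
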

Model training was performed using the following code:

```python
from ultralytics import YOLO

# Use pretrained Yolo segmentation model
model = YOLO('yolov8x-seg.pt')

# Path to .yaml file where data location and object classes are defined
yaml_path = 'text_lines.yaml'

# Start model training with the defined parameters
model.train(data=yaml_path, name='model_name', epochs=100, imgsz=640, workers=4, optimizer='SGD', lr0=0.05, seed=551, val=True, cos_lr=True, patience=10, batch=32, device=[0,1])
```

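The contents of `text_lines.yaml` are not included in this card. As a minimal sketch, an Ultralytics dataset file for a single-class segmentation task could be generated as shown below. The dataset root, the image folder names, and the class name `text_line` are all assumptions made for illustration, and `build_dataset_yaml` is a hypothetical helper.

```python
def build_dataset_yaml(root, train_dir, val_dir, class_names):
    """Build the text of an Ultralytics-style dataset YAML file."""
    lines = [
        f"path: {root}",       # dataset root directory (assumed location)
        f"train: {train_dir}",  # training images, relative to path
        f"val: {val_dir}",      # validation images, relative to path
        "names:",
    ]
    # Class indices start at 0; this model has a single class
    lines += [f"  {index}: {name}" for index, name in enumerate(class_names)]
    return "\n".join(lines) + "\n"


# All paths and the class name below are placeholders, not the original values
yaml_text = build_dataset_yaml(
    "datasets/text_lines", "images/train", "images/val", ["text_line"]
)
print(yaml_text)
```

Saving `yaml_text` to a file named `text_lines.yaml` would give the training call a dataset definition in the expected layout.
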
## Evaluation results

Evaluation results using the validation dataset are listed below:

|Class|Images|Class instances|Box precision|Box recall|Box mAP50|Box mAP50-95|Mask precision|Mask recall|Mask mAP50|Mask mAP50-95|
|:----|:----|:----|:----|:----|:----|:----|:----|:----|:----|:----|
|Text line|574|43156|0.912|0.888|0.949|0.701|0.935|0.907|0.954|0.55|

More information on the performance metrics can be found in the [Ultralytics performance metrics guide](https://docs.ultralytics.com/guides/yolo-performance-metrics/).

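The precision and recall values in the table can be combined into a single F1 score, and the instance count gives a sense of how dense the pages are. The short calculation below is derived arithmetic on the reported numbers, not an additional measurement. `f1_score` is a helper defined here for illustration.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Values copied from the evaluation table above
box_f1 = f1_score(0.912, 0.888)
mask_f1 = f1_score(0.935, 0.907)

# Average number of annotated text lines per validation image
lines_per_image = 43156 / 574

print(f"Box F1: {box_f1:.3f}")
print(f"Mask F1: {mask_f1:.3f}")
print(f"Text lines per image: {lines_per_image:.1f}")
```
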
## Inference

If the model file `tuomiokirja_lines_05122023.pt` is saved as `\models\tuomiokirja_lines_05122023.pt` and the input image path is `\data\image.jpg`, inference can be performed using the following code:

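```python
from ultralytics import YOLO

# Initialize model (raw strings keep backslash sequences such as '\t' from being read as escapes)
model = YOLO(r'\models\tuomiokirja_lines_05122023.pt')

# Run inference; save=True writes the annotated image to disk
prediction_results = model.predict(source=r'\data\image.jpg', save=True)
```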

More information on the available inference arguments can be found in the [Ultralytics predict documentation](https://docs.ultralytics.com/modes/predict/#inference-arguments).

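For downstream use, such as passing line images to a text recognizer, the detected lines usually need to be read out of the result object and put in reading order. The sketch below is one possible approach and assumes a single-column page, so that sorting by vertical position approximates reading order. The helpers `reading_key`, `sort_boxes_top_to_bottom`, and `detect_text_lines` are hypothetical names introduced for this example.

```python
def reading_key(box):
    """Sort key for an (x1, y1, x2, y2) box: vertical centre first, then left edge."""
    x1, y1, x2, y2 = box
    return ((y1 + y2) / 2, x1)


def sort_boxes_top_to_bottom(boxes):
    """Return the boxes ordered from the top of the page to the bottom."""
    return sorted(boxes, key=reading_key)


def detect_text_lines(model_path, image_path):
    """Run the model on one image and return boxes and mask polygons, top to bottom."""
    from ultralytics import YOLO  # imported lazily so the helpers above work without it

    result = YOLO(model_path).predict(source=image_path)[0]
    if result.masks is None:  # no text lines were detected
        return [], []
    boxes = [tuple(box) for box in result.boxes.xyxy.tolist()]
    polygons = result.masks.xy  # one array of (x, y) pixel coordinates per text line
    order = sorted(range(len(boxes)), key=lambda i: reading_key(boxes[i]))
    return [boxes[i] for i in order], [polygons[i] for i in order]


# The sorting helper can be tried on made-up boxes without loading the model
example_boxes = [(10, 200, 300, 240), (12, 50, 310, 90), (8, 120, 305, 160)]
print(sort_boxes_top_to_bottom(example_boxes))
```

Calling `detect_text_lines(r'\models\tuomiokirja_lines_05122023.pt', r'\data\image.jpg')` would then return the boxes and polygons for one page. Multi-column layouts would need a more careful ordering step than this sketch provides.
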