---
license: mit
base_model: FacebookAI/xlm-roberta-base
tags:
- generated_from_trainer
metrics:
- accuracy
- precision
- recall
- f1
model-index:
- name: llm-data-textbook-quality-classifer-v1
results: []
datasets:
- kenhktsui/llm-data-quality-tokenized
language:
- en
---
# llm-data-textbook-quality-classifer-v1
This model classifies whether a text is of textbook quality. It can be used as a quality filter for data curation when training an LLM.
Note that textbook quality is a subset of high quality: high-quality text is not necessarily textbook-like. A usage sketch is shown below.
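A minimal usage sketch with the `transformers` text-classification pipeline. The label names and the reading of the score as a quality score are assumptions; check the model's `id2label` config for the actual mapping.
```python
from transformers import pipeline

# Minimal usage sketch. The label semantics are an assumption:
# verify the model's id2label config for the actual label names.
classifier = pipeline(
    "text-classification",
    model="kenhktsui/llm-data-textbook-quality-classifer-v1",
)

texts = [
    "Photosynthesis converts light energy into chemical energy stored in glucose.",
    "click here 2 win a FREE iphone!!!",
]
for text, result in zip(texts, classifier(texts)):
    print(text[:40], result)  # result is a dict like {'label': ..., 'score': ...}
```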
## Benchmark
![image/png](https://cdn-uploads.huggingface.co/production/uploads/60e50ce5350d181892d5a636/US04uiMXJpFLmoG-q7mvZ.png)
|Dataset | Sampling | Average Quality Score |
|--------------------------------------|---|-------------------|
|[nampdn-ai/tiny-textbooks](https://huggingface.co/datasets/nampdn-ai/tiny-textbooks) |First 10,000| 0.8618|
|[nampdn-ai/tiny-orca-textbooks](https://huggingface.co/datasets/nampdn-ai/tiny-orca-textbooks) |First 10,000| 0.8544|
|[SciPhi/textbooks-are-all-you-need-lite](https://huggingface.co/datasets/SciPhi/textbooks-are-all-you-need-lite) |First 10,000| 0.8109|
|[pszemraj/simple_wikipedia_LM](https://huggingface.co/datasets/pszemraj/simple_wikipedia_LM) | Full| 0.5386|
|[mattymchen/refinedweb-3m](https://huggingface.co/datasets/mattymchen/refinedweb-3m)| Full| 0.2951|
|[JeanKaddour/minipile](https://huggingface.co/datasets/JeanKaddour/minipile)| Full | 0.2618|
The classifier's scores align with expectations: the textbook datasets score highest, reflecting the effectiveness of this model; Simple Wikipedia scores lower, since encyclopedic writing is not textbook-style; and raw web data scores lowest. The card does not spell out how the average quality score is computed; a plausible sketch follows.
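This sketch rests on two assumptions not stated in the card: the quality score of a document is the probability the classifier assigns to the textbook-quality class, and the dataset stores its documents in a `text` column.
```python
from datasets import load_dataset
from transformers import pipeline

# Sketch of the benchmark procedure (assumptions: quality score = probability
# of the textbook-quality class; the dataset has a "text" column).
classifier = pipeline(
    "text-classification",
    model="kenhktsui/llm-data-textbook-quality-classifer-v1",
)
sample = load_dataset("nampdn-ai/tiny-textbooks", split="train[:10000]")

scores = []
for out in classifier(sample["text"], truncation=True, batch_size=32):
    # Assumed label convention: treat the non-negative class as textbook
    # quality; verify against the model's id2label mapping.
    p = out["score"] if out["label"] != "LABEL_0" else 1.0 - out["score"]
    scores.append(p)

print(f"Average quality score: {sum(scores) / len(scores):.4f}")
```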
This model is a fine-tuned version of [FacebookAI/xlm-roberta-base](https://huggingface.co/FacebookAI/xlm-roberta-base) on the [kenhktsui/llm-data-quality-tokenized](https://huggingface.co/datasets/kenhktsui/llm-data-quality-tokenized) dataset.
It achieves the following results on the evaluation set:
- Loss: 0.2689
- Accuracy: 0.8833
- Precision: 0.7551
- Recall: 0.7598
- F1: 0.7574
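As a consistency check, the F1 above is the harmonic mean of precision and recall: 2 × 0.7551 × 0.7598 / (0.7551 + 0.7598) ≈ 0.7574.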
## Model description
A binary text-quality classifier built on [FacebookAI/xlm-roberta-base](https://huggingface.co/FacebookAI/xlm-roberta-base): given a document, it predicts whether the document is of textbook quality.
## Intended uses & limitations
Intended as a filter when curating LLM pretraining data: score candidate documents and keep those above a chosen threshold. Since textbook quality is a subset of high quality, high-quality text that is not textbook-like (encyclopedic articles, for example) may score low and be filtered out.
## Training and evaluation data
The model was trained and evaluated on [kenhktsui/llm-data-quality-tokenized](https://huggingface.co/datasets/kenhktsui/llm-data-quality-tokenized).
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training (a mapping onto `TrainingArguments` is sketched after this list):
- learning_rate: 2e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 1
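A sketch of how these hyperparameters map onto `transformers.TrainingArguments`. The output directory and evaluation cadence are assumptions rather than values from the card; the Adam betas and epsilon listed above are the `transformers` defaults, so they need no explicit arguments.
```python
from transformers import TrainingArguments

# Sketch mapping the hyperparameters above onto TrainingArguments.
# output_dir is hypothetical; eval every 500 steps is inferred from
# the step granularity of the results table below.
training_args = TrainingArguments(
    output_dir="llm-data-textbook-quality-classifer-v1",  # hypothetical
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=1,
    evaluation_strategy="steps",
    eval_steps=500,
)
```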
### Training results
| Training Loss | Epoch | Step | Validation Loss | Accuracy | Precision | Recall | F1 |
|:-------------:|:-----:|:-----:|:---------------:|:--------:|:---------:|:------:|:------:|
| 0.4745 | 0.01 | 500 | 0.4327 | 0.8076 | 0.5898 | 0.6493 | 0.6181 |
| 0.4088 | 0.02 | 1000 | 0.4287 | 0.8346 | 0.7870 | 0.4254 | 0.5522 |
| 0.3811 | 0.02 | 1500 | 0.3741 | 0.8286 | 0.6257 | 0.7098 | 0.6651 |
| 0.3762 | 0.03 | 2000 | 0.3413 | 0.85 | 0.7334 | 0.5884 | 0.6529 |
| 0.3647 | 0.04 | 2500 | 0.3852 | 0.8427 | 0.6815 | 0.6460 | 0.6632 |
| 0.3495 | 0.05 | 3000 | 0.3253 | 0.8629 | 0.7385 | 0.6631 | 0.6987 |
| 0.3508 | 0.06 | 3500 | 0.3605 | 0.8335 | 0.6186 | 0.7973 | 0.6967 |
| 0.3342 | 0.06 | 4000 | 0.3273 | 0.8553 | 0.6865 | 0.7298 | 0.7075 |
| 0.341 | 0.07 | 4500 | 0.3320 | 0.8602 | 0.7759 | 0.5863 | 0.6679 |
| 0.3344 | 0.08 | 5000 | 0.3441 | 0.8531 | 0.6964 | 0.6868 | 0.6916 |
| 0.3341 | 0.09 | 5500 | 0.3265 | 0.8536 | 0.6849 | 0.7214 | 0.7027 |
| 0.3319 | 0.1 | 6000 | 0.3266 | 0.8599 | 0.7076 | 0.7085 | 0.7081 |
| 0.3259 | 0.1 | 6500 | 0.3908 | 0.8136 | 0.5736 | 0.8678 | 0.6907 |
| 0.3391 | 0.11 | 7000 | 0.3338 | 0.8642 | 0.7879 | 0.5934 | 0.6770 |
| 0.3207 | 0.12 | 7500 | 0.3035 | 0.8668 | 0.7221 | 0.7227 | 0.7224 |
| 0.3191 | 0.13 | 8000 | 0.3179 | 0.8543 | 0.6730 | 0.7631 | 0.7153 |
| 0.3142 | 0.14 | 8500 | 0.3101 | 0.8679 | 0.7585 | 0.6589 | 0.7052 |
| 0.3195 | 0.14 | 9000 | 0.3433 | 0.8636 | 0.7012 | 0.7515 | 0.7254 |
| 0.3196 | 0.15 | 9500 | 0.3048 | 0.8707 | 0.7506 | 0.6902 | 0.7191 |
| 0.3176 | 0.16 | 10000 | 0.3177 | 0.8597 | 0.6814 | 0.7794 | 0.7271 |
| 0.3218 | 0.17 | 10500 | 0.3212 | 0.8723 | 0.8031 | 0.6193 | 0.6993 |
| 0.3175 | 0.18 | 11000 | 0.3366 | 0.8601 | 0.6871 | 0.7648 | 0.7239 |
| 0.3296 | 0.18 | 11500 | 0.3218 | 0.8526 | 0.6622 | 0.7865 | 0.7190 |
| 0.3249 | 0.19 | 12000 | 0.2926 | 0.8731 | 0.7896 | 0.6418 | 0.7081 |
| 0.3141 | 0.2 | 12500 | 0.3035 | 0.8741 | 0.7683 | 0.6802 | 0.7215 |
| 0.3126 | 0.21 | 13000 | 0.3127 | 0.8659 | 0.7162 | 0.7302 | 0.7231 |
| 0.3204 | 0.22 | 13500 | 0.3456 | 0.8665 | 0.7190 | 0.7277 | 0.7233 |
| 0.3108 | 0.22 | 14000 | 0.3018 | 0.8674 | 0.7269 | 0.7160 | 0.7214 |
| 0.3114 | 0.23 | 14500 | 0.2967 | 0.8726 | 0.8002 | 0.6247 | 0.7016 |
| 0.3071 | 0.24 | 15000 | 0.2904 | 0.8768 | 0.7886 | 0.6643 | 0.7211 |
| 0.2965 | 0.25 | 15500 | 0.3126 | 0.8674 | 0.7117 | 0.7515 | 0.7310 |
| 0.3022 | 0.26 | 16000 | 0.2887 | 0.8738 | 0.7958 | 0.6372 | 0.7077 |
| 0.3101 | 0.26 | 16500 | 0.3312 | 0.8559 | 0.6683 | 0.7923 | 0.7251 |
| 0.3154 | 0.27 | 17000 | 0.3221 | 0.8575 | 0.6685 | 0.8048 | 0.7304 |
| 0.3041 | 0.28 | 17500 | 0.2864 | 0.8754 | 0.7704 | 0.6843 | 0.7248 |
| 0.3093 | 0.29 | 18000 | 0.3101 | 0.8603 | 0.6813 | 0.7844 | 0.7292 |
| 0.3006 | 0.3 | 18500 | 0.3008 | 0.8753 | 0.7999 | 0.6401 | 0.7111 |
| 0.3108 | 0.3 | 19000 | 0.2911 | 0.8689 | 0.7185 | 0.7452 | 0.7316 |
| 0.3071 | 0.31 | 19500 | 0.2839 | 0.8793 | 0.7725 | 0.7039 | 0.7366 |
| 0.3002 | 0.32 | 20000 | 0.3391 | 0.852 | 0.6550 | 0.8090 | 0.7239 |
| 0.301 | 0.33 | 20500 | 0.2896 | 0.8769 | 0.7505 | 0.7289 | 0.7396 |
| 0.3075 | 0.34 | 21000 | 0.2891 | 0.8785 | 0.7595 | 0.7219 | 0.7402 |
| 0.2922 | 0.34 | 21500 | 0.4094 | 0.8393 | 0.6210 | 0.8465 | 0.7164 |
| 0.2973 | 0.35 | 22000 | 0.2962 | 0.8787 | 0.7579 | 0.7260 | 0.7416 |
| 0.2987 | 0.36 | 22500 | 0.2983 | 0.8711 | 0.7119 | 0.7769 | 0.7430 |
| 0.3071 | 0.37 | 23000 | 0.3167 | 0.8739 | 0.7306 | 0.7510 | 0.7407 |
| 0.2846 | 0.38 | 23500 | 0.2901 | 0.8801 | 0.7707 | 0.7118 | 0.7401 |
| 0.2924 | 0.38 | 24000 | 0.3155 | 0.863 | 0.6922 | 0.7719 | 0.7299 |
| 0.2938 | 0.39 | 24500 | 0.2973 | 0.8724 | 0.7290 | 0.7448 | 0.7368 |
| 0.2917 | 0.4 | 25000 | 0.2939 | 0.8772 | 0.7446 | 0.7427 | 0.7436 |
| 0.294 | 0.41 | 25500 | 0.2944 | 0.8772 | 0.7528 | 0.7264 | 0.7394 |
| 0.2979 | 0.42 | 26000 | 0.2819 | 0.8774 | 0.7487 | 0.7356 | 0.7421 |
| 0.2884 | 0.42 | 26500 | 0.2932 | 0.873 | 0.7278 | 0.7515 | 0.7394 |
| 0.2992 | 0.43 | 27000 | 0.3053 | 0.8655 | 0.6872 | 0.8061 | 0.7419 |
| 0.3018 | 0.44 | 27500 | 0.2781 | 0.8788 | 0.7845 | 0.6818 | 0.7296 |
| 0.305 | 0.45 | 28000 | 0.2760 | 0.8785 | 0.7584 | 0.7239 | 0.7408 |
| 0.2918 | 0.46 | 28500 | 0.2826 | 0.8788 | 0.7659 | 0.7123 | 0.7381 |
| 0.2998 | 0.46 | 29000 | 0.2893 | 0.874 | 0.7319 | 0.7490 | 0.7403 |
| 0.2875 | 0.47 | 29500 | 0.2891 | 0.8803 | 0.7675 | 0.7185 | 0.7422 |
| 0.2946 | 0.48 | 30000 | 0.2781 | 0.8798 | 0.7415 | 0.7656 | 0.7534 |
| 0.2907 | 0.49 | 30500 | 0.2860 | 0.8752 | 0.7280 | 0.7656 | 0.7463 |
| 0.2981 | 0.5 | 31000 | 0.3012 | 0.8732 | 0.7276 | 0.7531 | 0.7402 |
| 0.2948 | 0.5 | 31500 | 0.2777 | 0.8792 | 0.7894 | 0.6768 | 0.7288 |
| 0.2933 | 0.51 | 32000 | 0.2839 | 0.8773 | 0.7428 | 0.7469 | 0.7449 |
| 0.2891 | 0.52 | 32500 | 0.2774 | 0.8795 | 0.7678 | 0.7131 | 0.7395 |
| 0.2869 | 0.53 | 33000 | 0.2790 | 0.8764 | 0.7405 | 0.7460 | 0.7432 |
| 0.2907 | 0.54 | 33500 | 0.2889 | 0.8764 | 0.7580 | 0.7118 | 0.7342 |
| 0.2912 | 0.54 | 34000 | 0.2887 | 0.8807 | 0.7464 | 0.7611 | 0.7537 |
| 0.283 | 0.55 | 34500 | 0.2754 | 0.8816 | 0.7847 | 0.6977 | 0.7386 |
| 0.2877 | 0.56 | 35000 | 0.3036 | 0.8727 | 0.7221 | 0.7627 | 0.7418 |
| 0.2923 | 0.57 | 35500 | 0.2853 | 0.8783 | 0.7693 | 0.7035 | 0.7349 |
| 0.2902 | 0.58 | 36000 | 0.2881 | 0.8772 | 0.7462 | 0.7394 | 0.7428 |
| 0.2863 | 0.58 | 36500 | 0.2886 | 0.8768 | 0.7303 | 0.7711 | 0.7501 |
| 0.2837 | 0.59 | 37000 | 0.2753 | 0.8801 | 0.7503 | 0.7494 | 0.7498 |
| 0.3021 | 0.6 | 37500 | 0.2848 | 0.8775 | 0.7330 | 0.7694 | 0.7508 |
| 0.291 | 0.61 | 38000 | 0.2793 | 0.88 | 0.7423 | 0.7652 | 0.7536 |
| 0.2821 | 0.62 | 38500 | 0.2867 | 0.88 | 0.7429 | 0.7640 | 0.7533 |
| 0.2867 | 0.62 | 39000 | 0.2851 | 0.8796 | 0.7367 | 0.7748 | 0.7553 |
| 0.2846 | 0.63 | 39500 | 0.2813 | 0.8828 | 0.7661 | 0.7360 | 0.7507 |
| 0.2836 | 0.64 | 40000 | 0.2842 | 0.8793 | 0.7406 | 0.7644 | 0.7523 |
| 0.2835 | 0.65 | 40500 | 0.2797 | 0.8792 | 0.7382 | 0.7690 | 0.7533 |
| 0.2833 | 0.66 | 41000 | 0.2763 | 0.8821 | 0.7895 | 0.6931 | 0.7382 |
| 0.2743 | 0.66 | 41500 | 0.2852 | 0.8833 | 0.7717 | 0.7289 | 0.7497 |
| 0.2921 | 0.67 | 42000 | 0.2780 | 0.8791 | 0.7561 | 0.7319 | 0.7438 |
| 0.279 | 0.68 | 42500 | 0.2759 | 0.8827 | 0.7882 | 0.6985 | 0.7407 |
| 0.2752 | 0.69 | 43000 | 0.2795 | 0.8796 | 0.7642 | 0.7202 | 0.7415 |
| 0.2902 | 0.7 | 43500 | 0.2735 | 0.8809 | 0.7824 | 0.6972 | 0.7374 |
| 0.2832 | 0.7 | 44000 | 0.2742 | 0.8815 | 0.7690 | 0.7231 | 0.7453 |
| 0.2783 | 0.71 | 44500 | 0.2773 | 0.8815 | 0.7692 | 0.7227 | 0.7452 |
| 0.2879 | 0.72 | 45000 | 0.2716 | 0.8838 | 0.7766 | 0.7235 | 0.7491 |
| 0.2898 | 0.73 | 45500 | 0.2728 | 0.8804 | 0.7513 | 0.7494 | 0.7503 |
| 0.2771 | 0.74 | 46000 | 0.2795 | 0.877 | 0.7370 | 0.7573 | 0.7470 |
| 0.2743 | 0.74 | 46500 | 0.2833 | 0.8707 | 0.7013 | 0.8028 | 0.7486 |
| 0.2868 | 0.75 | 47000 | 0.2719 | 0.8821 | 0.7575 | 0.7477 | 0.7526 |
| 0.2771 | 0.76 | 47500 | 0.2784 | 0.8833 | 0.7636 | 0.7435 | 0.7534 |
| 0.2824 | 0.77 | 48000 | 0.2778 | 0.8772 | 0.7291 | 0.7765 | 0.7520 |
| 0.2819 | 0.78 | 48500 | 0.2772 | 0.8825 | 0.7532 | 0.7585 | 0.7559 |
| 0.2781 | 0.78 | 49000 | 0.2747 | 0.881 | 0.7502 | 0.7552 | 0.7527 |
| 0.2844 | 0.79 | 49500 | 0.2877 | 0.8762 | 0.7215 | 0.7877 | 0.7532 |
| 0.2732 | 0.8 | 50000 | 0.2738 | 0.8809 | 0.7511 | 0.7527 | 0.7519 |
| 0.2681 | 0.81 | 50500 | 0.2832 | 0.8761 | 0.7191 | 0.7932 | 0.7543 |
| 0.2795 | 0.82 | 51000 | 0.2755 | 0.8856 | 0.7876 | 0.7160 | 0.7501 |
| 0.2649 | 0.82 | 51500 | 0.2797 | 0.8805 | 0.7360 | 0.7823 | 0.7584 |
| 0.2776 | 0.83 | 52000 | 0.2671 | 0.8833 | 0.7627 | 0.7452 | 0.7538 |
| 0.2762 | 0.84 | 52500 | 0.2745 | 0.8812 | 0.7416 | 0.7744 | 0.7576 |
| 0.2803 | 0.85 | 53000 | 0.2766 | 0.8847 | 0.7694 | 0.7415 | 0.7551 |
| 0.2675 | 0.86 | 53500 | 0.2742 | 0.8785 | 0.7392 | 0.7623 | 0.7506 |
| 0.2725 | 0.86 | 54000 | 0.2720 | 0.8826 | 0.7576 | 0.7506 | 0.7541 |
| 0.2693 | 0.87 | 54500 | 0.2739 | 0.8836 | 0.7650 | 0.7427 | 0.7537 |
| 0.2745 | 0.88 | 55000 | 0.2751 | 0.8792 | 0.7348 | 0.7765 | 0.7551 |
| 0.273 | 0.89 | 55500 | 0.2762 | 0.8812 | 0.7388 | 0.7807 | 0.7591 |
| 0.2645 | 0.9 | 56000 | 0.2664 | 0.8828 | 0.7647 | 0.7385 | 0.7514 |
| 0.2698 | 0.9 | 56500 | 0.2728 | 0.8814 | 0.7467 | 0.7648 | 0.7557 |
| 0.2771 | 0.91 | 57000 | 0.2681 | 0.8839 | 0.7635 | 0.7473 | 0.7553 |
| 0.2663 | 0.92 | 57500 | 0.2715 | 0.885 | 0.7617 | 0.7573 | 0.7595 |
| 0.2546 | 0.93 | 58000 | 0.2836 | 0.8796 | 0.7323 | 0.7848 | 0.7576 |
| 0.2752 | 0.94 | 58500 | 0.2747 | 0.8801 | 0.7363 | 0.7790 | 0.7570 |
| 0.2645 | 0.94 | 59000 | 0.2733 | 0.8834 | 0.7484 | 0.7740 | 0.7610 |
| 0.2561 | 0.95 | 59500 | 0.2765 | 0.8828 | 0.7508 | 0.7652 | 0.7580 |
| 0.2753 | 0.96 | 60000 | 0.2721 | 0.8815 | 0.7483 | 0.7623 | 0.7552 |
| 0.251 | 0.97 | 60500 | 0.2735 | 0.8822 | 0.7546 | 0.7540 | 0.7543 |
| 0.2742 | 0.98 | 61000 | 0.2721 | 0.8831 | 0.7497 | 0.7694 | 0.7594 |
| 0.2734 | 0.98 | 61500 | 0.2712 | 0.8836 | 0.7512 | 0.7694 | 0.7602 |
| 0.2713 | 0.99 | 62000 | 0.2690 | 0.8836 | 0.7556 | 0.7606 | 0.7581 |
| 0.2764 | 1.0 | 62500 | 0.2689 | 0.8833 | 0.7551 | 0.7598 | 0.7574 |
### Framework versions
- Transformers 4.35.2
- Pytorch 2.1.0+cu121
- Datasets 2.16.1
- Tokenizers 0.15.0