---
license: cc-by-nc-4.0
base_model: davidberenstein1957/ultra-feedback-dutch-cleaned-hq-spin-geitje-7b-ultra-sft_iter0
tags:
- alignment-handbook
- generated_from_trainer
datasets:
- davidberenstein1957/ultra-feedback-dutch-cleaned-hq_iter0
- davidberenstein1957/ultra-feedback-dutch-cleaned-hq_iter1
model-index:
- name: outputs
  results: []
---

# outputs

This model is a fine-tuned version of [davidberenstein1957/ultra-feedback-dutch-cleaned-hq-spin-geitje-7b-ultra-sft_iter0](https://huggingface.co/davidberenstein1957/ultra-feedback-dutch-cleaned-hq-spin-geitje-7b-ultra-sft_iter0) on the davidberenstein1957/ultra-feedback-dutch-cleaned-hq_iter0 and the davidberenstein1957/ultra-feedback-dutch-cleaned-hq_iter1 datasets.
It achieves the following results on the evaluation set:
- Loss: 0.0380
- Rewards/real: -5.1867
- Rewards/generated: -23.6116
- Rewards/accuracies: 0.9778
- Rewards/margins: 18.4250
- Logps/generated: -690.4515
- Logps/real: -469.2089
- Logits/generated: -1.6815
- Logits/real: -2.1280

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 5e-07
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- gradient_accumulation_steps: 2
- total_train_batch_size: 64
- total_eval_batch_size: 32
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 2
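The training script is not part of this card, so the snippet below is only a rough, hypothetical reconstruction of the list above as Hugging Face `TrainingArguments`; the `output_dir` name is assumed from the model name, and the 4-GPU launch (e.g. via `accelerate`) is implied by the launcher rather than by these arguments:

```python
from transformers import TrainingArguments

# Hypothetical reconstruction of the hyperparameters listed above.
# total_train_batch_size 64 = 8 per device x 4 GPUs x 2 accumulation steps.
training_args = TrainingArguments(
    output_dir="outputs",          # assumed from the model name
    learning_rate=5e-7,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    gradient_accumulation_steps=2,
    adam_beta1=0.9,                # Adam betas=(0.9, 0.999)
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_ratio=0.1,
    num_train_epochs=2,
)
```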
### Training results

| Training Loss | Epoch | Step | Validation Loss | Rewards/real | Rewards/generated | Rewards/accuracies | Rewards/margins | Logps/generated | Logps/real | Logits/generated | Logits/real |
|:-------------:|:-----:|:----:|:---------------:|:------------:|:-----------------:|:------------------:|:---------------:|:---------------:|:----------:|:----------------:|:-----------:|
| 0.591 | 0.04 | 25 | 0.4210 | -0.2501 | -1.0788 | 0.8500 | 0.8287 | -465.1227 | -419.8426 | -2.6984 | -2.7096 |
| 0.2223 | 0.08 | 50 | 0.2173 | -0.5659 | -3.0876 | 0.9176 | 2.5217 | -485.2113 | -423.0011 | -2.6306 | -2.6446 |
| 0.168 | 0.12 | 75 | 0.1532 | -0.7060 | -4.4771 | 0.9435 | 3.7711 | -499.1060 | -424.4022 | -2.5832 | -2.6005 |
| 0.1126 | 0.16 | 100 | 0.1218 | -1.2746 | -6.3162 | 0.9509 | 5.0415 | -517.4969 | -430.0886 | -2.5961 | -2.6118 |
| 0.0854 | 0.21 | 125 | 0.0921 | -1.7944 | -9.0378 | 0.9611 | 7.2433 | -544.7130 | -435.2866 | -2.5534 | -2.5859 |
| 0.0609 | 0.25 | 150 | 0.0738 | -1.6860 | -9.1926 | 0.9639 | 7.5065 | -546.2610 | -434.2025 | -2.5875 | -2.6239 |
| 0.0654 | 0.29 | 175 | 0.0733 | -2.0360 | -9.8189 | 0.9648 | 7.7828 | -552.5237 | -437.7025 | -2.5252 | -2.5698 |
| 0.0814 | 0.33 | 200 | 0.0714 | -2.3341 | -10.2294 | 0.9630 | 7.8952 | -556.6287 | -440.6832 | -2.4634 | -2.5260 |
| 0.0356 | 0.37 | 225 | 0.0698 | -2.6697 | -11.4164 | 0.9667 | 8.7467 | -568.4990 | -444.0394 | -2.4311 | -2.5142 |
| 0.0641 | 0.41 | 250 | 0.0586 | -2.3926 | -12.3053 | 0.9694 | 9.9126 | -577.3877 | -441.2684 | -2.3106 | -2.4202 |
| 0.0442 | 0.45 | 275 | 0.0672 | -2.5170 | -11.9462 | 0.9676 | 9.4293 | -573.7975 | -442.5117 | -2.3880 | -2.4773 |
| 0.0707 | 0.49 | 300 | 0.0540 | -3.8488 | -15.1469 | 0.9667 | 11.2982 | -605.8044 | -455.8299 | -2.2564 | -2.3913 |
| 0.0683 | 0.53 | 325 | 0.0574 | -5.2977 | -18.2377 | 0.9667 | 12.9400 | -636.7123 | -470.3190 | -2.1402 | -2.3222 |
| 0.0339 | 0.58 | 350 | 0.0495 | -3.7486 | -17.2926 | 0.9731 | 13.5439 | -627.2608 | -454.8286 | -2.1701 | -2.3731 |
| 0.0648 | 0.62 | 375 | 0.0537 | -2.4302 | -13.2604 | 0.9722 | 10.8301 | -586.9390 | -441.6444 | -2.3167 | -2.4783 |
| 0.0358 | 0.66 | 400 | 0.0460 | -3.8509 | -17.3389 | 0.9741 | 13.4880 | -627.7241 | -455.8509 | -2.1735 | -2.3874 |
| 0.0532 | 0.7 | 425 | 0.0483 | -4.3261 | -18.2030 | 0.9741 | 13.8769 | -636.3655 | -460.6029 | -2.1550 | -2.3751 |
| 0.0408 | 0.74 | 450 | 0.0567 | -4.8885 | -19.7272 | 0.9741 | 14.8387 | -651.6073 | -466.2276 | -2.2982 | -2.4811 |
| 0.0434 | 0.78 | 475 | 0.0467 | -2.8677 | -16.1120 | 0.9731 | 13.2443 | -615.4548 | -446.0187 | -2.1937 | -2.4242 |
| 0.0194 | 0.82 | 500 | 0.0455 | -3.2473 | -18.4707 | 0.9769 | 15.2234 | -639.0422 | -449.8151 | -2.0107 | -2.3291 |
| 0.0227 | 0.86 | 525 | 0.0543 | -4.5805 | -20.1131 | 0.9750 | 15.5326 | -655.4664 | -463.1471 | -2.2146 | -2.4100 |
| 0.0299 | 0.91 | 550 | 0.0481 | -4.3021 | -20.3869 | 0.9731 | 16.0848 | -658.2037 | -460.3627 | -2.0552 | -2.3301 |
| 0.0218 | 0.95 | 575 | 0.0464 | -4.4619 | -20.3587 | 0.9713 | 15.8967 | -657.9220 | -461.9616 | -1.9225 | -2.2635 |
| 0.0218 | 0.99 | 600 | 0.0451 | -5.3210 | -20.9811 | 0.9722 | 15.6602 | -664.1465 | -470.5517 | -1.9518 | -2.2964 |
| 0.0093 | 1.03 | 625 | 0.0429 | -4.3395 | -19.2716 | 0.9750 | 14.9321 | -647.0515 | -460.7374 | -1.7575 | -2.1708 |
| 0.0173 | 1.07 | 650 | 0.0492 | -4.1317 | -19.0745 | 0.9704 | 14.9428 | -645.0802 | -458.6593 | -1.8155 | -2.1757 |
| 0.0059 | 1.11 | 675 | 0.0449 | -5.7336 | -23.1577 | 0.9713 | 17.4241 | -685.9126 | -474.6784 | -1.6844 | -2.1123 |
| 0.0149 | 1.15 | 700 | 0.0608 | -7.1484 | -26.1989 | 0.9713 | 19.0504 | -716.3237 | -488.8266 | -2.0142 | -2.2748 |
| 0.0105 | 1.19 | 725 | 0.0479 | -4.4948 | -20.2513 | 0.9722 | 15.7564 | -656.8477 | -462.2903 | -2.1674 | -2.3962 |
| 0.032 | 1.23 | 750 | 0.0512 | -5.0950 | -21.3230 | 0.9685 | 16.2280 | -667.5649 | -468.2917 | -2.2426 | -2.4414 |
| 0.0042 | 1.28 | 775 | 0.0462 | -4.0296 | -19.2620 | 0.9704 | 15.2324 | -646.9548 | -457.6381 | -2.2156 | -2.4379 |
| 0.0041 | 1.32 | 800 | 0.0475 | -4.0348 | -19.8410 | 0.9731 | 15.8062 | -652.7453 | -457.6903 | -2.1330 | -2.3843 |
| 0.0075 | 1.36 | 825 | 0.0428 | -4.4696 | -20.8584 | 0.9722 | 16.3888 | -662.9192 | -462.0378 | -2.1122 | -2.3718 |
| 0.004 | 1.4 | 850 | 0.0468 | -6.2822 | -25.6273 | 0.9750 | 19.3451 | -710.6078 | -480.1642 | -1.7240 | -2.1709 |
| 0.0222 | 1.44 | 875 | 0.0584 | -6.0399 | -23.0778 | 0.9759 | 17.0379 | -685.1132 | -477.7408 | -1.6544 | -2.1242 |
| 0.0063 | 1.48 | 900 | 0.0490 | -3.8721 | -19.8020 | 0.9722 | 15.9298 | -652.3550 | -456.0635 | -1.7696 | -2.2026 |
| 0.006 | 1.52 | 925 | 0.0478 | -5.2822 | -23.7504 | 0.9750 | 18.4682 | -691.8392 | -470.1639 | -1.6461 | -2.1239 |
| 0.0169 | 1.56 | 950 | 0.0455 | -4.9375 | -22.9431 | 0.9731 | 18.0057 | -683.7665 | -466.7169 | -1.6890 | -2.1447 |
| 0.0063 | 1.6 | 975 | 0.0449 | -5.9782 | -25.0564 | 0.9741 | 19.0782 | -704.8994 | -477.1242 | -1.5890 | -2.0779 |
| 0.0144 | 1.65 | 1000 | 0.0428 | -5.2622 | -22.9304 | 0.9731 | 17.6682 | -683.6391 | -469.9639 | -1.6262 | -2.0859 |
| 0.0046 | 1.69 | 1025 | 0.0411 | -5.5146 | -24.0845 | 0.9759 | 18.5698 | -695.1800 | -472.4886 | -1.6070 | -2.0934 |
| 0.002 | 1.73 | 1050 | 0.0408 | -5.4174 | -23.7610 | 0.9750 | 18.3436 | -691.9457 | -471.5163 | -1.6779 | -2.1277 |
| 0.0047 | 1.77 | 1075 | 0.0411 | -5.6837 | -24.5512 | 0.9750 | 18.8674 | -699.8467 | -474.1796 | -1.7048 | -2.1412 |
| 0.0077 | 1.81 | 1100 | 0.0404 | -5.8712 | -25.3478 | 0.9759 | 19.4766 | -707.8129 | -476.0543 | -1.6257 | -2.0917 |
| 0.0145 | 1.85 | 1125 | 0.0385 | -5.0758 | -23.2450 | 0.9741 | 18.1692 | -686.7853 | -468.0999 | -1.6509 | -2.1029 |
| 0.0038 | 1.89 | 1150 | 0.0376 | -5.2077 | -23.5236 | 0.9759 | 18.3159 | -689.5715 | -469.4194 | -1.6736 | -2.1249 |
| 0.01 | 1.93 | 1175 | 0.0379 | -5.1247 | -23.3484 | 0.9750 | 18.2238 | -687.8193 | -468.5888 | -1.6969 | -2.1383 |
| 0.0055 | 1.98 | 1200 | 0.0380 | -5.1867 | -23.6116 | 0.9778 | 18.4250 | -690.4515 | -469.2089 | -1.6815 | -2.1280 |

### Framework versions

- Transformers 4.37.0
- Pytorch 2.1.2+cu121
- Datasets 2.14.6
- Tokenizers 0.15.2
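The card itself does not include a usage snippet; the sketch below shows one plausible way to load and prompt the checkpoint with Transformers. The repo id is a placeholder (this card is named only `outputs`), and the chat template is assumed to be inherited from the GEITje-7B-ultra base model:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder: substitute the actual Hub repo id of this checkpoint.
model_id = "path/to/this-checkpoint"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # 7B model; bf16 keeps memory manageable
    device_map="auto",           # requires the `accelerate` package
)

# Chat formatting is assumed to follow the base model's template.
messages = [
    # "What is the capital of the Netherlands?"
    {"role": "user", "content": "Wat is de hoofdstad van Nederland?"},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=128, do_sample=True, temperature=0.7)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```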