base_model_base_tokenizer

This model is a fine-tuned version of t5-base on the code_search_net dataset. It achieves the following results on the evaluation set:

  • Loss: 2.1017
  • Bleu: 0.0744
  • Precisions: [0.37389569483256924, 0.14063645643779682, 0.07580332788787783, 0.045527148854836816]
  • Brevity Penalty: 0.6407
  • Length Ratio: 0.6920
  • Translation Length: 585436
  • Reference Length: 846059
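
As a minimal usage sketch (the usage sections below are otherwise empty): the checkpoint loads like any other seq2seq model. The snippet assumes the model maps source code to a natural-language summary, as in the code_search_net documentation task; whether training used a task prefix is not documented, so feeding raw code as below is an assumption.

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "sc20fg/base_model_base_tokenizer"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# Assumed input format: raw function source, summarized into a docstring-like sentence.
code = "def add(a, b):\n    return a + b"
inputs = tokenizer(code, return_tensors="pt", truncation=True, max_length=512)
summary_ids = model.generate(**inputs, max_new_tokens=64, num_beams=4)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```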

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 20
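
As a rough sketch under the Transformers version listed at the end of the card, these settings correspond to a `Seq2SeqTrainingArguments` like the one below. The `output_dir` and `predict_with_generate` flag are assumptions (BLEU evaluation requires generation), and the Adam betas/epsilon above are the `Trainer` defaults, so they need no explicit arguments.

```python
from transformers import Seq2SeqTrainingArguments

# Assumed reconstruction of the run configuration; output_dir is hypothetical.
training_args = Seq2SeqTrainingArguments(
    output_dir="base_model_base_tokenizer",
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    lr_scheduler_type="linear",   # linear decay, as listed above
    num_train_epochs=20,
    evaluation_strategy="epoch",  # the table below reports one eval per epoch
    predict_with_generate=True,   # assumption: generation is needed to score BLEU
)
```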

Training results

| Training Loss | Epoch | Step | Bleu | Brevity Penalty | Length Ratio | Validation Loss | Precisions | Reference Length | Translation Length |
|:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|
| 2.4273 | 1.0 | 25762 | 0.0665 | 0.6794 | 0.7212 | 2.3438 | [0.34926724858481134, 0.12159425046725157, 0.062078959459937084, 0.03489467043820187] | 846059 | 610166 |
| 2.3512 | 2.0 | 51524 | 0.0733 | 0.7181 | 0.7512 | 2.2643 | [0.3534451290507329, 0.1262343107830303, 0.06531254968421979, 0.03721425521409004] | 846059 | 635564 |
| 2.2525 | 3.0 | 77286 | 0.0691 | 0.6453 | 0.6954 | 2.2234 | [0.36523755211936504, 0.1318932094567742, 0.06891201805888993, 0.03961906221856018] | 846059 | 588313 |
| 2.2252 | 4.0 | 103048 | 0.0726 | 0.7043 | 0.7404 | 2.1949 | [0.3601686933924165, 0.1283373434960897, 0.06578382296859486, 0.0371541685491374] | 846059 | 626462 |
| 2.1523 | 5.0 | 128810 | 0.0703 | 0.6506 | 0.6994 | 2.1769 | [0.3663069159346027, 0.1334874876878427, 0.06959109409366254, 0.040003198275976946] | 846059 | 591706 |
| 2.1027 | 6.0 | 154572 | 0.0650 | 0.5879 | 0.6531 | 2.1585 | [0.37335963586676196, 0.13614151644150174, 0.07119404952304512, 0.04138235959446398] | 846059 | 552545 |
| 2.0458 | 7.0 | 180334 | 0.0682 | 0.6176 | 0.6748 | 2.1491 | [0.37062538973004405, 0.1355146147678402, 0.07123664846902444, 0.04155352506292986] | 846059 | 570908 |
| 2.0594 | 8.0 | 206096 | 0.0702 | 0.6407 | 0.6919 | 2.1403 | [0.3700899171204657, 0.13524405355792343, 0.07062960711230036, 0.04081911815137772] | 846059 | 585428 |
| 2.0459 | 9.0 | 231858 | 0.0635 | 0.5682 | 0.6388 | 2.1327 | [0.37916909499625345, 0.13810659289354987, 0.07176079868122479, 0.04160453545539102] | 846059 | 540495 |
| 2.0029 | 10.0 | 257620 | 0.0684 | 0.6128 | 0.6713 | 2.1264 | [0.3745439691237164, 0.13731087325347474, 0.07204645620574554, 0.04194087964799725] | 846059 | 567944 |
| 2.0107 | 11.0 | 283382 | 0.0697 | 0.6139 | 0.6721 | 2.1202 | [0.37538600600727345, 0.13908031254002817, 0.07356968494927149, 0.04326375560457764] | 846059 | 568644 |
| 1.995 | 12.0 | 309144 | 0.0790 | 0.7220 | 0.7543 | 2.1192 | [0.3595232536092102, 0.1336969667453998, 0.07124298456393582, 0.04192048242921579] | 846059 | 638159 |
| 1.9653 | 13.0 | 334906 | 0.0750 | 0.6727 | 0.7161 | 2.1158 | [0.3663186076760047, 0.13635359040297698, 0.07246562633002641, 0.04279559846361466] | 846059 | 605836 |
| 1.9811 | 14.0 | 360668 | 0.0718 | 0.6325 | 0.6858 | 2.1096 | [0.37342310979981247, 0.13867710694415825, 0.0736328303569596, 0.043440268414579084] | 846059 | 580256 |
| 1.9745 | 15.0 | 386430 | 0.0741 | 0.6592 | 0.7059 | 2.1060 | [0.36869699176985743, 0.13724429728380805, 0.07301699268383118, 0.04318353520566863] | 846059 | 597195 |
| 1.939 | 16.0 | 412192 | 0.0706 | 0.6166 | 0.6740 | 2.1063 | [0.37537898781101553, 0.13979047848408885, 0.0742785001701673, 0.04399835661136439] | 846059 | 570269 |
| 1.9177 | 17.0 | 437954 | 0.0757 | 0.6671 | 0.7118 | 2.1063 | [0.37017425883954735, 0.13833476986726426, 0.07389756751525232, 0.04386076232849102] | 846059 | 602265 |
| 1.9265 | 18.0 | 463716 | 0.0717 | 0.6192 | 0.6760 | 2.1016 | [0.37650650333865443, 0.14089062050951845, 0.075366455530664, 0.045028150012067114] | 846059 | 571937 |
| 1.9622 | 19.0 | 489478 | 0.0730 | 0.6288 | 0.6831 | 2.1022 | [0.3746837721013452, 0.1407333566053557, 0.07570910522025132, 0.045477562304123496] | 846059 | 577906 |
| 1.9171 | 20.0 | 515240 | 0.0744 | 0.6407 | 0.6920 | 2.1017 | [0.37389569483256924, 0.14063645643779682, 0.07580332788787783, 0.045527148854836816] | 846059 | 585436 |
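
For reference, the reported scores are consistent with the standard corpus-BLEU decomposition (brevity penalty times the geometric mean of the 1-4-gram precisions); recomputing from the epoch-20 row reproduces the headline Bleu:

```python
import math

# Epoch-20 values from the table above.
precisions = [0.37389569483256924, 0.14063645643779682,
              0.07580332788787783, 0.045527148854836816]
translation_length, reference_length = 585436, 846059

# Brevity penalty: exp(1 - ref/hyp), applied because the output is shorter than the reference.
bp = math.exp(1 - reference_length / translation_length)   # ~0.6407

# BLEU = brevity penalty * geometric mean of the n-gram precisions.
bleu = bp * math.exp(sum(math.log(p) for p in precisions) / len(precisions))
print(f"{bleu:.4f}")   # ~0.0744
```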

Framework versions

  • Transformers 4.37.2
  • Pytorch 2.2.0+cu121
  • Datasets 2.17.0
  • Tokenizers 0.15.2
