base_model_base_tokenizer

This model is a fine-tuned version of t5-base on the code_search_net dataset. It achieves the following results on the evaluation set:

  • Loss: 2.1017
  • Bleu: 0.0744
  • Precisions: [0.37389569483256924, 0.14063645643779682, 0.07580332788787783, 0.045527148854836816]
  • Brevity Penalty: 0.6407
  • Length Ratio: 0.6920
  • Translation Length: 585436
  • Reference Length: 846059
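
As a minimal usage sketch (the usage sections below are otherwise empty): the checkpoint loads like any other seq2seq model. The snippet assumes the model maps source code to a natural-language summary, as in the code_search_net documentation task; whether training used a task prefix is not documented, so feeding raw code as below is an assumption.

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "sc20fg/base_model_base_tokenizer"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# Assumed input format: raw function source, summarized into a docstring-like sentence.
code = "def add(a, b):\n    return a + b"
inputs = tokenizer(code, return_tensors="pt", truncation=True, max_length=512)
summary_ids = model.generate(**inputs, max_new_tokens=64, num_beams=4)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```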

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 20
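
As a rough sketch under the Transformers version listed at the end of the card, these settings correspond to a `Seq2SeqTrainingArguments` like the one below. The `output_dir` and `predict_with_generate` flag are assumptions (BLEU evaluation requires generation), and the Adam betas/epsilon above are the `Trainer` defaults, so they need no explicit arguments.

```python
from transformers import Seq2SeqTrainingArguments

# Assumed reconstruction of the run configuration; output_dir is hypothetical.
training_args = Seq2SeqTrainingArguments(
    output_dir="base_model_base_tokenizer",
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    lr_scheduler_type="linear",   # linear decay, as listed above
    num_train_epochs=20,
    evaluation_strategy="epoch",  # the table below reports one eval per epoch
    predict_with_generate=True,   # assumption: generation is needed to score BLEU
)
```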

Training results

| Training Loss | Epoch | Step | Bleu | Brevity Penalty | Length Ratio | Validation Loss | Precisions | Reference Length | Translation Length |
|:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|
| 2.4273 | 1.0 | 25762 | 0.0665 | 0.6794 | 0.7212 | 2.3438 | [0.34926724858481134, 0.12159425046725157, 0.062078959459937084, 0.03489467043820187] | 846059 | 610166 |
| 2.3512 | 2.0 | 51524 | 0.0733 | 0.7181 | 0.7512 | 2.2643 | [0.3534451290507329, 0.1262343107830303, 0.06531254968421979, 0.03721425521409004] | 846059 | 635564 |
| 2.2525 | 3.0 | 77286 | 0.0691 | 0.6453 | 0.6954 | 2.2234 | [0.36523755211936504, 0.1318932094567742, 0.06891201805888993, 0.03961906221856018] | 846059 | 588313 |
| 2.2252 | 4.0 | 103048 | 0.0726 | 0.7043 | 0.7404 | 2.1949 | [0.3601686933924165, 0.1283373434960897, 0.06578382296859486, 0.0371541685491374] | 846059 | 626462 |
| 2.1523 | 5.0 | 128810 | 0.0703 | 0.6506 | 0.6994 | 2.1769 | [0.3663069159346027, 0.1334874876878427, 0.06959109409366254, 0.040003198275976946] | 846059 | 591706 |
| 2.1027 | 6.0 | 154572 | 0.0650 | 0.5879 | 0.6531 | 2.1585 | [0.37335963586676196, 0.13614151644150174, 0.07119404952304512, 0.04138235959446398] | 846059 | 552545 |
| 2.0458 | 7.0 | 180334 | 0.0682 | 0.6176 | 0.6748 | 2.1491 | [0.37062538973004405, 0.1355146147678402, 0.07123664846902444, 0.04155352506292986] | 846059 | 570908 |
| 2.0594 | 8.0 | 206096 | 0.0702 | 0.6407 | 0.6919 | 2.1403 | [0.3700899171204657, 0.13524405355792343, 0.07062960711230036, 0.04081911815137772] | 846059 | 585428 |
| 2.0459 | 9.0 | 231858 | 0.0635 | 0.5682 | 0.6388 | 2.1327 | [0.37916909499625345, 0.13810659289354987, 0.07176079868122479, 0.04160453545539102] | 846059 | 540495 |
| 2.0029 | 10.0 | 257620 | 0.0684 | 0.6128 | 0.6713 | 2.1264 | [0.3745439691237164, 0.13731087325347474, 0.07204645620574554, 0.04194087964799725] | 846059 | 567944 |
| 2.0107 | 11.0 | 283382 | 0.0697 | 0.6139 | 0.6721 | 2.1202 | [0.37538600600727345, 0.13908031254002817, 0.07356968494927149, 0.04326375560457764] | 846059 | 568644 |
| 1.995 | 12.0 | 309144 | 0.0790 | 0.7220 | 0.7543 | 2.1192 | [0.3595232536092102, 0.1336969667453998, 0.07124298456393582, 0.04192048242921579] | 846059 | 638159 |
| 1.9653 | 13.0 | 334906 | 0.0750 | 0.6727 | 0.7161 | 2.1158 | [0.3663186076760047, 0.13635359040297698, 0.07246562633002641, 0.04279559846361466] | 846059 | 605836 |
| 1.9811 | 14.0 | 360668 | 0.0718 | 0.6325 | 0.6858 | 2.1096 | [0.37342310979981247, 0.13867710694415825, 0.0736328303569596, 0.043440268414579084] | 846059 | 580256 |
| 1.9745 | 15.0 | 386430 | 0.0741 | 0.6592 | 0.7059 | 2.1060 | [0.36869699176985743, 0.13724429728380805, 0.07301699268383118, 0.04318353520566863] | 846059 | 597195 |
| 1.939 | 16.0 | 412192 | 0.0706 | 0.6166 | 0.6740 | 2.1063 | [0.37537898781101553, 0.13979047848408885, 0.0742785001701673, 0.04399835661136439] | 846059 | 570269 |
| 1.9177 | 17.0 | 437954 | 0.0757 | 0.6671 | 0.7118 | 2.1063 | [0.37017425883954735, 0.13833476986726426, 0.07389756751525232, 0.04386076232849102] | 846059 | 602265 |
| 1.9265 | 18.0 | 463716 | 0.0717 | 0.6192 | 0.6760 | 2.1016 | [0.37650650333865443, 0.14089062050951845, 0.075366455530664, 0.045028150012067114] | 846059 | 571937 |
| 1.9622 | 19.0 | 489478 | 0.0730 | 0.6288 | 0.6831 | 2.1022 | [0.3746837721013452, 0.1407333566053557, 0.07570910522025132, 0.045477562304123496] | 846059 | 577906 |
| 1.9171 | 20.0 | 515240 | 0.0744 | 0.6407 | 0.6920 | 2.1017 | [0.37389569483256924, 0.14063645643779682, 0.07580332788787783, 0.045527148854836816] | 846059 | 585436 |
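
For reference, the reported scores are consistent with the standard corpus-BLEU decomposition (brevity penalty times the geometric mean of the 1-4-gram precisions); recomputing from the epoch-20 row reproduces the headline Bleu:

```python
import math

# Epoch-20 values from the table above.
precisions = [0.37389569483256924, 0.14063645643779682,
              0.07580332788787783, 0.045527148854836816]
translation_length, reference_length = 585436, 846059

# Brevity penalty: exp(1 - ref/hyp), applied because the output is shorter than the reference.
bp = math.exp(1 - reference_length / translation_length)   # ~0.6407

# BLEU = brevity penalty * geometric mean of the n-gram precisions.
bleu = bp * math.exp(sum(math.log(p) for p in precisions) / len(precisions))
print(f"{bleu:.4f}")   # ~0.0744
```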

Framework versions

  • Transformers 4.37.2
  • Pytorch 2.2.0+cu121
  • Datasets 2.17.0
  • Tokenizers 0.15.2
