2023-10-27 15:57:04,764 ----------------------------------------------------------------------------------------------------
2023-10-27 15:57:04,765 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): XLMRobertaModel(
      (embeddings): XLMRobertaEmbeddings(
        (word_embeddings): Embedding(250003, 1024)
        (position_embeddings): Embedding(514, 1024, padding_idx=1)
        (token_type_embeddings): Embedding(1, 1024)
        (LayerNorm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): XLMRobertaEncoder(
        (layer): ModuleList(
          (0-23): 24 x XLMRobertaLayer(
            (attention): XLMRobertaAttention(
              (self): XLMRobertaSelfAttention(
                (query): Linear(in_features=1024, out_features=1024, bias=True)
                (key): Linear(in_features=1024, out_features=1024, bias=True)
                (value): Linear(in_features=1024, out_features=1024, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): XLMRobertaSelfOutput(
                (dense): Linear(in_features=1024, out_features=1024, bias=True)
                (LayerNorm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): XLMRobertaIntermediate(
              (dense): Linear(in_features=1024, out_features=4096, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): XLMRobertaOutput(
              (dense): Linear(in_features=4096, out_features=1024, bias=True)
              (LayerNorm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): XLMRobertaPooler(
        (dense): Linear(in_features=1024, out_features=1024, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=1024, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-27 15:57:04,765 ----------------------------------------------------------------------------------------------------
2023-10-27 15:57:04,765 Corpus: 14903 train + 3449 dev + 3658 test sentences
2023-10-27 15:57:04,765 ----------------------------------------------------------------------------------------------------
2023-10-27 15:57:04,765 Train: 14903 sentences
2023-10-27 15:57:04,766 (train_with_dev=False, train_with_test=False)
2023-10-27 15:57:04,766 ----------------------------------------------------------------------------------------------------
2023-10-27 15:57:04,766 Training Params:
2023-10-27 15:57:04,766  - learning_rate: "5e-06"
2023-10-27 15:57:04,766  - mini_batch_size: "4"
2023-10-27 15:57:04,766  - max_epochs: "10"
2023-10-27 15:57:04,766  - shuffle: "True"
2023-10-27 15:57:04,766 ----------------------------------------------------------------------------------------------------
2023-10-27 15:57:04,766 Plugins:
2023-10-27 15:57:04,766  - TensorboardLogger
2023-10-27 15:57:04,766  - LinearScheduler | warmup_fraction: '0.1'
2023-10-27 15:57:04,766 ----------------------------------------------------------------------------------------------------
2023-10-27 15:57:04,766 Final evaluation on model from best epoch (best-model.pt)
2023-10-27 15:57:04,766  - metric: "('micro avg', 'f1-score')"
2023-10-27 15:57:04,766 ----------------------------------------------------------------------------------------------------
2023-10-27 15:57:04,766 Computation:
2023-10-27 15:57:04,766  - compute on device: cuda:0
2023-10-27 15:57:04,766  - embedding storage: none
2023-10-27 15:57:04,766 ----------------------------------------------------------------------------------------------------
2023-10-27 15:57:04,766 Model training base path: "flair-clean-conll-lr5e-06-bs4-2"
2023-10-27 15:57:04,766 ----------------------------------------------------------------------------------------------------
2023-10-27 15:57:04,766 ----------------------------------------------------------------------------------------------------
2023-10-27 15:57:04,766 Logging anything other than scalars to TensorBoard is currently not supported.
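The LinearScheduler plugin with warmup_fraction 0.1 ramps the learning rate linearly from 0 to the peak 5e-06 over the first 10% of all training steps (here roughly the first of the 10 epochs, i.e. ~3726 of 37260 mini-batches), then decays it linearly back to 0; this matches the lr column in the epoch logs, which rises through epoch 1 and reaches 0.000000 by epoch 10. A minimal sketch of that schedule, assuming this standard warmup-then-linear-decay form (the standalone function is our own; Flair applies the schedule internally per optimizer step):

```python
def linear_warmup_lr(step: int, total_steps: int,
                     peak_lr: float = 5e-6,
                     warmup_fraction: float = 0.1) -> float:
    """Linear warmup to peak_lr, then linear decay to zero."""
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        # warmup phase: 0 -> peak_lr
        return peak_lr * step / warmup_steps
    # decay phase: peak_lr -> 0 over the remaining steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

# 10 epochs x 3726 mini-batches per epoch
TOTAL = 10 * 3726
print(linear_warmup_lr(0, TOTAL))      # start of training: 0.0
print(linear_warmup_lr(3726, TOTAL))   # end of epoch 1: peak 5e-06
print(linear_warmup_lr(TOTAL, TOTAL))  # end of training: 0.0
```

Because the log rounds lr to six decimals, the printed values only resolve this ramp coarsely (0.000000 → 0.000005 → 0.000000).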
2023-10-27 15:57:51,345 epoch 1 - iter 372/3726 - loss 3.66933019 - time (sec): 46.58 - samples/sec: 441.14 - lr: 0.000000 - momentum: 0.000000
2023-10-27 15:58:37,240 epoch 1 - iter 744/3726 - loss 2.44791196 - time (sec): 92.47 - samples/sec: 440.81 - lr: 0.000001 - momentum: 0.000000
2023-10-27 15:59:23,004 epoch 1 - iter 1116/3726 - loss 1.82180853 - time (sec): 138.24 - samples/sec: 444.18 - lr: 0.000001 - momentum: 0.000000
2023-10-27 16:00:08,910 epoch 1 - iter 1488/3726 - loss 1.46511605 - time (sec): 184.14 - samples/sec: 445.62 - lr: 0.000002 - momentum: 0.000000
2023-10-27 16:00:55,551 epoch 1 - iter 1860/3726 - loss 1.23020473 - time (sec): 230.78 - samples/sec: 444.20 - lr: 0.000002 - momentum: 0.000000
2023-10-27 16:01:41,835 epoch 1 - iter 2232/3726 - loss 1.05969433 - time (sec): 277.07 - samples/sec: 443.08 - lr: 0.000003 - momentum: 0.000000
2023-10-27 16:02:28,579 epoch 1 - iter 2604/3726 - loss 0.92870944 - time (sec): 323.81 - samples/sec: 443.41 - lr: 0.000003 - momentum: 0.000000
2023-10-27 16:03:15,307 epoch 1 - iter 2976/3726 - loss 0.83025530 - time (sec): 370.54 - samples/sec: 441.38 - lr: 0.000004 - momentum: 0.000000
2023-10-27 16:04:02,180 epoch 1 - iter 3348/3726 - loss 0.75373492 - time (sec): 417.41 - samples/sec: 439.59 - lr: 0.000004 - momentum: 0.000000
2023-10-27 16:04:49,217 epoch 1 - iter 3720/3726 - loss 0.68664292 - time (sec): 464.45 - samples/sec: 439.63 - lr: 0.000005 - momentum: 0.000000
2023-10-27 16:04:49,995 ----------------------------------------------------------------------------------------------------
2023-10-27 16:04:49,996 EPOCH 1 done: loss 0.6854 - lr: 0.000005
2023-10-27 16:05:15,688 DEV : loss 0.06499314308166504 - f1-score (micro avg) 0.941
2023-10-27 16:05:15,743 saving best model
2023-10-27 16:05:17,851 ----------------------------------------------------------------------------------------------------
2023-10-27 16:06:05,511 epoch 2 - iter 372/3726 - loss 0.08608847 - time (sec): 47.66 - samples/sec: 436.63 - lr: 0.000005 - momentum: 0.000000
2023-10-27 16:06:53,421 epoch 2 - iter 744/3726 - loss 0.08159160 - time (sec): 95.57 - samples/sec: 433.86 - lr: 0.000005 - momentum: 0.000000
2023-10-27 16:07:40,883 epoch 2 - iter 1116/3726 - loss 0.08672812 - time (sec): 143.03 - samples/sec: 434.04 - lr: 0.000005 - momentum: 0.000000
2023-10-27 16:08:28,410 epoch 2 - iter 1488/3726 - loss 0.08683755 - time (sec): 190.56 - samples/sec: 432.29 - lr: 0.000005 - momentum: 0.000000
2023-10-27 16:09:15,037 epoch 2 - iter 1860/3726 - loss 0.08779187 - time (sec): 237.18 - samples/sec: 435.35 - lr: 0.000005 - momentum: 0.000000
2023-10-27 16:10:02,026 epoch 2 - iter 2232/3726 - loss 0.08712052 - time (sec): 284.17 - samples/sec: 434.32 - lr: 0.000005 - momentum: 0.000000
2023-10-27 16:10:48,962 epoch 2 - iter 2604/3726 - loss 0.08526279 - time (sec): 331.11 - samples/sec: 434.61 - lr: 0.000005 - momentum: 0.000000
2023-10-27 16:11:35,182 epoch 2 - iter 2976/3726 - loss 0.08450012 - time (sec): 377.33 - samples/sec: 434.72 - lr: 0.000005 - momentum: 0.000000
2023-10-27 16:12:21,618 epoch 2 - iter 3348/3726 - loss 0.08460079 - time (sec): 423.77 - samples/sec: 433.17 - lr: 0.000005 - momentum: 0.000000
2023-10-27 16:13:08,337 epoch 2 - iter 3720/3726 - loss 0.08261905 - time (sec): 470.48 - samples/sec: 434.27 - lr: 0.000004 - momentum: 0.000000
2023-10-27 16:13:09,112 ----------------------------------------------------------------------------------------------------
2023-10-27 16:13:09,112 EPOCH 2 done: loss 0.0825 - lr: 0.000004
2023-10-27 16:13:33,111 DEV : loss 0.08286476135253906 - f1-score (micro avg) 0.9546
2023-10-27 16:13:33,170 saving best model
2023-10-27 16:13:35,742 ----------------------------------------------------------------------------------------------------
2023-10-27 16:14:22,419 epoch 3 - iter 372/3726 - loss 0.05591265 - time (sec): 46.67 - samples/sec: 435.31 - lr: 0.000004 - momentum: 0.000000
2023-10-27 16:15:09,686 epoch 3 - iter 744/3726 - loss 0.05984730 - time (sec): 93.94 - samples/sec: 434.32 - lr: 0.000004 - momentum: 0.000000
2023-10-27 16:15:57,178 epoch 3 - iter 1116/3726 - loss 0.06005216 - time (sec): 141.43 - samples/sec: 435.00 - lr: 0.000004 - momentum: 0.000000
2023-10-27 16:16:45,692 epoch 3 - iter 1488/3726 - loss 0.05601000 - time (sec): 189.95 - samples/sec: 430.14 - lr: 0.000004 - momentum: 0.000000
2023-10-27 16:17:32,939 epoch 3 - iter 1860/3726 - loss 0.05476618 - time (sec): 237.20 - samples/sec: 426.95 - lr: 0.000004 - momentum: 0.000000
2023-10-27 16:18:20,145 epoch 3 - iter 2232/3726 - loss 0.05358297 - time (sec): 284.40 - samples/sec: 428.53 - lr: 0.000004 - momentum: 0.000000
2023-10-27 16:19:07,624 epoch 3 - iter 2604/3726 - loss 0.05384047 - time (sec): 331.88 - samples/sec: 429.32 - lr: 0.000004 - momentum: 0.000000
2023-10-27 16:19:54,617 epoch 3 - iter 2976/3726 - loss 0.05438530 - time (sec): 378.87 - samples/sec: 429.16 - lr: 0.000004 - momentum: 0.000000
2023-10-27 16:20:41,784 epoch 3 - iter 3348/3726 - loss 0.05364700 - time (sec): 426.04 - samples/sec: 430.25 - lr: 0.000004 - momentum: 0.000000
2023-10-27 16:21:28,928 epoch 3 - iter 3720/3726 - loss 0.05265148 - time (sec): 473.18 - samples/sec: 431.75 - lr: 0.000004 - momentum: 0.000000
2023-10-27 16:21:29,696 ----------------------------------------------------------------------------------------------------
2023-10-27 16:21:29,696 EPOCH 3 done: loss 0.0527 - lr: 0.000004
2023-10-27 16:21:53,630 DEV : loss 0.05983666330575943 - f1-score (micro avg) 0.963
2023-10-27 16:21:53,682 saving best model
2023-10-27 16:21:55,901 ----------------------------------------------------------------------------------------------------
2023-10-27 16:22:43,296 epoch 4 - iter 372/3726 - loss 0.03718873 - time (sec): 47.39 - samples/sec: 429.14 - lr: 0.000004 - momentum: 0.000000
2023-10-27 16:23:30,210 epoch 4 - iter 744/3726 - loss 0.04099485 - time (sec): 94.31 - samples/sec: 435.38 - lr: 0.000004 - momentum: 0.000000
2023-10-27 16:24:17,027 epoch 4 - iter 1116/3726 - loss 0.03721825 - time (sec): 141.12 - samples/sec: 434.73 - lr: 0.000004 - momentum: 0.000000
2023-10-27 16:25:04,504 epoch 4 - iter 1488/3726 - loss 0.03714011 - time (sec): 188.60 - samples/sec: 433.49 - lr: 0.000004 - momentum: 0.000000
2023-10-27 16:25:52,892 epoch 4 - iter 1860/3726 - loss 0.03758136 - time (sec): 236.99 - samples/sec: 428.95 - lr: 0.000004 - momentum: 0.000000
2023-10-27 16:26:40,944 epoch 4 - iter 2232/3726 - loss 0.03790295 - time (sec): 285.04 - samples/sec: 428.86 - lr: 0.000004 - momentum: 0.000000
2023-10-27 16:27:29,194 epoch 4 - iter 2604/3726 - loss 0.03805339 - time (sec): 333.29 - samples/sec: 428.62 - lr: 0.000004 - momentum: 0.000000
2023-10-27 16:28:16,189 epoch 4 - iter 2976/3726 - loss 0.03708819 - time (sec): 380.29 - samples/sec: 429.11 - lr: 0.000003 - momentum: 0.000000
2023-10-27 16:29:03,316 epoch 4 - iter 3348/3726 - loss 0.03680602 - time (sec): 427.41 - samples/sec: 429.64 - lr: 0.000003 - momentum: 0.000000
2023-10-27 16:29:50,404 epoch 4 - iter 3720/3726 - loss 0.03682622 - time (sec): 474.50 - samples/sec: 430.34 - lr: 0.000003 - momentum: 0.000000
2023-10-27 16:29:51,089 ----------------------------------------------------------------------------------------------------
2023-10-27 16:29:51,089 EPOCH 4 done: loss 0.0369 - lr: 0.000003
2023-10-27 16:30:14,916 DEV : loss 0.04883182421326637 - f1-score (micro avg) 0.9659
2023-10-27 16:30:14,971 saving best model
2023-10-27 16:30:17,459 ----------------------------------------------------------------------------------------------------
2023-10-27 16:31:04,080 epoch 5 - iter 372/3726 - loss 0.03340894 - time (sec): 46.62 - samples/sec: 441.00 - lr: 0.000003 - momentum: 0.000000
2023-10-27 16:31:50,991 epoch 5 - iter 744/3726 - loss 0.03438447 - time (sec): 93.53 - samples/sec: 439.30 - lr: 0.000003 - momentum: 0.000000
2023-10-27 16:32:38,716 epoch 5 - iter 1116/3726 - loss 0.03321367 - time (sec): 141.25 - samples/sec: 435.67 - lr: 0.000003 - momentum: 0.000000
2023-10-27 16:33:25,523 epoch 5 - iter 1488/3726 - loss 0.02824924 - time (sec): 188.06 - samples/sec: 435.61 - lr: 0.000003 - momentum: 0.000000
2023-10-27 16:34:12,201 epoch 5 - iter 1860/3726 - loss 0.02851437 - time (sec): 234.74 - samples/sec: 433.50 - lr: 0.000003 - momentum: 0.000000
2023-10-27 16:34:59,180 epoch 5 - iter 2232/3726 - loss 0.02789578 - time (sec): 281.72 - samples/sec: 436.78 - lr: 0.000003 - momentum: 0.000000
2023-10-27 16:35:46,777 epoch 5 - iter 2604/3726 - loss 0.02681236 - time (sec): 329.32 - samples/sec: 434.70 - lr: 0.000003 - momentum: 0.000000
2023-10-27 16:36:33,751 epoch 5 - iter 2976/3726 - loss 0.02765246 - time (sec): 376.29 - samples/sec: 432.28 - lr: 0.000003 - momentum: 0.000000
2023-10-27 16:37:20,836 epoch 5 - iter 3348/3726 - loss 0.02767176 - time (sec): 423.38 - samples/sec: 432.82 - lr: 0.000003 - momentum: 0.000000
2023-10-27 16:38:08,311 epoch 5 - iter 3720/3726 - loss 0.02792716 - time (sec): 470.85 - samples/sec: 433.69 - lr: 0.000003 - momentum: 0.000000
2023-10-27 16:38:09,077 ----------------------------------------------------------------------------------------------------
2023-10-27 16:38:09,077 EPOCH 5 done: loss 0.0279 - lr: 0.000003
2023-10-27 16:38:33,913 DEV : loss 0.05045438930392265 - f1-score (micro avg) 0.9709
2023-10-27 16:38:33,966 saving best model
2023-10-27 16:38:36,347 ----------------------------------------------------------------------------------------------------
2023-10-27 16:39:23,511 epoch 6 - iter 372/3726 - loss 0.02592894 - time (sec): 47.15 - samples/sec: 418.65 - lr: 0.000003 - momentum: 0.000000
2023-10-27 16:40:10,156 epoch 6 - iter 744/3726 - loss 0.02441091 - time (sec): 93.80 - samples/sec: 435.34 - lr: 0.000003 - momentum: 0.000000
2023-10-27 16:40:56,462 epoch 6 - iter 1116/3726 - loss 0.02083566 - time (sec): 140.10 - samples/sec: 437.89 - lr: 0.000003 - momentum: 0.000000
2023-10-27 16:41:42,045 epoch 6 - iter 1488/3726 - loss 0.01995447 - time (sec): 185.69 - samples/sec: 441.22 - lr: 0.000003 - momentum: 0.000000
2023-10-27 16:42:28,231 epoch 6 - iter 1860/3726 - loss 0.01971121 - time (sec): 231.87 - samples/sec: 442.59 - lr: 0.000003 - momentum: 0.000000
2023-10-27 16:43:13,863 epoch 6 - iter 2232/3726 - loss 0.02038473 - time (sec): 277.50 - samples/sec: 442.07 - lr: 0.000002 - momentum: 0.000000
2023-10-27 16:43:59,052 epoch 6 - iter 2604/3726 - loss 0.02010731 - time (sec): 322.69 - samples/sec: 442.05 - lr: 0.000002 - momentum: 0.000000
2023-10-27 16:44:44,618 epoch 6 - iter 2976/3726 - loss 0.02110678 - time (sec): 368.26 - samples/sec: 443.32 - lr: 0.000002 - momentum: 0.000000
2023-10-27 16:45:30,589 epoch 6 - iter 3348/3726 - loss 0.02064377 - time (sec): 414.23 - samples/sec: 443.27 - lr: 0.000002 - momentum: 0.000000
2023-10-27 16:46:15,877 epoch 6 - iter 3720/3726 - loss 0.02070977 - time (sec): 459.52 - samples/sec: 444.64 - lr: 0.000002 - momentum: 0.000000
2023-10-27 16:46:16,609 ----------------------------------------------------------------------------------------------------
2023-10-27 16:46:16,609 EPOCH 6 done: loss 0.0207 - lr: 0.000002
2023-10-27 16:46:39,599 DEV : loss 0.05228659138083458 - f1-score (micro avg) 0.9688
2023-10-27 16:46:39,652 ----------------------------------------------------------------------------------------------------
2023-10-27 16:47:25,815 epoch 7 - iter 372/3726 - loss 0.01393066 - time (sec): 46.16 - samples/sec: 453.87 - lr: 0.000002 - momentum: 0.000000
2023-10-27 16:48:11,032 epoch 7 - iter 744/3726 - loss 0.01975985 - time (sec): 91.38 - samples/sec: 465.32 - lr: 0.000002 - momentum: 0.000000
2023-10-27 16:48:57,003 epoch 7 - iter 1116/3726 - loss 0.01736626 - time (sec): 137.35 - samples/sec: 453.61 - lr: 0.000002 - momentum: 0.000000
2023-10-27 16:49:42,670 epoch 7 - iter 1488/3726 - loss 0.01602877 - time (sec): 183.02 - samples/sec: 449.60 - lr: 0.000002 - momentum: 0.000000
2023-10-27 16:50:28,056 epoch 7 - iter 1860/3726 - loss 0.01614250 - time (sec): 228.40 - samples/sec: 448.54 - lr: 0.000002 - momentum: 0.000000
2023-10-27 16:51:13,857 epoch 7 - iter 2232/3726 - loss 0.01731041 - time (sec): 274.20 - samples/sec: 447.20 - lr: 0.000002 - momentum: 0.000000
2023-10-27 16:51:59,472 epoch 7 - iter 2604/3726 - loss 0.01639037 - time (sec): 319.82 - samples/sec: 447.95 - lr: 0.000002 - momentum: 0.000000
2023-10-27 16:52:45,630 epoch 7 - iter 2976/3726 - loss 0.01622162 - time (sec): 365.98 - samples/sec: 446.28 - lr: 0.000002 - momentum: 0.000000
2023-10-27 16:53:30,732 epoch 7 - iter 3348/3726 - loss 0.01590288 - time (sec): 411.08 - samples/sec: 447.75 - lr: 0.000002 - momentum: 0.000000
2023-10-27 16:54:16,747 epoch 7 - iter 3720/3726 - loss 0.01577280 - time (sec): 457.09 - samples/sec: 446.76 - lr: 0.000002 - momentum: 0.000000
2023-10-27 16:54:17,443 ----------------------------------------------------------------------------------------------------
2023-10-27 16:54:17,443 EPOCH 7 done: loss 0.0157 - lr: 0.000002
2023-10-27 16:54:39,633 DEV : loss 0.05249254032969475 - f1-score (micro avg) 0.9716
2023-10-27 16:54:39,686 saving best model
2023-10-27 16:54:42,796 ----------------------------------------------------------------------------------------------------
2023-10-27 16:55:28,427 epoch 8 - iter 372/3726 - loss 0.01008978 - time (sec): 45.63 - samples/sec: 447.29 - lr: 0.000002 - momentum: 0.000000
2023-10-27 16:56:13,841 epoch 8 - iter 744/3726 - loss 0.00993689 - time (sec): 91.04 - samples/sec: 445.29 - lr: 0.000002 - momentum: 0.000000
2023-10-27 16:56:59,449 epoch 8 - iter 1116/3726 - loss 0.00840825 - time (sec): 136.65 - samples/sec: 443.14 - lr: 0.000002 - momentum: 0.000000
2023-10-27 16:57:45,482 epoch 8 - iter 1488/3726 - loss 0.00783549 - time (sec): 182.68 - samples/sec: 441.32 - lr: 0.000001 - momentum: 0.000000
2023-10-27 16:58:31,635 epoch 8 - iter 1860/3726 - loss 0.00875476 - time (sec): 228.84 - samples/sec: 441.43 - lr: 0.000001 - momentum: 0.000000
2023-10-27 16:59:17,304 epoch 8 - iter 2232/3726 - loss 0.00997788 - time (sec): 274.51 - samples/sec: 447.12 - lr: 0.000001 - momentum: 0.000000
2023-10-27 17:00:03,903 epoch 8 - iter 2604/3726 - loss 0.01002162 - time (sec): 321.10 - samples/sec: 445.17 - lr: 0.000001 - momentum: 0.000000
2023-10-27 17:00:49,795 epoch 8 - iter 2976/3726 - loss 0.00982956 - time (sec): 367.00 - samples/sec: 443.07 - lr: 0.000001 - momentum: 0.000000
2023-10-27 17:01:35,384 epoch 8 - iter 3348/3726 - loss 0.01006193 - time (sec): 412.59 - samples/sec: 445.05 - lr: 0.000001 - momentum: 0.000000
2023-10-27 17:02:21,065 epoch 8 - iter 3720/3726 - loss 0.01018978 - time (sec): 458.27 - samples/sec: 445.76 - lr: 0.000001 - momentum: 0.000000
2023-10-27 17:02:21,762 ----------------------------------------------------------------------------------------------------
2023-10-27 17:02:21,762 EPOCH 8 done: loss 0.0102 - lr: 0.000001
2023-10-27 17:02:44,780 DEV : loss 0.05600257217884064 - f1-score (micro avg) 0.9717
2023-10-27 17:02:44,832 saving best model
2023-10-27 17:02:47,541 ----------------------------------------------------------------------------------------------------
2023-10-27 17:03:33,194 epoch 9 - iter 372/3726 - loss 0.00852829 - time (sec): 45.65 - samples/sec: 446.98 - lr: 0.000001 - momentum: 0.000000
2023-10-27 17:04:18,797 epoch 9 - iter 744/3726 - loss 0.01209549 - time (sec): 91.25 - samples/sec: 442.36 - lr: 0.000001 - momentum: 0.000000
2023-10-27 17:05:04,412 epoch 9 - iter 1116/3726 - loss 0.01171120 - time (sec): 136.87 - samples/sec: 446.88 - lr: 0.000001 - momentum: 0.000000
2023-10-27 17:05:49,939 epoch 9 - iter 1488/3726 - loss 0.01104234 - time (sec): 182.39 - samples/sec: 448.01 - lr: 0.000001 - momentum: 0.000000
2023-10-27 17:06:35,656 epoch 9 - iter 1860/3726 - loss 0.01095518 - time (sec): 228.11 - samples/sec: 444.74 - lr: 0.000001 - momentum: 0.000000
2023-10-27 17:07:21,859 epoch 9 - iter 2232/3726 - loss 0.01041938 - time (sec): 274.31 - samples/sec: 445.26 - lr: 0.000001 - momentum: 0.000000
2023-10-27 17:08:07,175 epoch 9 - iter 2604/3726 - loss 0.01077364 - time (sec): 319.63 - samples/sec: 446.97 - lr: 0.000001 - momentum: 0.000000
2023-10-27 17:08:52,206 epoch 9 - iter 2976/3726 - loss 0.01011920 - time (sec): 364.66 - samples/sec: 448.47 - lr: 0.000001 - momentum: 0.000000
2023-10-27 17:09:37,411 epoch 9 - iter 3348/3726 - loss 0.00960798 - time (sec): 409.87 - samples/sec: 448.71 - lr: 0.000001 - momentum: 0.000000
2023-10-27 17:10:23,015 epoch 9 - iter 3720/3726 - loss 0.00963949 - time (sec): 455.47 - samples/sec: 448.69 - lr: 0.000001 - momentum: 0.000000
2023-10-27 17:10:23,789 ----------------------------------------------------------------------------------------------------
2023-10-27 17:10:23,789 EPOCH 9 done: loss 0.0096 - lr: 0.000001
2023-10-27 17:10:47,419 DEV : loss 0.053138185292482376 - f1-score (micro avg) 0.9726
2023-10-27 17:10:47,471 saving best model
2023-10-27 17:10:50,135 ----------------------------------------------------------------------------------------------------
2023-10-27 17:11:35,418 epoch 10 - iter 372/3726 - loss 0.00478465 - time (sec): 45.28 - samples/sec: 451.34 - lr: 0.000001 - momentum: 0.000000
2023-10-27 17:12:21,078 epoch 10 - iter 744/3726 - loss 0.00483843 - time (sec): 90.94 - samples/sec: 449.97 - lr: 0.000000 - momentum: 0.000000
2023-10-27 17:13:06,334 epoch 10 - iter 1116/3726 - loss 0.00472956 - time (sec): 136.20 - samples/sec: 449.54 - lr: 0.000000 - momentum: 0.000000
2023-10-27 17:13:51,612 epoch 10 - iter 1488/3726 - loss 0.00451912 - time (sec): 181.47 - samples/sec: 451.84 - lr: 0.000000 - momentum: 0.000000
2023-10-27 17:14:37,168 epoch 10 - iter 1860/3726 - loss 0.00470044 - time (sec): 227.03 - samples/sec: 451.55 - lr: 0.000000 - momentum: 0.000000
2023-10-27 17:15:22,745 epoch 10 - iter 2232/3726 - loss 0.00497575 - time (sec): 272.61 - samples/sec: 452.99 - lr: 0.000000 - momentum: 0.000000
2023-10-27 17:16:08,737 epoch 10 - iter 2604/3726 - loss 0.00499748 - time (sec): 318.60 - samples/sec: 450.83 - lr: 0.000000 - momentum: 0.000000
2023-10-27 17:16:54,804 epoch 10 - iter 2976/3726 - loss 0.00512330 - time (sec): 364.67 - samples/sec: 450.17 - lr: 0.000000 - momentum: 0.000000
2023-10-27 17:17:40,016 epoch 10 - iter 3348/3726 - loss 0.00514967 - time (sec): 409.88 - samples/sec: 449.74 - lr: 0.000000 - momentum: 0.000000
2023-10-27 17:18:25,574 epoch 10 - iter 3720/3726 - loss 0.00505541 - time (sec): 455.44 - samples/sec: 448.55 - lr: 0.000000 - momentum: 0.000000
2023-10-27 17:18:26,331 ----------------------------------------------------------------------------------------------------
2023-10-27 17:18:26,331 EPOCH 10 done: loss 0.0051 - lr: 0.000000
2023-10-27 17:18:49,314 DEV : loss 0.05512790009379387 - f1-score (micro avg) 0.9722
2023-10-27 17:18:51,313 ----------------------------------------------------------------------------------------------------
2023-10-27 17:18:51,315 Loading model from best epoch ...
2023-10-27 17:18:58,497 SequenceTagger predicts: Dictionary with 17 tags: O, S-ORG, B-ORG, E-ORG, I-ORG, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-MISC, B-MISC, E-MISC, I-MISC
2023-10-27 17:19:21,159 Results:
- F-score (micro) 0.969
- F-score (macro) 0.9632
- Accuracy 0.9558

By class:
              precision    recall  f1-score   support

         ORG     0.9676    0.9691    0.9683      1909
         PER     0.9956    0.9943    0.9950      1591
         LOC     0.9756    0.9625    0.9690      1413
        MISC     0.9019    0.9397    0.9204       812

   micro avg     0.9676    0.9703    0.9690      5725
   macro avg     0.9602    0.9664    0.9632      5725
weighted avg     0.9680    0.9703    0.9691      5725

2023-10-27 17:19:21,160 ----------------------------------------------------------------------------------------------------
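The 17 output classes of the tagger's final linear layer correspond to the BIOES tag dictionary printed when the best model is loaded: the O tag plus four positional prefixes (S = single-token entity, B/I/E = begin/inside/end of a multi-token span) for each of the four CoNLL entity types. A small sketch reproducing that dictionary (the variable names are our own):

```python
# Reconstructing the 17-tag BIOES dictionary shown in the log.
entity_types = ["ORG", "PER", "LOC", "MISC"]
tags = ["O"] + [f"{prefix}-{etype}"
                for etype in entity_types
                for prefix in "SBEI"]  # same prefix order as the log output

print(len(tags))   # 17
print(tags[:5])    # ['O', 'S-ORG', 'B-ORG', 'E-ORG', 'I-ORG']
```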
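In the final results table, "micro avg" is computed from true/false positives pooled over all classes, "macro avg" is the unweighted mean of the per-class scores, and "weighted avg" weights each class by its support. A quick sanity check of the macro and weighted F1 from the per-class values copied out of the table (standard definitions; not Flair's own evaluation code):

```python
# Per-class F1 and support copied from the final results table.
f1 = {"ORG": 0.9683, "PER": 0.9950, "LOC": 0.9690, "MISC": 0.9204}
support = {"ORG": 1909, "PER": 1591, "LOC": 1413, "MISC": 812}

# Unweighted mean over classes.
macro = sum(f1.values()) / len(f1)
# Support-weighted mean over classes.
weighted = sum(f1[c] * support[c] for c in f1) / sum(support.values())

print(macro)     # ~0.9632, matching the reported macro avg
print(weighted)  # ~0.9691, matching the reported weighted avg
```

The small MISC class (support 812, F1 0.9204) is what pulls the macro average below the micro average here.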