stulcrad/fine_tuned_BERT_cs_wikann

@@ -17,11 +17,11 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [bert-base-multilingual-cased](https://huggingface.co/bert-base-multilingual-cased) on the wikiann dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.1618
-- Overall Accuracy: 0.9672
-- Overall F1: 0.9184
-- Overall Precision: 0.9155
-- Overall Recall: 0.9213
 ## Model description
@@ -40,9 +40,9 @@ More information needed
 ### Training hyperparameters
 The following hyperparameters were used during training:
-- learning_rate: 5e-05
-- train_batch_size: 8
-- eval_batch_size: 8
 - seed: 42
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
@@ -50,23 +50,15 @@ The following hyperparameters were used during training:
 ### Training results
-| Training Loss | Epoch | Step | Validation Loss | Overall Accuracy | Overall F1 | Overall Precision | Overall Recall |
-|:-------------:|:-----:|:----:|:---------------:|:----------------:|:----------:|:-----------------:|:--------------:|
-| 0.3538        | 0.2   | 500  | 0.2330          | 0.9392           | 0.8365     | 0.8271            | 0.8461         |
-| 0.2331        | 0.4   | 1000 | 0.2291          | 0.9429           | 0.8536     | 0.8442            | 0.8633         |
-| 0.2093        | 0.6   | 1500 | 0.1936          | 0.9515           | 0.8720     | 0.8777            | 0.8663         |
-| 0.1976        | 0.8   | 2000 | 0.1728          | 0.9512           | 0.8673     | 0.8634            | 0.8714         |
-| 0.1911        | 1.0   | 2500 | 0.1811          | 0.9586           | 0.8911     | 0.8797            | 0.9027         |
-| 0.1245        | 1.2   | 3000 | 0.1771          | 0.9604           | 0.8977     | 0.8933            | 0.9022         |
-| 0.1219        | 1.4   | 3500 | 0.1731          | 0.9595           | 0.8965     | 0.8893            | 0.9039         |
-| 0.1102        | 1.6   | 4000 | 0.1721          | 0.9625           | 0.9060     | 0.9041            | 0.9078         |
-| 0.1203        | 1.8   | 4500 | 0.1538          | 0.9625           | 0.9038     | 0.9095            | 0.8981         |
-| 0.1105        | 2.0   | 5000 | 0.1562          | 0.9656           | 0.9120     | 0.9065            | 0.9177         |
-| 0.0601        | 2.2   | 5500 | 0.1700          | 0.9648           | 0.9113     | 0.9006            | 0.9222         |
-| 0.0579        | 2.4   | 6000 | 0.1569          | 0.9659           | 0.9140     | 0.9105            | 0.9176         |
-| 0.0571        | 2.6   | 6500 | 0.1595          | 0.9673           | 0.9168     | 0.9154            | 0.9183         |
-| 0.0504        | 2.8   | 7000 | 0.1664          | 0.9670           | 0.9174     | 0.9120            | 0.9228         |
-| 0.0588        | 3.0   | 7500 | 0.1618          | 0.9672           | 0.9184     | 0.9155            | 0.9213         |
 ### Framework versions

 This model is a fine-tuned version of [bert-base-multilingual-cased](https://huggingface.co/bert-base-multilingual-cased) on the wikiann dataset.
 It achieves the following results on the evaluation set:
+- Loss: 0.1428
+- Overall Precision: 0.9090
+- Overall Recall: 0.9274
+- Overall F1: 0.9181
+- Overall Accuracy: 0.9673
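
The reported F1 values can be sanity-checked against the precision and recall listed beside them. The sketch below assumes the overall F1 is the harmonic mean of overall precision and overall recall (as is the case for micro-averaged entity-level metrics); the helper name `f1_score` is illustrative and not part of the model card.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; returns 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Final evaluation metrics from the updated card.
new_f1 = f1_score(0.9090, 0.9274)
# Final evaluation metrics from the previous card.
old_f1 = f1_score(0.9155, 0.9213)

print(round(new_f1, 4))  # 0.9181
print(round(old_f1, 4))  # 0.9184
```

Both values agree with the F1 figures in the respective versions of the card to four decimal places.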
 ## Model description
 ### Training hyperparameters
 The following hyperparameters were used during training:
+- learning_rate: 2e-05
+- train_batch_size: 16
+- eval_batch_size: 16
 - seed: 42
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
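
To illustrate what a linear schedule does with the new `learning_rate` of 2e-05, here is a minimal sketch. It assumes zero warmup steps and a hypothetical total of 3750 optimizer steps; neither value appears in this hunk, so treat both as placeholders. The function `linear_lr` is an illustrative helper, not a library API.

```python
def linear_lr(step: int, total_steps: int,
              base_lr: float = 2e-05, warmup_steps: int = 0) -> float:
    """Linear warmup to base_lr, then linear decay to zero at total_steps."""
    if step < warmup_steps:
        # Ramp up proportionally during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly over the steps that remain after warmup.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


TOTAL_STEPS = 3750  # assumption: not stated in the diff

for step in (0, 500, 1875, 3500, TOTAL_STEPS):
    print(step, linear_lr(step, TOTAL_STEPS))
```

Under these assumptions the rate starts at 2e-05, is halved at the midpoint, and reaches zero at the final step.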
 ### Training results
+| Training Loss | Epoch | Step | Validation Loss | Overall Precision | Overall Recall | Overall F1 | Overall Accuracy |
+|:-------------:|:-----:|:----:|:---------------:|:-----------------:|:--------------:|:----------:|:----------------:|
+| 0.3011        | 0.4   | 500  | 0.1781          | 0.8588            | 0.8721         | 0.8654     | 0.9501           |
+| 0.1717        | 0.8   | 1000 | 0.1524          | 0.8733            | 0.9033         | 0.8880     | 0.9565           |
+| 0.1307        | 1.2   | 1500 | 0.1443          | 0.9058            | 0.9051         | 0.9054     | 0.9639           |
+| 0.0968        | 1.6   | 2000 | 0.1392          | 0.9075            | 0.9107         | 0.9091     | 0.9651           |
+| 0.0974        | 2.0   | 2500 | 0.1352          | 0.9030            | 0.9201         | 0.9115     | 0.9647           |
+| 0.0603        | 2.4   | 3000 | 0.1410          | 0.9091            | 0.9217         | 0.9154     | 0.9667           |
+| 0.054         | 2.8   | 3500 | 0.1428          | 0.9090            | 0.9274         | 0.9181     | 0.9673           |
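
The Epoch and Step columns are tied together by the batch size. The sketch below assumes a single device, no gradient accumulation, and a training set of 20000 examples. That size is inferred from the table (500 steps at batch size 16 correspond to 0.4 epochs) and is not stated in the card. The helper names are illustrative.

```python
import math


def steps_per_epoch(num_examples: int, batch_size: int) -> int:
    """Optimizer steps needed for one pass over the training set."""
    return math.ceil(num_examples / batch_size)


def epoch_at_step(step: int, num_examples: int, batch_size: int) -> float:
    """Fractional epoch reached after `step` optimizer steps."""
    return step / steps_per_epoch(num_examples, batch_size)


NUM_EXAMPLES = 20000  # assumption inferred from the table, not from the card

print(steps_per_epoch(NUM_EXAMPLES, 16))     # 1250
print(epoch_at_step(500, NUM_EXAMPLES, 16))  # 0.4
print(epoch_at_step(500, NUM_EXAMPLES, 8))   # 0.2
```

With the same assumed dataset size, the previous batch size of 8 gives 0.2 epochs per 500 steps, which matches the removed table, so doubling the batch size accounts for the halved step count per epoch.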
 ### Framework versions

model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:22ef0bc59ef9a3abcec993c0bbb6da0af77b2e1b764f2265f163c1398ba96be9
 size 709096284

 version https://git-lfs.github.com/spec/v1
+oid sha256:a96cd56f2f8342b2e54665696bb3db5926ca535340925dc1918da415309162b7
 size 709096284

runs/Dec13_00-09-53_n26/events.out.tfevents.1702422595.n26.74777.0 CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:5f018df7f33b92e60c688cd3b27b5369b87a5fe521d88fb2d40cfdf754712010
-size 9344

 version https://git-lfs.github.com/spec/v1
+oid sha256:6ad3fbbcff242c3dc13c9d395e65460e878d0d087a850783aef61bdbfc2e4292
+size 9698