thrunlab/t5-base_sst2_dense_epochs-8

+---
+license: apache-2.0
+base_model: t5-base
+tags:
+- generated_from_trainer
+datasets:
+- glue
+metrics:
+- accuracy
+model-index:
+- name: t5-base_sst2_dense_epochs-8
+  results:
+  - task:
+      name: Text Classification
+      type: text-classification
+    dataset:
+      name: glue
+      type: glue
+      config: sst2
+      split: validation
+      args: sst2
+    metrics:
+    - name: Accuracy
+      type: accuracy
+      value: 0.9346330275229358
+---
+<!-- This model card has been generated automatically according to the information the Trainer had access to. You
+should probably proofread and complete it, then remove this comment. -->
+# t5-base_sst2_dense_epochs-8
+This model is a fine-tuned version of [t5-base](https://huggingface.co/t5-base) on the SST-2 configuration of the GLUE dataset.
+It achieves the following results on the evaluation set:
+- Loss: 0.2931
+- Accuracy: 0.9346
+## Model description
+More information needed
+## Intended uses & limitations
+More information needed
+## Training and evaluation data
+More information needed
+## Training procedure
+### Training hyperparameters
+The following hyperparameters were used during training:
+- learning_rate: 5e-05
+- train_batch_size: 32
+- eval_batch_size: 64
+- seed: 0
+- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+- lr_scheduler_type: linear
+- lr_scheduler_warmup_steps: 20
+- num_epochs: 8
+### Training results
+| Training Loss | Epoch | Step | Validation Loss | Accuracy |
+|:-------------:|:-----:|:----:|:---------------:|:--------:|
+| 0.6552        | 0.02  | 50   | 0.6446          | 0.6193   |
+| 0.3237        | 0.05  | 100  | 0.2756          | 0.9071   |
+| 0.2725        | 0.07  | 150  | 0.2409          | 0.9151   |
+| 0.2353        | 0.1   | 200  | 0.2526          | 0.9128   |
+| 0.2342        | 0.12  | 250  | 0.2287          | 0.9174   |
+| 0.2635        | 0.14  | 300  | 0.2342          | 0.9220   |
+| 0.2534        | 0.17  | 350  | 0.2149          | 0.9255   |
+| 0.2402        | 0.19  | 400  | 0.2160          | 0.9255   |
+| 0.1857        | 0.21  | 450  | 0.2117          | 0.9243   |
+| 0.1696        | 0.24  | 500  | 0.3351          | 0.9266   |
+| 0.1504        | 0.26  | 550  | 0.2275          | 0.9209   |
+| 0.2849        | 0.29  | 600  | 0.2301          | 0.9255   |
+| 0.2336        | 0.31  | 650  | 0.2332          | 0.9220   |
+| 0.1587        | 0.33  | 700  | 0.2158          | 0.9243   |
+| 0.2645        | 0.36  | 750  | 0.2075          | 0.9300   |
+| 0.1809        | 0.38  | 800  | 0.2060          | 0.9255   |
+| 0.1088        | 0.4   | 850  | 0.3409          | 0.9255   |
+| 0.1623        | 0.43  | 900  | 0.3342          | 0.9289   |
+| 0.1987        | 0.45  | 950  | 0.2280          | 0.9278   |
+| 0.2622        | 0.48  | 1000 | 0.3327          | 0.9243   |
+| 0.1121        | 0.5   | 1050 | 0.3205          | 0.9289   |
+| 0.1831        | 0.52  | 1100 | 0.4233          | 0.9243   |
+| 0.2456        | 0.55  | 1150 | 0.5359          | 0.9335   |
+| 0.0938        | 0.57  | 1200 | 0.1931          | 0.9358   |
+| 0.1321        | 0.59  | 1250 | 0.4359          | 0.9323   |
+| 0.1478        | 0.62  | 1300 | 0.3059          | 0.9346   |
+| 0.1819        | 0.64  | 1350 | 0.4172          | 0.9358   |
+| 0.1178        | 0.67  | 1400 | 0.2997          | 0.9358   |
+| 0.1426        | 0.69  | 1450 | 0.5336          | 0.9346   |
+| 0.1033        | 0.71  | 1500 | 0.4292          | 0.9300   |
+| 0.1357        | 0.74  | 1550 | 0.4310          | 0.9369   |
+| 0.1668        | 0.76  | 1600 | 0.5359          | 0.9358   |
+| 0.1438        | 0.78  | 1650 | 0.3025          | 0.9381   |
+| 0.2141        | 0.81  | 1700 | 0.4265          | 0.9323   |
+| 0.0899        | 0.83  | 1750 | 0.4217          | 0.9323   |
+| 0.1062        | 0.86  | 1800 | 0.4377          | 0.9289   |
+| 0.1557        | 0.88  | 1850 | 0.3003          | 0.9323   |
+| 0.1237        | 0.9   | 1900 | 0.3134          | 0.9358   |
+| 0.1172        | 0.93  | 1950 | 0.3199          | 0.9312   |
+| 0.1617        | 0.95  | 2000 | 0.2931          | 0.9346   |
+| 0.1293        | 0.97  | 2050 | 0.2978          | 0.9381   |
+| 0.1686        | 1.0   | 2100 | 0.2885          | 0.9369   |
+| 0.7247        | 1.02  | 2150 | 0.7872          | 0.9300   |
+| 0.0679        | 1.05  | 2200 | 0.3114          | 0.9404   |
+| 0.0522        | 1.07  | 2250 | 0.2998          | 0.9346   |
+| 0.078         | 1.09  | 2300 | 0.3418          | 0.9358   |
+| 0.0749        | 1.12  | 2350 | 0.3248          | 0.9381   |
+| 0.0483        | 1.14  | 2400 | 0.4340          | 0.9369   |
+| 0.1534        | 1.16  | 2450 | 0.4428          | 0.9358   |
+| 0.1007        | 1.19  | 2500 | 0.4344          | 0.9369   |
+| 0.0655        | 1.21  | 2550 | 0.3215          | 0.9369   |
+| 0.074         | 1.24  | 2600 | 0.3182          | 0.9404   |
+### Framework versions
+- Transformers 4.34.1
+- PyTorch 2.0.1+cu117
+- Datasets 2.9.0
+- Tokenizers 0.14.1
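
Reviewer note: the hyperparameter list and the Epoch/Step columns in the card can be cross-checked with a few lines of arithmetic. The sketch below is not the training script. It assumes the published SST-2 training split size of 67,349 examples, a single device, and no gradient accumulation, none of which the card states. The helper names `linear_lr` and `epoch_at` are hypothetical. The schedule is the usual linear warmup followed by linear decay to zero, which is what `lr_scheduler_type: linear` normally denotes, and the total step count assumes all 8 epochs run (the logged table stops at step 2600).

```python
import math

# Values copied from the hyperparameter list in the model card.
LEARNING_RATE = 5e-05
TRAIN_BATCH_SIZE = 32
WARMUP_STEPS = 20
NUM_EPOCHS = 8

# Assumption (not stated in the card): size of the GLUE SST-2 training split,
# trained on one device without gradient accumulation.
SST2_TRAIN_EXAMPLES = 67_349

STEPS_PER_EPOCH = math.ceil(SST2_TRAIN_EXAMPLES / TRAIN_BATCH_SIZE)
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS


def linear_lr(step: int) -> float:
    """Learning rate at an optimizer step: linear warmup, then linear decay to 0."""
    if step < WARMUP_STEPS:
        return LEARNING_RATE * step / max(1, WARMUP_STEPS)
    remaining = max(0.0, (TOTAL_STEPS - step) / max(1, TOTAL_STEPS - WARMUP_STEPS))
    return LEARNING_RATE * remaining


def epoch_at(step: int) -> float:
    """Fractional epoch for a step, rounded to two decimals like the table."""
    return round(step / STEPS_PER_EPOCH, 2)


if __name__ == "__main__":
    # Print the schedule at a few of the logged evaluation steps.
    for step in (50, 1000, 2100, 2600):
        print(f"step={step:5d} epoch={epoch_at(step):4.2f} lr={linear_lr(step):.3e}")
```

Under these assumptions `epoch_at` agrees with the Epoch column at the logged steps that were spot-checked (50, 2100, 2600), which suggests the effective batch size really was 32.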

pytorch_model.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:9a8945198a8a3e025504475be436258c25c4dcbd7f2b69ef96f754517782066b
+oid sha256:e04f1d277b87df0e736c67b7e23234f55885b3ceb0780d500b5e7514d90136bf
 size 894094241
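
Reviewer note: `pytorch_model.bin` is tracked as a Git LFS pointer, so this diff only shows that the SHA-256 object ID changed while the byte size stayed the same. A minimal sketch for comparing the two pointers, and for checking a downloaded file against one, might look like the following. The function names `parse_lfs_pointer` and `matches_pointer` are hypothetical, and the parser handles only the three keys that appear in this diff.

```python
import hashlib

OLD_POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:9a8945198a8a3e025504475be436258c25c4dcbd7f2b69ef96f754517782066b
size 894094241
"""

NEW_POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:e04f1d277b87df0e736c67b7e23234f55885b3ceb0780d500b5e7514d90136bf
size 894094241
"""


def parse_lfs_pointer(text: str) -> dict:
    """Split each 'key value' line of a pointer file into a small dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path: str, pointer: dict, chunk_size: int = 1 << 20) -> bool:
    """Hash a local file in chunks and compare size and digest to the pointer."""
    hasher = hashlib.new(pointer["algo"])
    size = 0
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
            size += len(chunk)
    return size == pointer["size"] and hasher.hexdigest() == pointer["digest"]


old, new = parse_lfs_pointer(OLD_POINTER), parse_lfs_pointer(NEW_POINTER)
weights_changed = old["digest"] != new["digest"]
same_size = old["size"] == new["size"]
```

An identical size with a different digest is what one would expect when the same architecture is saved with updated weights. The pointer alone cannot confirm that, so `matches_pointer` is meant to be run against the actual downloaded file.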