newbie-geek/tinyllama-v1-finetune

PEFT

Safetensors

Generated from Trainer

newbie-geek committed on Feb 1

Commit 38121cb • 1 Parent(s): e33f40e

End of training

Files changed (1)

README.md ADDED +141 -0

	@@ -0,0 +1,141 @@

+---
+license: apache-2.0
+library_name: peft
+tags:
+- generated_from_trainer
+base_model: TinyLlama/TinyLlama-1.1B-Chat-v1.0
+model-index:
+- name: tinyllama-v1-finetune
+  results: []
+---
+<!-- This model card has been generated automatically according to the information the Trainer had access to. You
+should probably proofread and complete it, then remove this comment. -->
+# tinyllama-v1-finetune
+This model is a fine-tuned version of [TinyLlama/TinyLlama-1.1B-Chat-v1.0](https://huggingface.co/TinyLlama/TinyLlama-1.1B-Chat-v1.0) on an unspecified dataset.
+It achieves the following results on the evaluation set:
+- Loss: 0.5205
+## Model description
+More information needed
+## Intended uses & limitations
+More information needed
+## Training and evaluation data
+More information needed
+## Training procedure
+### Training hyperparameters
+The following hyperparameters were used during training:
+- learning_rate: 2.5e-05
+- train_batch_size: 2
+- eval_batch_size: 8
+- seed: 42
+- gradient_accumulation_steps: 4
+- total_train_batch_size: 8
+- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+- lr_scheduler_type: linear
+- training_steps: 400
+- mixed_precision_training: Native AMP
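
The hyperparameters above are related by simple arithmetic, which the sketch below spells out. It assumes a single training device and a linear schedule with no warmup; the card states neither, so `num_devices` and `warmup_steps` are illustrative assumptions, and `linear_lr` is a hypothetical helper rather than a library function.

```python
# Values copied from the hyperparameter list above.
train_batch_size = 2               # per-device batch size
gradient_accumulation_steps = 4
training_steps = 400
learning_rate = 2.5e-05

# Assumption: one device; the card does not state the device count.
num_devices = 1

# Effective batch size per optimizer step.
total_train_batch_size = train_batch_size * gradient_accumulation_steps * num_devices

# Number of training sequences processed over the whole run.
sequences_seen = total_train_batch_size * training_steps


def linear_lr(step, base_lr=learning_rate, total_steps=training_steps, warmup_steps=0):
    """Learning rate at `step` under a linear decay to zero.

    Assumption: warmup_steps=0, since the card lists no warmup setting.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


print(total_train_batch_size)  # 8
print(sequences_seen)          # 3200
print(linear_lr(200))          # half of the base rate at the midpoint
```

Under these assumptions the computed effective batch size matches the `total_train_batch_size: 8` entry reported by the Trainer.
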
+### Training results
+| Training Loss | Epoch | Step | Validation Loss |
+|:-------------:|:-----:|:----:|:---------------:|
+| 2.7664        | 0.04  | 5    | 2.7807          |
+| 2.7326        | 0.08  | 10   | 2.7249          |
+| 2.6955        | 0.12  | 15   | 2.6739          |
+| 2.643         | 0.16  | 20   | 2.6085          |
+| 2.6056        | 0.2   | 25   | 2.5554          |
+| 2.5018        | 0.24  | 30   | 2.4923          |
+| 2.4587        | 0.28  | 35   | 2.4204          |
+| 2.3997        | 0.32  | 40   | 2.3484          |
+| 2.3142        | 0.36  | 45   | 2.2673          |
+| 2.2507        | 0.4   | 50   | 2.1903          |
+| 2.1405        | 0.43  | 55   | 2.1071          |
+| 2.0471        | 0.47  | 60   | 2.0212          |
+| 1.9957        | 0.51  | 65   | 1.9385          |
+| 1.9007        | 0.55  | 70   | 1.8448          |
+| 1.8046        | 0.59  | 75   | 1.7501          |
+| 1.691         | 0.63  | 80   | 1.6495          |
+| 1.5957        | 0.67  | 85   | 1.5484          |
+| 1.53          | 0.71  | 90   | 1.4608          |
+| 1.4161        | 0.75  | 95   | 1.3669          |
+| 1.3409        | 0.79  | 100  | 1.2772          |
+| 1.239         | 0.83  | 105  | 1.1833          |
+| 1.1591        | 0.87  | 110  | 1.1122          |
+| 1.0729        | 0.91  | 115  | 1.0276          |
+| 1.0237        | 0.95  | 120  | 0.9615          |
+| 0.9293        | 0.99  | 125  | 0.9021          |
+| 0.891         | 1.03  | 130  | 0.8531          |
+| 0.8365        | 1.07  | 135  | 0.8155          |
+| 0.7876        | 1.11  | 140  | 0.7898          |
+| 0.7821        | 1.15  | 145  | 0.7669          |
+| 0.7392        | 1.19  | 150  | 0.7516          |
+| 0.77          | 1.23  | 155  | 0.7397          |
+| 0.7088        | 1.26  | 160  | 0.7226          |
+| 0.7246        | 1.3   | 165  | 0.7101          |
+| 0.7007        | 1.34  | 170  | 0.6960          |
+| 0.6667        | 1.38  | 175  | 0.6797          |
+| 0.6898        | 1.42  | 180  | 0.6666          |
+| 0.6608        | 1.46  | 185  | 0.6599          |
+| 0.6526        | 1.5   | 190  | 0.6451          |
+| 0.6078        | 1.54  | 195  | 0.6350          |
+| 0.6336        | 1.58  | 200  | 0.6248          |
+| 0.6074        | 1.62  | 205  | 0.6167          |
+| 0.6114        | 1.66  | 210  | 0.6131          |
+| 0.575         | 1.7   | 215  | 0.6041          |
+| 0.5933        | 1.74  | 220  | 0.5981          |
+| 0.5983        | 1.78  | 225  | 0.5910          |
+| 0.5907        | 1.82  | 230  | 0.5845          |
+| 0.5853        | 1.86  | 235  | 0.5801          |
+| 0.5881        | 1.9   | 240  | 0.5749          |
+| 0.5613        | 1.94  | 245  | 0.5700          |
+| 0.5852        | 1.98  | 250  | 0.5696          |
+| 0.5781        | 2.02  | 255  | 0.5652          |
+| 0.5812        | 2.06  | 260  | 0.5609          |
+| 0.5677        | 2.09  | 265  | 0.5576          |
+| 0.5544        | 2.13  | 270  | 0.5541          |
+| 0.536         | 2.17  | 275  | 0.5504          |
+| 0.5283        | 2.21  | 280  | 0.5487          |
+| 0.5326        | 2.25  | 285  | 0.5454          |
+| 0.568         | 2.29  | 290  | 0.5402          |
+| 0.5448        | 2.33  | 295  | 0.5395          |
+| 0.5581        | 2.37  | 300  | 0.5377          |
+| 0.5406        | 2.41  | 305  | 0.5355          |
+| 0.4996        | 2.45  | 310  | 0.5333          |
+| 0.5243        | 2.49  | 315  | 0.5346          |
+| 0.5591        | 2.53  | 320  | 0.5312          |
+| 0.5122        | 2.57  | 325  | 0.5297          |
+| 0.5426        | 2.61  | 330  | 0.5290          |
+| 0.4955        | 2.65  | 335  | 0.5290          |
+| 0.5531        | 2.69  | 340  | 0.5273          |
+| 0.5147        | 2.73  | 345  | 0.5278          |
+| 0.5195        | 2.77  | 350  | 0.5246          |
+| 0.5268        | 2.81  | 355  | 0.5254          |
+| 0.5284        | 2.85  | 360  | 0.5236          |
+| 0.5272        | 2.89  | 365  | 0.5224          |
+| 0.5053        | 2.92  | 370  | 0.5240          |
+| 0.528         | 2.96  | 375  | 0.5200          |
+| 0.517         | 3.0   | 380  | 0.5204          |
+| 0.5409        | 3.04  | 385  | 0.5206          |
+| 0.5204        | 3.08  | 390  | 0.5192          |
+| 0.52          | 3.12  | 395  | 0.5200          |
+| 0.5142        | 3.16  | 400  | 0.5205          |
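
A few quantities can be read off the table above; the sketch below derives them. It assumes the reported loss is a mean per-token cross-entropy in nats, so that perplexity is `exp(loss)`, and it treats the rounded epoch column as exact when estimating steps per epoch. Both are assumptions, not statements from the card.

```python
import math

# Last six evaluation rows copied from the table above: (step, validation loss).
last_rows = [
    (375, 0.5200),
    (380, 0.5204),
    (385, 0.5206),
    (390, 0.5192),
    (395, 0.5200),
    (400, 0.5205),
]

# Lowest validation loss among these rows.
best_step, best_loss = min(last_rows, key=lambda row: row[1])

# Final reported evaluation loss (step 400).
final_loss = last_rows[-1][1]

# Assumption: loss is mean token-level cross-entropy in nats.
final_perplexity = math.exp(final_loss)

# Rough steps per epoch from the final row (step 400 at epoch 3.16).
# The epoch column is rounded to two decimals, so this is only approximate.
steps_per_epoch = 400 / 3.16

print(best_step, best_loss)        # 390 0.5192
print(round(final_perplexity, 2))  # 1.68
print(round(steps_per_epoch))      # 127
```

The validation loss is essentially flat over the last several evaluations, with step 390 marginally lower than the final step 400 value that the card reports.
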
+### Framework versions
+- PEFT 0.8.1
+- Transformers 4.37.2
+- PyTorch 2.1.0+cu121
+- Datasets 2.16.1
+- Tokenizers 0.15.1
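
Since the card names PEFT as the library and TinyLlama/TinyLlama-1.1B-Chat-v1.0 as the base model, loading the result would typically mean attaching the adapter to that base model. The following is a minimal sketch, assuming the adapter weights are published under the repository id shown in the page header and that `transformers` and `peft` are installed; `load_finetuned_model` is a hypothetical helper name, not something defined by the card.

```python
BASE_MODEL = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"   # from the card's base_model field
ADAPTER_REPO = "newbie-geek/tinyllama-v1-finetune"  # assumption: adapter lives in this repo


def load_finetuned_model(adapter_repo=ADAPTER_REPO, base_model=BASE_MODEL):
    """Load the base model and attach the PEFT adapter on top of it."""
    # Imported lazily so the module can be inspected without the libraries installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel

    tokenizer = AutoTokenizer.from_pretrained(base_model)
    base = AutoModelForCausalLM.from_pretrained(base_model)
    # Wrap the base model with the adapter weights from the Hub.
    model = PeftModel.from_pretrained(base, adapter_repo)
    model.eval()
    return tokenizer, model
```

Calling `tokenizer, model = load_finetuned_model()` downloads both the base weights and the adapter, so it needs network access and enough memory for a 1.1B-parameter model.
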