---
license: apache-2.0
datasets:
- teknium/GPT4-LLM-Cleaned
---

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

The model was fine-tuned on the [teknium/GPT4-LLM-Cleaned](https://huggingface.co/datasets/teknium/GPT4-LLM-Cleaned) dataset listed in the metadata above. The evaluation split used for the validation losses reported below is not documented.

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 5e-06
- train_batch_size: 4
- eval_batch_size: 4
- seed: 8
- distributed_type: multi-GPU
- num_devices: 4
- total_train_batch_size: 16
- total_eval_batch_size: 16
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 50
- num_epochs: 5

### Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 11.2812       | 0.0   | 1    | 11.5156         |
| 5.0938        | 0.2   | 62   | 5.1016          |
| 3.5703        | 0.4   | 124  | 3.7161          |
| 2.582         | 0.6   | 186  | 2.9010          |
| 2.2109        | 0.8   | 248  | 2.5156          |
| 1.9824        | 1.0   | 310  | 2.3477          |
| 1.8594        | 1.18  | 372  | 2.1960          |
| 1.748         | 1.38  | 434  | 2.1667          |
| 1.748         | 1.58  | 496  | 2.0195          |
| 1.7617        | 1.78  | 558  | 2.0749          |
| 1.6582        | 1.98  | 620  | 1.9095          |
| 1.5762        | 2.16  | 682  | 1.9036          |
| 1.5586        | 2.36  | 744  | 1.8457          |
| 1.6016        | 2.56  | 806  | 1.8112          |
| 1.5195        | 2.76  | 868  | 1.8034          |
| 1.5645        | 2.96  | 930  | 1.7773          |
| 1.457         | 3.14  | 992  | 1.7474          |
| 1.4883        | 3.34  | 1054 | 1.7467          |
| 1.4648        | 3.54  | 1116 | 1.7676          |
| 1.5195        | 3.74  | 1178 | 1.7383          |
| 1.4531        | 3.94  | 1240 | 1.7383          |
| 1.4648        | 4.12  | 1302 | 1.7181          |
| 1.4121        | 4.32  | 1364 | 1.7272          |
| 1.4727        | 4.52  | 1426 | 1.7259          |
| 1.4219        | 4.72  | 1488 | 1.7240          |
| 1.5137        | 4.92  | 1550 | 1.7227          |

### Framework versions

- Transformers 4.37.0.dev0
- PyTorch 2.1.2+cu121
- Datasets 2.15.0
- Tokenizers 0.15.0

[Built with Axolotl](https://github.com/OpenAccess-AI-Collective/axolotl)
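
As a reading aid for the hyperparameters listed above, here is a minimal, untested sketch of how they map onto `transformers.TrainingArguments`. The actual run was configured through Axolotl rather than a script like this, and the output path is a placeholder, not a value from the original run.

```python
# Sketch only: maps the hyperparameters above onto transformers.TrainingArguments.
# The real run used Axolotl's config format; this is an illustrative translation.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./outputs",          # placeholder, not from the original run
    learning_rate=5e-6,
    per_device_train_batch_size=4,   # train_batch_size above
    per_device_eval_batch_size=4,    # eval_batch_size above
    seed=8,
    adam_beta1=0.9,                  # Adam betas/epsilon as listed above
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_steps=50,
    num_train_epochs=5,
)
# Under multi-GPU training with num_devices: 4, the effective batch size is
# 4 per device x 4 devices = 16, matching total_train_batch_size and
# total_eval_batch_size above.
```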