vxbrandon committed on
Commit
95c9e3c
1 Parent(s): 1e86de9

End of training

README.md CHANGED
@@ -4,18 +4,18 @@ base_model: meta-llama/Llama-2-7b-hf
 tags:
 - generated_from_trainer
 model-index:
-- name: sparse_llama_7b_hf2_refined_web_50p_2024-05-11
+- name: sparse_llama_7b_hf2_refined_web_50p_2024-05-12
   results: []
 ---
 
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 should probably proofread and complete it, then remove this comment. -->
 
-# sparse_llama_7b_hf2_refined_web_50p_2024-05-11
+# sparse_llama_7b_hf2_refined_web_50p_2024-05-12
 
 This model is a fine-tuned version of [meta-llama/Llama-2-7b-hf](https://huggingface.co/meta-llama/Llama-2-7b-hf) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 2.2840
+- Loss: 2.3152
 
 ## Model description
 
@@ -43,26 +43,10 @@ The following hyperparameters were used during training:
 - total_train_batch_size: 8
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
-- training_steps: 350
+- training_steps: 10
 
 ### Training results
 
-| Training Loss | Epoch | Step | Validation Loss |
-|:-------------:|:-----:|:----:|:---------------:|
-| 2.201         | 0.0   | 25   | 2.2172          |
-| 2.2379        | 0.0   | 50   | 2.2154          |
-| 2.1411        | 0.01  | 75   | 2.2137          |
-| 2.1523        | 0.01  | 100  | 2.2125          |
-| 2.5823        | 0.01  | 125  | 2.2103          |
-| 2.2672        | 0.01  | 150  | 2.2063          |
-| 2.3044        | 0.01  | 175  | 2.2036          |
-| 2.2119        | 0.02  | 200  | 2.2012          |
-| 2.1888        | 0.02  | 225  | 2.2004          |
-| 2.1592        | 0.02  | 250  | 2.1981          |
-| 2.2455        | 0.02  | 275  | 2.1972          |
-| 2.0666        | 0.02  | 300  | 2.1972          |
-| 2.322         | 0.03  | 325  | 2.1967          |
-| 2.2689        | 0.03  | 350  | 2.1946          |
 
 
 ### Framework versions
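The card lists `lr_scheduler_type: linear`, with `training_steps` cut from 350 to 10 in this commit. A minimal sketch of how such a linear schedule typically behaves (the function name and the optional warmup handling are illustrative assumptions, not the Trainer's internals):

```python
def linear_lr(step: int, base_lr: float, total_steps: int, warmup_steps: int = 0) -> float:
    """Optional linear warmup, then linear decay from base_lr down to zero."""
    if warmup_steps and step < warmup_steps:
        return base_lr * step / warmup_steps
    remaining = max(0, total_steps - step)
    span = max(1, total_steps - warmup_steps)
    return base_lr * remaining / span

# With training_steps=10 (this commit's value) and no warmup, the learning
# rate falls linearly from base_lr at step 0 to 0 at step 10.
schedule = [round(linear_lr(s, 1e-4, 10), 6) for s in range(11)]
```

With so few steps, each optimizer update drops the learning rate by a tenth of its starting value, which is one plausible reason the evaluation loss here (2.3152) is higher than the 350-step run's 2.2840.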
model-00001-of-00003.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:1a4fd2a79d8ffbe808d2c2ea4d0bd40d42d0124b1ee869536fe03b85f39d8009
+oid sha256:c298abed023a26104e5746bd2c50b57ba3700f70b77ea4956a5c9fe5c99ec1ef
 size 4938985352
model-00002-of-00003.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:3ad5d08f9d9201c4efc073088ffaf7916d335b674dee7649ef51aa8d57046323
+oid sha256:dcb6b3532206e4fa19bff1a8b7b7359158fd4c73244833e28f216cda50508526
 size 4947390880
model-00003-of-00003.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:34edc553f00682dc58972738c70149fbaef17576b1f28e9fad845e55f2857b71
+oid sha256:0d8fb8d6984a0a5ca91ca2071cdcc88092dc763034c47bd2c111f7ee5ac706cd
 size 3590488816
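Each changed weight file above is a Git LFS pointer: a small text stub recording the spec version, the SHA-256 of the actual blob, and its byte size, so only the hashes change in the diff while the shard sizes stay identical. A minimal sketch of parsing one such pointer and sanity-checking the shard sizes from this commit (the helper function is hypothetical, not part of this repo):

```python
def parse_lfs_pointer(text: str) -> dict:
    """Split each 'key value' line of a git-lfs pointer file into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields

pointer = """\
version https://git-lfs.github.com/spec/v1
oid sha256:c298abed023a26104e5746bd2c50b57ba3700f70b77ea4956a5c9fe5c99ec1ef
size 4938985352
"""
fields = parse_lfs_pointer(pointer)

# The three shard sizes shown in this commit:
total_bytes = 4938985352 + 4947390880 + 3590488816  # 13476865048 bytes

# At 2 bytes per parameter (16-bit weights) this is roughly 6.7e9
# parameters, consistent with a Llama-2-7B checkpoint.
approx_params = total_bytes // 2
```

The unchanged `size` lines are a quick way to confirm a commit only swapped weight values and did not alter the model architecture or sharding layout.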