mlabonne committed on
Commit d2512d4
1 Parent(s): b6c2fce

Create laserRMT.log.

Files changed (1): laserRMT.log. (+128 -0)

Downloading shards: 100% 3/3 [00:41<00:00, 13.87s/it]
Loading checkpoint shards: 100% 3/3 [00:07<00:00, 2.53s/it]
generation_config.json: 100% 115/115 [00:00<00:00, 575kB/s]
tokenizer_config.json: 100% 1.60k/1.60k [00:00<00:00, 8.48MB/s]
tokenizer.model: 100% 493k/493k [00:00<00:00, 22.9MB/s]
tokenizer.json: 100% 1.80M/1.80M [00:00<00:00, 7.43MB/s]
added_tokens.json: 100% 51.0/51.0 [00:00<00:00, 283kB/s]
special_tokens_map.json: 100% 420/420 [00:00<00:00, 1.74MB/s]
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Reconstructing layer: model.layers.25.mlp.down_proj
Reduced from torch.Size([4096]) to 3607
Layer mlp.down_proj_25 has already been modified. Skipping.
Restored original weights for layer: model.layers.25.mlp.down_proj
Reconstructing layer: model.layers.25.mlp.down_proj
Reduced from torch.Size([4096]) to 3607
Restored original weights for layer: model.layers.25.mlp.down_proj
['.31.', '.30.', '.29.', '.28.', '.27.', '.26.', '.25.', '.24.', '.23.', '.22.', '.21.', '.20.', '.19.', '.18.', '.17.', '.16.', '.15.', '.14.', '.13.', '.12.', '.11.', '.10.', '.9.', '.8.', '.7.', '.6.', '.5.', '.4.', '.3.', '.2.', '.1.', '.0.']
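
The "Reduced from torch.Size([...]) to N" lines report how many singular values of a layer's weight matrix survive truncation, and the bracketed list is the top-down order in which layers are scanned. A minimal sketch of this kind of rank reduction, using NumPy for illustration — `reduce_rank` and its simple largest-singular-value cutoff are stand-ins; laserRMT derives its cutoff from random matrix theory (a Marchenko-Pastur-style threshold), which this sketch does not reproduce:

```python
import numpy as np

def reduce_rank(W, noise_ratio=0.9):
    """Zero out small singular values of a weight matrix.

    Hypothetical illustration: keeps singular values above a fraction
    of the largest one, then rebuilds the low-rank approximation.
    """
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    keep = S > S.max() * (1.0 - noise_ratio)  # S is sorted descending
    k = int(keep.sum())                       # number of values kept
    W_approx = (U[:, :k] * S[:k]) @ Vt[:k, :]
    return W_approx, k

rng = np.random.default_rng(0)
W = rng.standard_normal((64, 64))
W_hat, k = reduce_rank(W)
print(f"Reduced from {W.shape[0]} to {k}")
```

The log message "Reduced from torch.Size([4096]) to 3607" corresponds to the same idea at model scale: 4096 singular values before truncation, 3607 kept.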
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
avg_loss = 2.1474520114478235: 100% 871/871 [00:46<00:00, 18.55it/s]
/usr/local/lib/python3.10/dist-packages/huggingface_hub/repocard.py:105: UserWarning: Repo card metadata block was not found. Setting CardData to empty.
  warnings.warn("Repo card metadata block was not found. Setting CardData to empty.")
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
avg_loss = 9.703152929898351: 100% 256/256 [00:13<00:00, 18.83it/s]
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
avg_loss = 13.355979550516967: 100% 264/264 [00:14<00:00, 18.66it/s]
==================================================
The initial perplexity of the model is 12.614558219909668
==================================================
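
The three `avg_loss` bars are cross-entropy losses over three calibration datasets, and the banner reports a single baseline perplexity. In general, perplexity is the exponential of the mean negative log-likelihood; how laserRMT weights the three datasets into the one number it prints is not shown in this log, so the sketch below only illustrates the generic relation:

```python
import math

def perplexity(nll_values):
    """Perplexity = exp(mean negative log-likelihood).

    Illustrative only: the weighting laserRMT applies across its
    calibration datasets is not visible in the log output.
    """
    avg_nll = sum(nll_values) / len(nll_values)
    return math.exp(avg_nll)

print(perplexity([2.0, 3.0, 2.5]))
```

A lower average loss therefore maps directly to a lower perplexity, which is why the search below compares perplexities rather than raw losses.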
Reconstructing layer: model.layers.31.mlp.down_proj
Reduced from torch.Size([4096]) to 3753
avg_loss = 2.150142833641832: 100% 871/871 [00:46<00:00, 18.75it/s]
avg_loss = 9.714343913365155: 100% 256/256 [00:13<00:00, 18.74it/s]
avg_loss = 13.374103391260812: 100% 264/264 [00:14<00:00, 18.43it/s]
Restored original weights for layer: model.layers.31.mlp.down_proj
Reconstructing layer: model.layers.31.mlp.up_proj
Reduced from torch.Size([4096]) to 3717
avg_loss = 2.1734046262660063: 100% 871/871 [00:46<00:00, 18.57it/s]
avg_loss = 9.82143080001697: 100% 256/256 [00:13<00:00, 18.57it/s]
avg_loss = 13.477815985228077: 100% 264/264 [00:14<00:00, 18.20it/s]
Restored original weights for layer: model.layers.31.mlp.up_proj
Reconstructing layer: model.layers.31.self_attn.q_proj
Reduced from torch.Size([4096]) to 818
avg_loss = 2.148138916040808: 100% 871/871 [00:46<00:00, 18.53it/s]
avg_loss = 9.705221582669765: 100% 256/256 [00:13<00:00, 18.62it/s]
avg_loss = 13.35540055280382: 100% 264/264 [00:14<00:00, 18.71it/s]
**************************************************
Improved perplexity found: 12.613171577453613 for layer self_attn.q_proj .31.. Total modifications is 1
**************************************************
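
The pattern the log records from here on is a greedy scan: each candidate layer is rank-reduced, perplexity is re-measured, and the change is kept only when perplexity drops ("Improved perplexity found"); otherwise the layer is reverted ("Restored original weights"). A hedged sketch of that loop — `measure_perplexity` and `reduce_rank` are stand-ins, not laserRMT's actual API:

```python
def greedy_laser_scan(layer_names, weights, measure_perplexity, reduce_rank):
    """Greedy accept/revert scan over candidate layers.

    Keeps a rank-reduced layer only if it lowers perplexity,
    mirroring the accept/restore messages in the log above.
    """
    best_ppl = measure_perplexity(weights)
    modifications = 0
    for name in layer_names:
        original = weights[name]
        weights[name] = reduce_rank(original)   # try the reduced layer
        ppl = measure_perplexity(weights)
        if ppl < best_ppl:                      # improvement: keep it
            best_ppl = ppl
            modifications += 1
            print(f"Improved perplexity found: {ppl} for layer {name}. "
                  f"Total modifications is {modifications}")
        else:                                   # regression: revert
            weights[name] = original
            print(f"Restored original weights for layer: {name}")
    return best_ppl, modifications

# Toy run with scalar "weights" and stand-in metric/reduction functions.
toy_weights = {"a": 10.0, "b": 5.0}
final_ppl, mods = greedy_laser_scan(
    ["a", "b"], toy_weights,
    measure_perplexity=lambda w: sum(w.values()),
    reduce_rank=lambda x: x * 0.5,
)
```

Because every rejected candidate is fully restored before the next one is tried, the running count ("Total modifications is N") only increments on accepted layers.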
49
+ Reconstructing layer: model.layers.31.self_attn.k_proj
50
+ Reduced from torch.Size([1024]) to 524
51
+ avg_loss = 2.1553964071514686: 100% 871/871 [00:46<00:00, 18.71it/s]
52
+ avg_loss = 9.734999645967036: 100% 256/256 [00:13<00:00, 18.84it/s]
53
+ avg_loss = 13.383289175954731: 100% 264/264 [00:14<00:00, 18.51it/s]
54
+ Restored original weights for layer: model.layers.31.self_attn.k_proj
55
+ Reconstructing layer: model.layers.31.self_attn.v_proj
56
+ Reduced from torch.Size([1024]) to 846
57
+ avg_loss = 2.1430855287339465: 100% 871/871 [00:46<00:00, 18.78it/s]
58
+ avg_loss = 9.666598222218454: 100% 256/256 [00:13<00:00, 18.74it/s]
59
+ avg_loss = 13.313674368641593: 100% 264/264 [00:14<00:00, 18.69it/s]
60
+ **************************************************
61
+ Improved perplexity found: 12.513681411743164 for layer self_attn.v_proj .31.. Total modifications is 2
62
+ **************************************************
63
+ Reconstructing layer: model.layers.31.self_attn.o_proj
64
+ Reduced from torch.Size([4096]) to 834
65
+ avg_loss = 2.1483869746960402: 100% 871/871 [00:47<00:00, 18.46it/s]
66
+ avg_loss = 9.686229056213051: 100% 256/256 [00:13<00:00, 18.78it/s]
67
+ avg_loss = 13.344844787861362: 100% 264/264 [00:14<00:00, 18.56it/s]
68
+ Restored original weights for layer: model.layers.31.self_attn.o_proj
69
+ Reconstructing layer: model.layers.30.mlp.down_proj
70
+ Reduced from torch.Size([4096]) to 3770
71
+ avg_loss = 2.1505854418576105: 100% 871/871 [00:47<00:00, 18.34it/s]
72
+ avg_loss = 9.6962159560062: 100% 256/256 [00:13<00:00, 18.63it/s]
73
+ avg_loss = 13.353956826256983: 100% 264/264 [00:14<00:00, 18.49it/s]
74
+ Restored original weights for layer: model.layers.30.mlp.down_proj
75
+ Reconstructing layer: model.layers.30.mlp.up_proj
76
+ Reduced from torch.Size([4096]) to 3787
77
+ avg_loss = 2.148582770547965: 100% 871/871 [00:47<00:00, 18.34it/s]
78
+ avg_loss = 9.686316559556872: 100% 256/256 [00:13<00:00, 18.59it/s]
79
+ avg_loss = 13.34067751738158: 100% 264/264 [00:14<00:00, 18.81it/s]
80
+ Restored original weights for layer: model.layers.30.mlp.up_proj
81
+ Reconstructing layer: model.layers.30.self_attn.q_proj
82
+ Reduced from torch.Size([4096]) to 819
83
+ avg_loss = 2.1425534111760927: 100% 871/871 [00:47<00:00, 18.40it/s]
84
+ avg_loss = 9.664284548722208: 100% 256/256 [00:13<00:00, 18.49it/s]
85
+ avg_loss = 13.309857179721197: 100% 264/264 [00:14<00:00, 18.63it/s]
86
+ **************************************************
87
+ Improved perplexity found: 12.504617691040039 for layer self_attn.q_proj .30.. Total modifications is 3
88
+ **************************************************
89
+ Reconstructing layer: model.layers.30.self_attn.k_proj
90
+ Reduced from torch.Size([1024]) to 524
91
+ avg_loss = 2.1449567824088884: 100% 871/871 [00:47<00:00, 18.51it/s]
92
+ avg_loss = 9.675114367622882: 100% 256/256 [00:13<00:00, 18.56it/s]
93
+ avg_loss = 13.32237600783507: 100% 264/264 [00:14<00:00, 18.72it/s]
94
+ Restored original weights for layer: model.layers.30.self_attn.k_proj
95
+ Reconstructing layer: model.layers.30.self_attn.v_proj
96
+ Reduced from torch.Size([1024]) to 812
97
+ avg_loss = 2.155356107294628: 100% 871/871 [00:47<00:00, 18.48it/s]
98
+ avg_loss = 9.7138080005534: 100% 256/256 [00:13<00:00, 18.37it/s]
99
+ avg_loss = 13.366635067444859: 100% 264/264 [00:14<00:00, 18.33it/s]
100
+ Restored original weights for layer: model.layers.30.self_attn.v_proj
101
+ Reconstructing layer: model.layers.30.self_attn.o_proj
102
+ Reduced from torch.Size([4096]) to 859
103
+ avg_loss = 2.146158002821641: 100% 871/871 [00:47<00:00, 18.33it/s]
104
+ avg_loss = 9.676836102735251: 100% 256/256 [00:13<00:00, 18.43it/s]
105
+ avg_loss = 13.318221795287998: 100% 264/264 [00:14<00:00, 18.33it/s]
106
+ Restored original weights for layer: model.layers.30.self_attn.o_proj
107
+ Reconstructing layer: model.layers.29.mlp.down_proj
108
+ Reduced from torch.Size([4096]) to 3763
109
+ avg_loss = 2.1450509054652587: 100% 871/871 [00:47<00:00, 18.35it/s]
110
+ avg_loss = 9.6743658403866: 100% 256/256 [00:14<00:00, 18.21it/s]
111
+ avg_loss = 13.321742536895202: 100% 264/264 [00:14<00:00, 18.19it/s]
112
+ Restored original weights for layer: model.layers.29.mlp.down_proj
113
+ Reconstructing layer: model.layers.29.mlp.up_proj
114
+ Reduced from torch.Size([4096]) to 3828
115
+ avg_loss = 2.1408350525165125: 100% 871/871 [00:47<00:00, 18.21it/s]
116
+ avg_loss = 9.65894997306168: 100% 256/256 [00:14<00:00, 18.26it/s]
117
+ avg_loss = 13.306687997146087: 100% 264/264 [00:14<00:00, 18.31it/s]
118
+ **************************************************
119
+ Improved perplexity found: 12.497097969055176 for layer mlp.up_proj .29.. Total modifications is 4
120
+ **************************************************
121
+ Reconstructing layer: model.layers.29.self_attn.q_proj
122
+ Reduced from torch.Size([4096]) to 803
123
+ avg_loss = 2.1367383972238043: 100% 871/871 [00:47<00:00, 18.18it/s]
124
+ avg_loss = 9.641230288892984: 100% 256/256 [00:13<00:00, 18.36it/s]
125
+ avg_loss = 13.289274643767964: 100% 264/264 [00:14<00:00, 18.47it/s]
126
+ **************************************************
127
+ Improved perplexity found: 12.455863952636719 for layer self_attn.q_proj .29.. Total modifications is 5
128
+ **************************************************