MarinaraSpaghetti committed
Commit c15327a
1 Parent(s): a4e1f5e

Update README.md

Files changed (1): README.md (+98, -59)
README.md CHANGED
---
base_model: []
library_name: transformers
tags:
- mergekit
- merge
---

![image/jpeg](https://cdn-uploads.huggingface.co/production/uploads/6550b16f7490049d6237f200/p0ZBFYc1RNoYcowv3Nj40.jpeg)

![image/png](https://cdn-uploads.huggingface.co/production/uploads/6550b16f7490049d6237f200/peSTK513q1WOfRfRzWga4.png)

# Information
## Details

Improved NemoRemix for storytelling and roleplay. Plus, this one doubles as a general assistant model. The prose is pretty much the same, but the model was made smarter thanks to the addition of Migtissera's excellent Tess model. I yeeted out Gryphe's Pantheon-RP, though, because it was trained with asterisks in mind, unlike the rest of the models in the merge, which caused it to mess up the formatting from time to time; this one doesn't do that anymore. Hooray! All credit and thanks go to the amazing Migtissera, MistralAI, Anthracite, Sao10K, and ShuttleAI for their wonderful models.

## Instruct

ChatML, but Mistral Instruct should work too (theoretically).

```
<|im_start|>system
{system}<|im_end|>
<|im_start|>user
{message}<|im_end|>
<|im_start|>assistant
{response}<|im_end|>
```
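
If you run the model with Transformers, the turns above map directly onto the tokenizer's chat template. A minimal sketch, assuming the repo ID `MarinaraSpaghetti/NemoReRemix-12B` and that the shipped tokenizer carries this ChatML template:

```python
# Minimal sketch: prompting through the ChatML template with transformers.
# Assumes the repo id "MarinaraSpaghetti/NemoReRemix-12B" (not confirmed in
# this card) and that the tokenizer ships the ChatML template shown above.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "MarinaraSpaghetti/NemoReRemix-12B"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [
    {"role": "system", "content": "You are a creative storytelling assistant."},
    {"role": "user", "content": "Open a space-opera heist in one paragraph."},
]

# apply_chat_template renders the <|im_start|>/<|im_end|> turns and, with
# add_generation_prompt=True, appends the assistant header for the model to fill.
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=200, do_sample=True, temperature=1.0)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```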

## Parameters

I recommend running Temperature 1.0-1.2 with 0.1 Top A or 0.01-0.1 Min P, and with DRY at 0.8/1.75/2/0 (multiplier/base/allowed length/penalty range). It also works with Temperatures below 1.0. Nothing more is needed.
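
In plain Transformers terms, only Temperature and Min P carry over directly; Top A and DRY are sampler options in frontends and backends like SillyTavern or koboldcpp, not stock `generate()` arguments. A minimal sketch of the transferable part (min_p needs a recent transformers release):

```python
from transformers import GenerationConfig

# Minimal sketch of the recommended samplers that stock transformers supports.
# Top A and DRY are not standard generate() options; configure those in your
# frontend (e.g. SillyTavern) instead.
gen_config = GenerationConfig(
    do_sample=True,
    temperature=1.1,   # recommended range: 1.0-1.2
    min_p=0.05,        # recommended range: 0.01-0.1
    max_new_tokens=256,
)
# usage: model.generate(inputs, generation_config=gen_config)
```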

### Settings

You can use my exact settings from here (use the ones from the ChatML Base/Customized folder): https://huggingface.co/MarinaraSpaghetti/SillyTavern-Settings/tree/main.

## GGUF

https://huggingface.co/MarinaraSpaghetti/NemoReRemix-GGUF
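
For the GGUF quants, a minimal llama-cpp-python sketch; the filename below is a placeholder for whichever quant you download, and recent llama-cpp-python builds should pick up the ChatML template from the GGUF metadata:

```python
from llama_cpp import Llama

# Minimal sketch: running a GGUF quant with llama-cpp-python.
# "NemoReRemix-12B.Q6_K.gguf" is a placeholder; substitute the file you downloaded.
llm = Llama(model_path="NemoReRemix-12B.Q6_K.gguf", n_ctx=8192)

out = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "You are a creative storytelling assistant."},
        {"role": "user", "content": "Give me a one-line story hook."},
    ],
    temperature=1.1,
    max_tokens=128,
)
print(out["choices"][0]["message"]["content"])
```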

# NemoReRemix-12B

This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).

## Merge Details
### Merge Method

This model was merged using the della_linear merge method, with E:\mergekit\mistralaiMistral-Nemo-Base-2407 as the base.

### Models Merged

The following models were included in the merge:
* E:\mergekit\Sao10K_MN-12B-Lyra-v1
* E:\mergekit\mistralaiMistral-Nemo-Instruct-2407
* E:\mergekit\migtissera_Tess-3-Mistral-Nemo
* E:\mergekit\shuttleai_shuttle-2.5-mini
* E:\mergekit\anthracite-org_magnum-12b-v2

### Configuration

The following YAML configuration was used to produce this model:

```yaml
models:
  - model: E:\mergekit\mistralaiMistral-Nemo-Instruct-2407
    parameters:
      weight: 0.1
      density: 0.4
  - model: E:\mergekit\Sao10K_MN-12B-Lyra-v1
    parameters:
      weight: 0.12
      density: 0.5
  - model: E:\mergekit\shuttleai_shuttle-2.5-mini
    parameters:
      weight: 0.2
      density: 0.6
  - model: E:\mergekit\migtissera_Tess-3-Mistral-Nemo
    parameters:
      weight: 0.25
      density: 0.7
  - model: E:\mergekit\anthracite-org_magnum-12b-v2
    parameters:
      weight: 0.33
      density: 0.8
merge_method: della_linear
base_model: E:\mergekit\mistralaiMistral-Nemo-Base-2407
parameters:
  epsilon: 0.05
  lambda: 1
dtype: bfloat16
```
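
To reproduce the merge, save the config above (e.g. as `nemoreremix.yaml`) and hand it to mergekit. A minimal sketch, assuming `pip install mergekit` and that the `E:\mergekit\...` paths exist locally (or are swapped for the corresponding Hugging Face repo IDs):

```python
# Minimal sketch: run the della_linear merge via mergekit's CLI entry point.
# Assumes mergekit is installed and the YAML above is saved as "nemoreremix.yaml";
# the E:\mergekit\... paths must exist locally or be replaced with HF repo ids.
import subprocess

subprocess.run(
    [
        "mergekit-yaml",       # CLI installed with mergekit
        "nemoreremix.yaml",    # the configuration shown above
        "./NemoReRemix-12B",   # output directory for the merged model
        "--cuda",              # optional: use a GPU for the merge
    ],
    check=True,
)
```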

# Ko-fi
## Enjoying what I do? Consider donating here, thank you!

https://ko-fi.com/spicy_marinara