v000000 committed
Commit de054df (1 parent: b6e9e72)

Update README.md

Files changed (1): README.md (+15 -6)
README.md CHANGED
````diff
@@ -29,15 +29,20 @@ datasets:
 
 ![image/png](https://cdn-uploads.huggingface.co/production/uploads/64f74b6e6389380c77562762/lyRa7z5maTqAaa43sxC2J.png)
 
-* <b>Direct Preference Optimization run</b>
+* <b>2x Experts working together per token, Gutenberg novelwriting finetuning.</b>
 
-*llama.cpp*
+*List of llama.cpp repos*
 
-# Thanks QuantFactory for the quants:
+# Thanks mradermacher (GGUF):
+
+* [GGUF static](https://huggingface.co/mradermacher/L3.1-Celestial-Stone-2x8B-DPO-GGUF)
+* [GGUF Imatrix](https://huggingface.co/mradermacher/L3.1-Celestial-Stone-2x8B-DPO-i1-GGUF)
+
+# Thanks QuantFactory (GGUF):
 
 * [GGUF static](https://huggingface.co/QuantFactory/L3.1-Celestial-Stone-2x8B-DPO-GGUF)
 
-# Thanks Triangle104 for the quants:
+# Thanks Triangle104 (GGUF):
 
 * [Q8_0](https://huggingface.co/Triangle104/L3.1-Celestial-Stone-2x8B-DPO-Q8_0-GGUF)
 * [Q6_K](https://huggingface.co/Triangle104/L3.1-Celestial-Stone-2x8B-DPO-Q6_K-GGUF)
@@ -52,7 +57,9 @@ datasets:
 
 0.5 Epoch completed of dataset [jondurbin/gutenberg-dpo-v0.1](https://huggingface.co/datasets/jondurbin/gutenberg-dpo-v0.1) with learning_rate=8e-6
 
-Result seems pretty good. More compliant and verbose, less sloppy and safety aligned.
+Result seems pretty good even with half epoch and low learning rate, the effect is smoother and less pronounced.
+
+Outputs are more compliant and verbose, less sloppy and safety aligned.
 
 ------------------------------------------------------------------------------
 
@@ -78,4 +85,6 @@ Result seems pretty good. More compliant and verbose, less sloppy and safety ali
 
 {output}<|eot_id|>
 
-```
+```
+
+*For Llama.cpp/LMStudio/etc Make sure num_experts_used = 2*
````
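The `num_experts_used = 2` note added above is an LM Studio setting name; in plain llama.cpp the equivalent is usually a GGUF metadata override at load time. A hedged sketch only: the `--override-kv` flag exists in llama.cpp, but the exact key (`llama.expert_used_count`) and the model filename here are assumptions, so confirm against your build's `--help` output.

```shell
# Hypothetical llama-cli invocation forcing 2 active experts per token
# for a GGUF quant of this 2x8B MoE merge.
# The metadata key and model filename are assumptions; verify locally.
./llama-cli \
  -m L3.1-Celestial-Stone-2x8B-DPO.Q6_K.gguf \
  --override-kv llama.expert_used_count=int:2 \
  -p "Write the opening paragraph of a gothic novel."
```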
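The diff's training note (0.5 epoch of gutenberg-dpo-v0.1 at learning_rate=8e-6) refers to a Direct Preference Optimization run. As a minimal sketch of the objective, not the author's actual training code, the per-pair DPO loss compares how strongly the policy prefers the chosen completion over the rejected one relative to a frozen reference model:

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one preference pair.

    Each argument is the summed token log-probability of the chosen or
    rejected completion under the policy or the frozen reference model.
    beta=0.1 is a common default, not a value stated in this README.
    """
    # Implicit reward margin: how much more the policy prefers the chosen
    # completion than the reference model does, scaled by beta.
    logits = beta * ((policy_chosen_logp - ref_chosen_logp)
                     - (policy_rejected_logp - ref_rejected_logp))
    # -log(sigmoid(logits)): shrinks as the policy favors the chosen answer.
    return -math.log(1.0 / (1.0 + math.exp(-logits)))

# When policy and reference agree exactly, the margin is 0 and the
# loss is log(2) ~= 0.6931; training pushes it below that.
print(round(dpo_loss(-10.0, -12.0, -10.0, -12.0), 4))  # 0.6931
```

A short run at a low learning rate, as described above, nudges these margins gently, which matches the README's observation that the effect is smoother and less pronounced.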