v000000 committed
Commit de054df (1 parent: b6e9e72)

Update README.md

Files changed (1): README.md (+15 -6)
README.md CHANGED
````diff
@@ -29,15 +29,20 @@ datasets:
 
 ![image/png](https://cdn-uploads.huggingface.co/production/uploads/64f74b6e6389380c77562762/lyRa7z5maTqAaa43sxC2J.png)
 
-* <b>Direct Preference Optimization run</b>
+* <b>2x Experts working together per token, Gutenberg novelwriting finetuning.</b>
 
-*llama.cpp*
+*List of llama.cpp repos*
 
-# Thanks QuantFactory for the quants:
+# Thanks mradermacher (GGUF):
+
+* [GGUF static](https://huggingface.co/mradermacher/L3.1-Celestial-Stone-2x8B-DPO-GGUF)
+* [GGUF Imatrix](https://huggingface.co/mradermacher/L3.1-Celestial-Stone-2x8B-DPO-i1-GGUF)
+
+# Thanks QuantFactory (GGUF):
 
 * [GGUF static](https://huggingface.co/QuantFactory/L3.1-Celestial-Stone-2x8B-DPO-GGUF)
 
-# Thanks Triangle104 for the quants:
+# Thanks Triangle104 (GGUF):
 
 * [Q8_0](https://huggingface.co/Triangle104/L3.1-Celestial-Stone-2x8B-DPO-Q8_0-GGUF)
 * [Q6_K](https://huggingface.co/Triangle104/L3.1-Celestial-Stone-2x8B-DPO-Q6_K-GGUF)
@@ -52,7 +57,9 @@ datasets:
 
 0.5 Epoch completed of dataset [jondurbin/gutenberg-dpo-v0.1](https://huggingface.co/datasets/jondurbin/gutenberg-dpo-v0.1) with learning_rate=8e-6
 
-Result seems pretty good. More compliant and verbose, less sloppy and safety aligned.
+Result seems pretty good even with half epoch and low learning rate, the effect is smoother and less pronounced.
+
+Outputs are more compliant and verbose, less sloppy and safety aligned.
 
 ------------------------------------------------------------------------------
 
@@ -78,4 +85,6 @@ Result seems pretty good. More compliant and verbose, less sloppy and safety ali
 
 {output}<|eot_id|>
 
-```
+```
+
+*For Llama.cpp/LMStudio/etc Make sure num_experts_used = 2*
````
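The `num_experts_used = 2` note added above is an LM Studio setting name; in plain llama.cpp the equivalent is usually a GGUF metadata override at load time. A hedged sketch only: the `--override-kv` flag exists in llama.cpp, but the exact key (`llama.expert_used_count`) and the model filename here are assumptions, so confirm against your build's `--help` output.

```shell
# Hypothetical llama-cli invocation forcing 2 active experts per token
# for a GGUF quant of this 2x8B MoE merge.
# The metadata key and model filename are assumptions; verify locally.
./llama-cli \
  -m L3.1-Celestial-Stone-2x8B-DPO.Q6_K.gguf \
  --override-kv llama.expert_used_count=int:2 \
  -p "Write the opening paragraph of a gothic novel."
```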
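The diff's training note (0.5 epoch of gutenberg-dpo-v0.1 at learning_rate=8e-6) refers to a Direct Preference Optimization run. As a minimal sketch of the objective, not the author's actual training code, the per-pair DPO loss compares how strongly the policy prefers the chosen completion over the rejected one relative to a frozen reference model:

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one preference pair.

    Each argument is the summed token log-probability of the chosen or
    rejected completion under the policy or the frozen reference model.
    beta=0.1 is a common default, not a value stated in this README.
    """
    # Implicit reward margin: how much more the policy prefers the chosen
    # completion than the reference model does, scaled by beta.
    logits = beta * ((policy_chosen_logp - ref_chosen_logp)
                     - (policy_rejected_logp - ref_rejected_logp))
    # -log(sigmoid(logits)): shrinks as the policy favors the chosen answer.
    return -math.log(1.0 / (1.0 + math.exp(-logits)))

# When policy and reference agree exactly, the margin is 0 and the
# loss is log(2) ~= 0.6931; training pushes it below that.
print(round(dpo_loss(-10.0, -12.0, -10.0, -12.0), 4))  # 0.6931
```

A short run at a low learning rate, as described above, nudges these margins gently, which matches the README's observation that the effect is smoother and less pronounced.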