v000000/L3.1-Celestial-Stone-2x8B-DPO

v000000 committed 3 days ago

Commit d063352 (1 parent: 201d517)

Update README.md

Files changed (1)

README.md +1 -1

@@ -57,7 +57,7 @@ datasets:
 0.5 Epoch completed of dataset [jondurbin/gutenberg-dpo-v0.1](https://huggingface.co/datasets/jondurbin/gutenberg-dpo-v0.1) with learning_rate=8e-6
-Result seems pretty good even with half epoch and low learning rate, the effect is smoother and less pronounced.
+Result seems pretty good even with half epoch and low learning rate, the effect is smoother and less pronounced but its probably not *optimal*.
 Outputs are more compliant and verbose, less sloppy and safety aligned.