sail's Collections
DICE
Updated 4 days ago
Self-alignment with DPO Implicit Rewards
Bootstrapping Language Models with DPO Implicit Rewards
Paper • 2406.09760 • Published Jun 14 • 37
sail/Llama-3-Base-8B-DICE-Iter1 • Text Generation • Updated 12 days ago • 3
sail/Llama-3-Base-8B-DICE-Iter2 • Text Generation • Updated 12 days ago • 2
sail/Zephyr-7B-DICE-Iter1 • Text Generation • Updated 12 days ago • 1
sail/Zephyr-7B-DICE-Iter2 • Text Generation • Updated 12 days ago • 3