What datasets were these trained on?

by rombodawg - opened

Can you add the datasets this model was trained on to the model card so people know? I'd like to know what kind of model this is in general.

+1 … what is SMAUG?

Smaug is the dragon from The Hobbit, I believe. If you've read Tolkien. ))

We've now updated the model card to include the datasets that this model was trained on. It will still have many of the qualities of Meta-Llama, but in this finetune we have tried to improve its reasoning, math, and coding skills in particular.

More information on the exact technique/data will be released later on. For now, see the previous Smaug paper: https://arxiv.org/abs/2402.13228.

Hello, the DPOP method proposed in the Smaug paper is based on preference datasets. However, the datasets listed in the model card are SFT datasets. I was wondering how to convert the provided SFT datasets into preference datasets. Maybe by sampling responses from Llama-3-8B-instruct and scoring them with a reward model?
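
For what it's worth, the idea in the question (sample several responses per prompt, score them, keep the best and worst as a pair) could be sketched roughly as below. This is only a guess at one possible pipeline, not the method the Smaug authors used, which has not been released. The names `build_preference_pairs`, `sample_fn`, `score_fn`, `n_samples` and `min_margin` are all hypothetical, and the `prompt`/`response` keys are an assumed SFT record layout. In practice `sample_fn` would wrap generation from a model such as Llama-3-8B-instruct and `score_fn` would wrap a reward model; here both are toy stand-ins so the snippet runs on its own.

```python
from typing import Callable, Dict, List


def build_preference_pairs(
    sft_examples: List[Dict[str, str]],
    sample_fn: Callable[[str], str],
    score_fn: Callable[[str, str], float],
    n_samples: int = 4,
    min_margin: float = 0.0,
) -> List[Dict[str, str]]:
    """Turn SFT records into (prompt, chosen, rejected) records.

    All names here are hypothetical. sample_fn(prompt) returns one
    candidate response; score_fn(prompt, response) returns a scalar reward.
    """
    pairs = []
    for example in sft_examples:
        prompt = example["prompt"]
        # Draw candidates from the policy, then add the SFT reference answer
        # so that at least one known-good response is in the pool.
        candidates = [sample_fn(prompt) for _ in range(n_samples)]
        candidates.append(example["response"])

        # Rank candidates by reward, lowest first.
        scored = sorted(
            ((score_fn(prompt, c), c) for c in candidates),
            key=lambda item: item[0],
        )
        low_score, rejected = scored[0]
        high_score, chosen = scored[-1]

        # Skip prompts where the reward cannot separate best from worst.
        if high_score - low_score > min_margin:
            pairs.append(
                {"prompt": prompt, "chosen": chosen, "rejected": rejected}
            )
    return pairs


# Toy stand-ins (assumptions for illustration only): the "model" always
# answers "5", and the "reward model" gives 1.0 to the answer "4".
toy_sft = [{"prompt": "2+2=", "response": "4"}]
toy_pairs = build_preference_pairs(
    toy_sft,
    sample_fn=lambda prompt: "5",
    score_fn=lambda prompt, response: 1.0 if response == "4" else 0.0,
)
```

The output uses `prompt`/`chosen`/`rejected` keys, which is a common layout for preference data. Whether the reference answer should always be eligible as `chosen`, and how large `min_margin` should be, are design choices that would depend on how trustworthy the reward model is.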
