PEFT
Safetensors
qwen2
alignment-handbook
trl
dpo
Generated from Trainer
File size: 232 Bytes
e965a13
{
    "epoch": 0.9995419147961521,
    "total_flos": 0.0,
    "train_loss": 0.4306906612229719,
    "train_runtime": 8689.0202,
    "train_samples": 34924,
    "train_samples_per_second": 4.019,
    "train_steps_per_second": 0.126
}
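
The reported rates can be cross-checked against each other with a little arithmetic. The sketch below is only an illustration: the metric values are copied from the JSON above, while the helper name `derive` and the derived field names are hypothetical. Because the two throughput figures are rounded to three decimals, every derived number is approximate, and reading the samples-per-step ratio as an effective batch size is an inference rather than something the file states.

```python
import json

# Metrics copied verbatim from the JSON above.
TRAIN_RESULTS = """
{
    "epoch": 0.9995419147961521,
    "total_flos": 0.0,
    "train_loss": 0.4306906612229719,
    "train_runtime": 8689.0202,
    "train_samples": 34924,
    "train_samples_per_second": 4.019,
    "train_steps_per_second": 0.126
}
"""


def derive(metrics):
    """Compute approximate quantities implied by the reported rates.

    `derive` and the keys it returns are hypothetical names used for
    illustration; they are not part of any library's output.
    """
    runtime = metrics["train_runtime"]  # seconds
    samples_rate = metrics["train_samples_per_second"]
    steps_rate = metrics["train_steps_per_second"]
    return {
        # Wall-clock training time in hours.
        "runtime_hours": runtime / 3600,
        # Ratio of the two rates; roughly the samples consumed per optimizer step.
        "approx_samples_per_step": samples_rate / steps_rate,
        # Rate multiplied by runtime; rounding in the rate limits precision.
        "approx_steps": steps_rate * runtime,
        "approx_samples_seen": samples_rate * runtime,
    }


derived = derive(json.loads(TRAIN_RESULTS))
for name, value in derived.items():
    print(f"{name}: {value:.2f}")
```

The approximate sample count lands within a few samples of `train_samples`, which is consistent with the run covering close to one epoch.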