Orpo-GutenLlama-3-8B-v2

Training Params

  • Learning Rate: 8e-6
  • Batch Size: 1
  • Eval Batch Size: 1
  • Gradient accumulation steps: 4
  • Epochs: 3
  • Training Loss: 0.88

Training time: 4 hours on 1x RTX 4090. This is a small, 1,800-sample fine-tune intended to get comfortable with ORPO fine-tuning before scaling up.
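As a rough sanity check, the parameters above imply the effective batch size and the number of optimizer steps for the run. The sketch below is only arithmetic on the listed values; it assumes all 1,800 samples were used for training (the card does not say whether any were held out for evaluation), and the variable names are illustrative rather than taken from the actual training script.

```python
import math

# Values copied from the training parameters listed above.
per_device_batch_size = 1
gradient_accumulation_steps = 4
epochs = 3
num_gpus = 1  # "1x4090"

# Assumption: every one of the 1,800 samples is in the training split.
num_train_samples = 1800

# Gradients are accumulated over several micro-batches before each update,
# so the batch the optimizer effectively sees is larger than the per-device batch.
effective_batch_size = per_device_batch_size * gradient_accumulation_steps * num_gpus

# One optimizer step consumes one effective batch.
steps_per_epoch = math.ceil(num_train_samples / effective_batch_size)
total_optimizer_steps = steps_per_epoch * epochs

print(effective_batch_size, steps_per_epoch, total_optimizer_steps)
# prints: 4 450 1350
```

Under these assumptions, 4 hours for 1,350 optimizer steps works out to roughly 10 to 11 seconds per step.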

Model size: 8.03B params (Safetensors, tensor type FP16)