
paligemma-adapter-new

This model is a fine-tuned version of google/paligemma-3b-pt-224 on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.9112 (validation loss at the final checkpoint, step 84100)

Model description

More information needed

Intended uses & limitations

More information needed
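
Until this section is filled in, the following is only a minimal sketch of how a PaliGemma fine-tune is typically loaded with Transformers. It rests on several assumptions that the card does not confirm: that the repository gokulsabari/paligemma-adapter-new hosts full model weights (if it only holds a PEFT adapter, it would need to be loaded through peft instead), that the processor of the base model google/paligemma-3b-pt-224 is the right one to use, and that the prompt format matches whatever was used in training. The helper name describe_image is hypothetical.

```python
# Repository names as they appear on this page; whether MODEL_ID contains
# full weights or only an adapter is an assumption, not confirmed by the card.
MODEL_ID = "gokulsabari/paligemma-adapter-new"
BASE_ID = "google/paligemma-3b-pt-224"


def describe_image(image_path, prompt, max_new_tokens=32):
    """Run one greedy generation pass for a single image and text prompt."""
    # Heavy imports are deferred so the module can be imported without them.
    import torch
    from PIL import Image
    from transformers import AutoProcessor, PaliGemmaForConditionalGeneration

    # Assumption: the base model's processor matches the fine-tuned weights.
    processor = AutoProcessor.from_pretrained(BASE_ID)
    model = PaliGemmaForConditionalGeneration.from_pretrained(
        MODEL_ID, torch_dtype=torch.bfloat16
    )
    model.eval()

    image = Image.open(image_path).convert("RGB")
    inputs = processor(text=prompt, images=image, return_tensors="pt")
    # Match the image tensor to the model's dtype (bfloat16 here).
    inputs["pixel_values"] = inputs["pixel_values"].to(model.dtype)
    prompt_len = inputs["input_ids"].shape[-1]

    with torch.no_grad():
        output = model.generate(
            **inputs, max_new_tokens=max_new_tokens, do_sample=False
        )
    # Drop the prompt tokens and decode only the newly generated ones.
    return processor.decode(output[0][prompt_len:], skip_special_tokens=True)
```

Calling describe_image("example.jpg", "caption en") would download both repositories on first use; the file name and prompt here are placeholders, since the card does not say which task or prompts the model was trained on.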

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 16
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 2
  • num_epochs: 50
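
As a rough illustration of how these values fit together, the sketch below recomputes total_train_batch_size and evaluates a linear schedule with warmup. It is a plain-Python approximation of the usual "linear" schedule, not the trainer's own code. Two inputs are assumptions rather than listed hyperparameters: a single training device, and 84100 total optimizer steps (the last step in the training results).

```python
# Hyperparameters copied from the list above.
LEARNING_RATE = 2e-05
TRAIN_BATCH_SIZE = 4
GRAD_ACCUM_STEPS = 4
WARMUP_STEPS = 2

# Assumptions not stated in the hyperparameter list:
NUM_DEVICES = 1       # single-device training
TOTAL_STEPS = 84100   # final step reported in the training results


def effective_batch_size(per_device, accum, devices=NUM_DEVICES):
    """Examples consumed per optimizer step."""
    return per_device * accum * devices


def linear_lr(step, base_lr=LEARNING_RATE, warmup=WARMUP_STEPS, total=TOTAL_STEPS):
    """Linear warmup from 0 to base_lr, then linear decay to 0 at `total`."""
    if step < warmup:
        return base_lr * step / max(1, warmup)
    return base_lr * max(0.0, (total - step) / max(1, total - warmup))


if __name__ == "__main__":
    print("effective batch size:", effective_batch_size(TRAIN_BATCH_SIZE, GRAD_ACCUM_STEPS))
    for step in (0, 1, 2, TOTAL_STEPS // 2, TOTAL_STEPS):
        print(step, linear_lr(step))
```

With only 2 warmup steps, the schedule is effectively a straight decay from 2e-05 to 0 over the whole run, so the learning rate in the later epochs is a small fraction of its starting value.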

Training results

Training Loss  Epoch    Step   Validation Loss
1.1672         0.9997   1682   1.1611
1.033          2.0      3365   1.0390
0.9638         2.9997   5047   0.9879
0.9611         4.0      6730   0.9610
0.9409         4.9997   8412   0.9443
0.9136         6.0      10095  0.9344
0.9081         6.9997   11777  0.9271
0.9128         8.0      13460  0.9226
0.8958         8.9997   15142  0.9195
0.91           10.0     16825  0.9177
0.9061         10.9997  18507  0.9157
0.9013         12.0     20190  0.9144
0.9005         12.9997  21872  0.9137
0.8874         14.0     23555  0.9130
0.9176         14.9997  25237  0.9127
0.8866         16.0     26920  0.9125
0.8978         16.9997  28602  0.9119
0.892          18.0     30285  0.9117
0.8945         18.9997  31967  0.9116
0.8908         20.0     33650  0.9115
0.8837         20.9997  35332  0.9115
0.8957         22.0     37015  0.9112
0.8887         22.9997  38697  0.9114
0.8962         24.0     40380  0.9114
0.899          24.9997  42062  0.9114
0.9024         26.0     43745  0.9112
0.8873         26.9997  45427  0.9112
0.9049         28.0     47110  0.9111
0.8953         28.9997  48792  0.9113
0.8929         30.0     50475  0.9112
0.9003         30.9997  52157  0.9111
0.8913         32.0     53840  0.9112
0.8934         32.9997  55522  0.9111
0.9022         34.0     57205  0.9112
0.8935         34.9997  58887  0.9112
0.8994         36.0     60570  0.9112
0.894          36.9997  62252  0.9112
0.8938         38.0     63935  0.9112
0.8985         38.9997  65617  0.9112
0.9013         40.0     67300  0.9111
0.9023         40.9997  68982  0.9111
0.9065         42.0     70665  0.9110
0.9045         42.9997  72347  0.9111
0.9013         44.0     74030  0.9112
0.8855         44.9997  75712  0.9112
0.8864         46.0     77395  0.9110
0.9026         46.9997  79077  0.9112
0.8979         48.0     80760  0.9111
0.9066         48.9997  82442  0.9111
0.896          49.9851  84100  0.9112
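
One way to read these results is to check where validation loss stops improving. The sketch below is illustrative only: it hard-codes a handful of (epoch, step, validation loss) rows copied from the results above, and the helper names are hypothetical.

```python
# (epoch, step, validation_loss) for a few rows copied from the results above.
ROWS = [
    (0.9997, 1682, 1.1611),
    (2.0, 3365, 1.0390),
    (4.0, 6730, 0.9610),
    (10.0, 16825, 0.9177),
    (20.0, 33650, 0.9115),
    (30.0, 50475, 0.9112),
    (42.0, 70665, 0.9110),
    (49.9851, 84100, 0.9112),
]


def best_checkpoint(rows):
    """Row with the lowest validation loss (first one wins on ties)."""
    return min(rows, key=lambda row: row[2])


def relative_improvement(rows, from_epoch):
    """Fractional drop in validation loss from `from_epoch` to the last row."""
    start = next(row for row in rows if row[0] >= from_epoch)
    return (start[2] - rows[-1][2]) / start[2]


def steps_per_epoch(rows):
    """Optimizer steps per epoch implied by the last row."""
    epoch, step, _ = rows[-1]
    return step / epoch


if __name__ == "__main__":
    print("best:", best_checkpoint(ROWS))
    print("gain after epoch 10:", relative_improvement(ROWS, 10.0))
    print("steps per epoch:", steps_per_epoch(ROWS))
```

On these rows, the lowest validation loss (0.9110, epoch 42) is marginally below the final value of 0.9112, and the loss improves by less than 1% between epoch 10 and the end of training. The implied figure of roughly 1682.5 optimizer steps per epoch, at 16 examples per step, suggests a training set of about 26,900 examples, though the card does not state the dataset size.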

Framework versions

  • Transformers 4.44.2
  • PyTorch 2.4.0+cu118
  • Datasets 2.20.0
  • Tokenizers 0.19.1

Model size: 2.92B params (Safetensors, tensor type BF16)

Model tree for gokulsabari/paligemma-adapter-new: fine-tuned from google/paligemma-3b-pt-224