kosm-checkpoint

This model is a fine-tuned version of microsoft/kosmos-2-patch14-224. The training dataset is not specified. It achieves the following results on the evaluation set:

  • Loss: 0.0340

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-05
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 8
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 3.0
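
The relationship between the batch-size settings and the shape of the learning-rate schedule can be sketched in a few lines of Python. This is an illustrative sketch, not the training code: the function names are hypothetical, a single training device is assumed (the card does not list a device count, but 2 x 4 = 8 matches the reported total), and the schedule mirrors the usual linear warmup followed by linear decay rather than calling any library.

```python
import math


def effective_batch_size(per_device: int, accumulation_steps: int, num_devices: int = 1) -> int:
    """Samples contributing to one optimizer update.

    num_devices=1 is an assumption; the card does not state the device count.
    """
    return per_device * accumulation_steps * num_devices


def linear_warmup_lr(step: int, total_steps: int,
                     peak_lr: float = 1e-5, warmup_ratio: float = 0.1) -> float:
    """Learning rate at `step` for linear warmup followed by linear decay to zero."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Ramp up from 0 to peak_lr over the warmup phase.
        return peak_lr * step / max(1, warmup_steps)
    # Decay linearly from peak_lr to 0 over the remaining steps.
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    print(effective_batch_size(per_device=2, accumulation_steps=4))
    # total_steps=1000 is a made-up value used only to show the schedule shape.
    for s in (0, 50, 100, 550, 1000):
        print(s, linear_warmup_lr(s, total_steps=1000))
```

With learning_rate 1e-05 and lr_scheduler_warmup_ratio 0.1, the rate peaks after the first 10% of optimizer steps and then falls linearly to zero by the end of training.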

Training results

Training Loss | Epoch | Step | Validation Loss
No log | 0.0497 | 200 | 0.0700
0.0802 | 0.0993 | 400 | 0.0581
0.0676 | 0.1490 | 600 | 0.0496
0.0584 | 0.1986 | 800 | 0.0450
0.0582 | 0.2483 | 1000 | 0.0481
0.0582 | 0.2979 | 1200 | 0.0486
0.0572 | 0.3476 | 1400 | 0.0445
0.0537 | 0.3972 | 1600 | 0.0463
0.0504 | 0.4469 | 1800 | 0.0421
0.0473 | 0.4965 | 2000 | 0.0402
0.0473 | 0.5462 | 2200 | 0.0423
0.046 | 0.5958 | 2400 | 0.0394
0.0448 | 0.6455 | 2600 | 0.0369
0.0423 | 0.6951 | 2800 | 0.0378
0.0403 | 0.7448 | 3000 | 0.0360
0.0403 | 0.7944 | 3200 | 0.0364
0.0392 | 0.8441 | 3400 | 0.0352
0.0388 | 0.8937 | 3600 | 0.0347
0.0375 | 0.9434 | 3800 | 0.0343
0.037 | 0.9930 | 4000 | 0.0345
0.037 | 1.0427 | 4200 | 0.0355
0.03 | 1.0924 | 4400 | 0.0338
0.0283 | 1.1420 | 4600 | 0.0349
0.0281 | 1.1917 | 4800 | 0.0347
0.0288 | 1.2413 | 5000 | 0.0322
0.0288 | 1.2910 | 5200 | 0.0331
0.0279 | 1.3406 | 5400 | 0.0335
0.0272 | 1.3903 | 5600 | 0.0322
0.0275 | 1.4399 | 5800 | 0.0338
0.0271 | 1.4896 | 6000 | 0.0324
0.0271 | 1.5392 | 6200 | 0.0324
0.0263 | 1.5889 | 6400 | 0.0320
0.0262 | 1.6385 | 6600 | 0.0319
0.0264 | 1.6882 | 6800 | 0.0317
0.0256 | 1.7378 | 7000 | 0.0322
0.0256 | 1.7875 | 7200 | 0.0320
0.0255 | 1.8371 | 7400 | 0.0316
0.0242 | 1.8868 | 7600 | 0.0327
0.0262 | 1.9364 | 7800 | 0.0307
0.0252 | 1.9861 | 8000 | 0.0304
0.0252 | 2.0357 | 8200 | 0.0343
0.0173 | 2.0854 | 8400 | 0.0373
0.0148 | 2.1351 | 8600 | 0.0345
0.015 | 2.1847 | 8800 | 0.0347
0.0148 | 2.2344 | 9000 | 0.0347
0.0148 | 2.2840 | 9200 | 0.0354
0.0132 | 2.3337 | 9400 | 0.0351
0.0136 | 2.3833 | 9600 | 0.0362
0.0132 | 2.4330 | 9800 | 0.0360
0.0138 | 2.4826 | 10000 | 0.0352
0.0138 | 2.5323 | 10200 | 0.0359
0.0138 | 2.5819 | 10400 | 0.0348
0.0132 | 2.6316 | 10600 | 0.0348
0.0129 | 2.6812 | 10800 | 0.0337
0.0134 | 2.7309 | 11000 | 0.0354
0.0134 | 2.7805 | 11200 | 0.0350
0.0132 | 2.8302 | 11400 | 0.0351
0.0128 | 2.8798 | 11600 | 0.0350
0.013 | 2.9295 | 11800 | 0.0339
0.012 | 2.9791 | 12000 | 0.0340
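
In the log above, the lowest validation loss (0.0304) is reached at step 8000, near the end of the second epoch, while the final value at step 12000 is 0.0340 and training loss keeps falling throughout the third epoch. That pattern may indicate overfitting in the last epoch. The sketch below shows one way to pick the best row out of such a log. It is illustrative only: the helper names are hypothetical, and the excerpt contains just a handful of rows copied from the table.

```python
# A few rows copied from the training log: training loss, epoch, step, validation loss.
LOG_EXCERPT = """\
No log 0.0497 200 0.0700
0.0262 1.9364 7800 0.0307
0.0252 1.9861 8000 0.0304
0.0252 2.0357 8200 0.0343
0.0173 2.0854 8400 0.0373
0.012 2.9791 12000 0.0340
"""


def parse_rows(text: str) -> list[dict]:
    """Parse log rows; the last three fields are always epoch, step, validation loss."""
    rows = []
    for line in text.splitlines():
        parts = line.replace("|", " ").split()  # tolerate pipe-delimited rows too
        if len(parts) < 4:
            continue
        epoch, step, val_loss = parts[-3:]
        train = " ".join(parts[:-3])
        rows.append({
            # "No log" means no training loss had been logged yet at that step.
            "train_loss": None if train == "No log" else float(train),
            "epoch": float(epoch),
            "step": int(step),
            "val_loss": float(val_loss),
        })
    return rows


def best_row(rows: list[dict]) -> dict:
    """Return the row with the lowest validation loss."""
    return min(rows, key=lambda r: r["val_loss"])


if __name__ == "__main__":
    rows = parse_rows(LOG_EXCERPT)
    best = best_row(rows)
    print(f"best step: {best['step']}, validation loss: {best['val_loss']}")
```

Whether the published weights correspond to the final step or to an earlier checkpoint is not stated in the card; the reported evaluation loss of 0.0340 matches the final row.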

Framework versions

  • Transformers 4.42.4
  • Pytorch 2.1.2+cu121
  • Datasets 2.15.0
  • Tokenizers 0.19.1

Model size: 1.66B params (Safetensors, tensor type F32)