
pszemraj/pegasus-x-large-book-summary

Open In Colab

Get SparkNotes-esque summaries of arbitrary text! Due to the model's size, it's recommended to try it in the Colab notebook (linked above), as the hosted inference API text box may time out.

This model is a fine-tuned version of google/pegasus-x-large on the kmfoda/booksum dataset for approximately eight epochs.
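
Example usage

A minimal usage sketch with the transformers summarization pipeline; the generation settings shown are illustrative assumptions, not values recommended by this card:

```python
# a minimal sketch, assuming the transformers summarization pipeline;
# generation settings here are illustrative, not the card's recommended values
from transformers import pipeline

summarizer = pipeline(
    "summarization",
    model="pszemraj/pegasus-x-large-book-summary",
    # add device=0 to run on a GPU; defaults to CPU
)

long_text = "..."  # any chapter-length passage to summarize

result = summarizer(
    long_text,
    max_length=256,          # cap on generated summary tokens
    num_beams=4,
    no_repeat_ngram_size=3,
    truncation=True,         # truncate inputs longer than the model's max input length
)
print(result[0]["summary_text"])
```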

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

Epochs 1-4

TODO

Epochs 5 & 6

The following hyperparameters were used during training:

  • learning_rate: 6e-05
  • train_batch_size: 4
  • eval_batch_size: 1
  • seed: 42
  • distributed_type: multi-GPU
  • gradient_accumulation_steps: 32
  • total_train_batch_size: 128
  • optimizer: ADAN using lucidrains' adan-pytorch with default betas
  • lr_scheduler_type: constant_with_warmup
  • data type: TF32
  • num_epochs: 2
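
For reference, a best-effort sketch of how these settings might map onto Seq2SeqTrainingArguments with the Adan optimizer from lucidrains' adan-pytorch; the argument names and output directory are reconstructions, not the actual training script:

```python
# a best-effort sketch of the epochs 5 & 6 configuration, not the exact training script
from adan_pytorch import Adan
from transformers import AutoModelForSeq2SeqLM, Seq2SeqTrainingArguments

model = AutoModelForSeq2SeqLM.from_pretrained("google/pegasus-x-large")

args = Seq2SeqTrainingArguments(
    output_dir="pegasus-x-large-booksum",  # hypothetical output directory
    learning_rate=6e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=1,
    gradient_accumulation_steps=32,  # 4 x 32 (x GPUs) -> total train batch size 128
    num_train_epochs=2,
    lr_scheduler_type="constant_with_warmup",
    seed=42,
    tf32=True,  # requires an Ampere-or-newer GPU
)

# Adan with the library's default betas, as noted in the list above; it would be
# passed to a Seq2SeqTrainer via its `optimizers=(optimizer, lr_scheduler)` argument
optimizer = Adan(model.parameters(), lr=args.learning_rate)
```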

Epochs 7 & 8

  • epochs 5 & 6 were trained with 12288 tokens input
  • this fixes that with 2 epochs at 16384 tokens input

The following hyperparameters were used during training:

  • learning_rate: 0.0004
  • train_batch_size: 4
  • eval_batch_size: 1
  • seed: 42
  • distributed_type: multi-GPU
  • gradient_accumulation_steps: 16
  • total_train_batch_size: 64
  • optimizer: ADAN using lucidrains' adan-pytorch with default betas
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.03
  • num_epochs: 2
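
To illustrate the longer context used in this phase, a small sketch of encoding a document at the 16384-token input length; the tokenizer call below is an illustrative assumption, not part of the training script:

```python
# a small sketch of encoding at the 16384-token input length used for epochs 7 & 8
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("pszemraj/pegasus-x-large-book-summary")

document = "..."  # a full chapter or other long passage

inputs = tokenizer(
    document,
    max_length=16384,  # epochs 7 & 8 input length (12288 was used in epochs 5 & 6)
    truncation=True,
    return_tensors="pt",
)
print(inputs["input_ids"].shape)
```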

Framework versions

  • Transformers 4.22.0
  • Pytorch 1.11.0a0+17540c5
  • Datasets 2.4.0
  • Tokenizers 0.12.1
