
Introduction: This repository contains a DistilGPT2 model fine-tuned to generate essays on diverse topics spanning Arts, Science, and Culture.

Dataset: The training dataset comprises 2,000+ essays covering a wide range of topics in Arts, Science, and Culture. The essays were written by human experts and reflect a broad range of opinions and knowledge, so the model learns from high-quality, varied content.

Model Training (a reproduction sketch follows the metrics below):

  • epochs: 50
  • training_loss: 2.473200
  • validation_loss: 4.569556
  • perplexities (per validation sample): [517.41, 924.54, 704.73, 465.97, 577.63, 443.99, 770.19, 683.03, 1017.75, 880.80]
  • mean_perplexity: 698.60
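
The training code itself is in the Kaggle notebook linked below. As a rough illustration only, a fine-tuning setup like the one reported above could be written with the transformers Trainer as follows; the dataset file, batch size, and sequence length are assumptions, and only the epoch count comes from the metrics:

```python
# Illustrative fine-tuning sketch, not the exact notebook code.
# "essays.txt" and all hyperparameters except num_train_epochs=50
# are assumptions.
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM, AutoTokenizer,
    DataCollatorForLanguageModeling, Trainer, TrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("distilgpt2")
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default
model = AutoModelForCausalLM.from_pretrained("distilgpt2")

# Hypothetical layout: one essay per line in a plain-text file.
dataset = load_dataset("text", data_files={"train": "essays.txt"})["train"]
dataset = dataset.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=512),
    batched=True, remove_columns=["text"],
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="distilgpt2-essays",
        num_train_epochs=50,            # epoch count reported above
        per_device_train_batch_size=8,  # assumed
    ),
    train_dataset=dataset,
    # Causal LM collator: labels are the input ids shifted by one.
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```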

Description: The model achieved a mean perplexity of 698.60 on the validation set. Perplexity measures how well the model predicts held-out text (lower is better), so this figure, together with the per-sample values above, summarizes how well the model has learned to produce essays on the given topics.
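
In the usual formulation, each per-sample perplexity is the exponential of the average token-level cross-entropy on one validation essay. A minimal sketch of that computation, reusing the model and tokenizer from the sketch above (the validation texts are placeholders):

```python
# Perplexity = exp(mean cross-entropy loss over predicted tokens).
import math
import torch

def perplexity(model, tokenizer, text):
    enc = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        # With labels=input_ids, the model returns the mean
        # cross-entropy loss over all predicted tokens.
        loss = model(**enc, labels=enc["input_ids"]).loss
    return math.exp(loss.item())

val_texts = ["...", "..."]  # the ten held-out essays in the actual evaluation
scores = [perplexity(model, tokenizer, t) for t in val_texts]
print(sum(scores) / len(scores))  # mean perplexity
```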

During text generation, the following parameters are used (a usage sketch follows the list):

  • max_length: The maximum length of the generated text, set to 400 tokens.
  • num_beams: The number of beams for beam search, set to 10. A higher value explores more candidate continuations and can improve output quality, but increases inference time.
  • early_stopping: Set to True, so beam search stops as soon as enough complete candidate sequences (ending in the end-of-sequence token) have been found.
  • temperature: The sampling temperature, set to 0.3. Lower values make the output more focused and less random; temperature only takes effect when sampling is enabled.
  • no_repeat_ngram_size: Set to 2, so no 2-gram (pair of consecutive tokens) is repeated in the generated text.
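
Taken together, generation might look like the sketch below. The repository id and prompt are placeholders, and do_sample=True is an assumption added so that the temperature setting takes effect:

```python
# Generation sketch using the parameters listed above.
# The repository id is a placeholder -- substitute this model's actual id.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "your-username/distilgpt2-essay-generator"  # hypothetical id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

prompt = "The role of color in Renaissance painting"  # example prompt
inputs = tokenizer(prompt, return_tensors="pt")

outputs = model.generate(
    **inputs,
    max_length=400,          # cap on total sequence length in tokens
    num_beams=10,            # beam search width
    early_stopping=True,     # stop once enough beams are complete
    do_sample=True,          # assumed, so temperature takes effect
    temperature=0.3,         # low temperature -> more focused text
    no_repeat_ngram_size=2,  # forbid repeating any 2-gram
    pad_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```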


Find the Kaggle notebook for this project here: Kaggle Notebook

Model details: 81.9M parameters, F32 tensors, stored as Safetensors.