Thalirajesh committed
Commit 24f1f64
Parent: 6626a6c

Update README.md

Files changed (1): README.md (+24, -4)
README.md CHANGED
@@ -23,16 +23,36 @@ inference:
 
 
 ---
+Introduction:
+This repository contains a fine-tuned DistilChatGPT2 model for generating diverse essays on topics spanning Arts, Science, and Culture.
+
+The model has been trained on a dataset of over 2000 high-quality essays written by human experts, covering a wide range of opinions and knowledge.
 
-This is a text generation model fine-tuned for creative writing tasks, such as essay writing and creative storytelling. The model is based on the GPT-2 architecture and has been trained on a diverse corpus of written works.
+Dataset:
+The training dataset comprises 2000+ essays covering diverse topics in Arts, Science, and Culture. These essays were written by human experts and reflect a wide range of opinions and knowledge, ensuring that the model learns from high-quality, varied content.
 
-During inference, the following parameters are used:
+Model Training:
+- epoch: 50
+- training_loss: 2.473200
+- validation_loss: 4.569556
+- perplexities: [517.4149169921875, 924.535888671875, 704.73291015625, 465.9677429199219, 577.629150390625, 443.994140625, 770.1861572265625, 683.028076171875, 1017.7510375976562, 880.795166015625]
+- mean_perplexity: 698.603519
+
+Description:
+The model achieved a mean perplexity of 698.603519 on the validation set, indicating its ability to generate diverse and high-quality essays on the given topics.
+
+During text generation, the following parameters are used:
 
 - `max_length`: The maximum length of the generated text, set to 400 tokens.
 - `num_beams`: The number of beams for beam search, set to 10. A higher value searches more candidate sequences and can improve output quality, but also increases inference time.
 - `early_stopping`: If set to True, generation stops as soon as the end-of-sequence token is generated.
-- `temperature`: The sampling temperature, set to 0.3.
+- `temperature`: The sampling temperature is set to 0.3.
 - `no_repeat_ngram_size`: The size of the n-gram window used to avoid repetition, set to 2.
 
 
-Please note that this is a language model, and the outputs may contain biases, inconsistencies, or potentially offensive content. It is recommended to review and filter the generated text as needed before using it in any real-world application.
+![image/png](https://cdn-uploads.huggingface.co/production/uploads/64fec5de57ccb8f1bdfbec54/ac89INQ8czj1u6WApI20J.png)
+
+
+Find the Kaggle notebook for this project at:
+
+[Kaggle Notebook](https://www.kaggle.com/code/vignesharjunraj/finetuned-distilgpt2-llm-for-essays-400-words/)
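
---

The training code behind the Model Training figures in the diff above is not part of this commit. As a rough sketch of how such a run could look with the transformers Trainer, assuming the card's "DistilChatGPT2" refers to the distilgpt2 checkpoint, and using a stand-in two-essay corpus, batch size, and learning rate (none of which are reported in the card):

```python
from datasets import Dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

# Assumption: the card's "DistilChatGPT2" is the distilgpt2 checkpoint.
tokenizer = AutoTokenizer.from_pretrained("distilgpt2")
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default
model = AutoModelForCausalLM.from_pretrained("distilgpt2")

# Stand-in corpus; the actual run used 2000+ expert-written essays.
essays = [
    "Art has always mirrored the societies that produced it.",
    "Scientific progress depends on the free exchange of ideas.",
]
dataset = Dataset.from_dict({"text": essays}).map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=512),
    batched=True,
    remove_columns=["text"],
)

args = TrainingArguments(
    output_dir="distilgpt2-essays",
    num_train_epochs=50,            # epoch count reported in the card
    per_device_train_batch_size=4,  # assumption; not reported
    learning_rate=5e-5,             # assumption; not reported
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=dataset,
    # Causal-LM objective: the collator builds labels from input_ids (mlm=False).
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```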
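
The reported mean perplexity can be reproduced from the per-sample values listed in the diff. A minimal sketch, assuming each entry is exp(mean token-level cross-entropy) for one held-out essay:

```python
import math

# Per-sample perplexities as reported in the model card.
perplexities = [
    517.4149169921875, 924.535888671875, 704.73291015625,
    465.9677429199219, 577.629150390625, 443.994140625,
    770.1861572265625, 683.028076171875, 1017.7510375976562,
    880.795166015625,
]

mean_perplexity = sum(perplexities) / len(perplexities)
print(f"mean_perplexity: {mean_perplexity:.6f}")  # -> 698.603519, as reported

# Note: exponentiating the aggregate validation loss gives a different number,
# exp(4.569556) ~= 96.5 -- averaging per-sample perplexities is not the same
# as computing perplexity from the mean loss.
print(math.exp(4.569556))
```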
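
The generation parameters listed in the diff map directly onto `model.generate` in transformers. A minimal sketch, using a placeholder model id (substitute this repository's actual id); note that in transformers `temperature` only takes effect when sampling is enabled, so `do_sample=True` is assumed here alongside beam search:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder id -- substitute this repository's actual model id.
model_id = "Thalirajesh/distilgpt2-essays"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

prompt = "The influence of classical art on modern culture"
inputs = tokenizer(prompt, return_tensors="pt")

output_ids = model.generate(
    **inputs,
    max_length=400,          # maximum length of the generated text, in tokens
    num_beams=10,            # beam search width
    early_stopping=True,     # stop once the end-of-sequence token is generated
    temperature=0.3,         # sampling temperature
    do_sample=True,          # temperature only applies when sampling is enabled
    no_repeat_ngram_size=2,  # never repeat any 2-gram
    pad_token_id=tokenizer.eos_token_id,  # GPT-2 has no pad token
)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```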