datasciguy committed on
Commit
4801cca
1 Parent(s): 39576c2

added model card

  1. README.md +77 -3
README.md CHANGED
@@ -1,3 +1,77 @@
- ---
- license: mit
- ---
+ ---
+ license: mit
+ ---
+ ### Model Card: **TinyLlama-1.1B-Chat-v1.0-Unfiltered**
+
+ ---
+
+ **Model Name**: TinyLlama-1.1B-Chat-v1.0-Unfiltered
+ **Model Type**: Conversational AI Model
+ **Architecture**: Based on the 1.1B-parameter TinyLlama architecture
+
+ **Training Data**:
+ - Fine-tuned on the "dan_remixed" dataset (2.7MB).
+ - The dataset has been edited to improve spelling, grammar, and consistency, to replace references to violent crimes with non-violent activities, and to remove self-censorship of expletives.
+
+ **Training Time**: Approximately 30-45 minutes. Each validation epoch takes ~322 seconds.
+ **Hardware**: Trained on a GPU (specific model not provided).
+
+ ---
+
+ **Training Performance**:
+ - **Epoch Losses**:
+   - Epoch 1: 0.7209
+   - Epoch 2: 0.4441
+   - Epoch 3: 0.3683
+   - Epoch 4: 0.3358
+   - Epoch 5: 0.3145
+ - **Final Training Loss (Epoch 5)**: 0.3145
+
+ ---
+
+ **Validation Performance** (5 Epochs):
+ - **Epoch 1**:
+   - Training Loss: 0.2921
+   - Validation Loss: 0.7962
+   - Perplexity: 2.22
+   - Epoch completed in 321.64 seconds
+
+ - **Epoch 2**:
+   - Training Loss: 0.2872
+   - Validation Loss: 0.7672
+   - Perplexity: 2.15
+   - Epoch completed in 321.91 seconds
+
+ - **Epoch 3**:
+   - Training Loss: 0.2874
+   - Validation Loss: 0.7821
+   - Perplexity: 2.19
+   - Epoch completed in 321.94 seconds
+
+ - **Epoch 4**:
+   - Training Loss: 0.2864
+   - Validation Loss: 0.7796
+   - Perplexity: 2.18
+   - Epoch completed in 322.01 seconds
+
+ - **Epoch 5**:
+   - Training Loss: 0.2831
+   - Validation Loss: 0.8017
+   - Perplexity: 2.23
+   - Epoch completed in 322.01 seconds
+
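The perplexity values reported for each epoch match the exponential of the corresponding validation loss (for example, exp(0.7962) ≈ 2.22). The sketch below checks that relationship for all five epochs. It assumes the reported losses are mean per-token cross-entropy values in nats, which the card does not state explicitly. The function name `perplexity` is illustrative and not taken from the author's training code.

```python
import math

# Validation losses reported per epoch in the model card.
VAL_LOSSES = [0.7962, 0.7672, 0.7821, 0.7796, 0.8017]
# Perplexities reported per epoch in the model card.
REPORTED_PPL = [2.22, 2.15, 2.19, 2.18, 2.23]


def perplexity(mean_loss: float) -> float:
    """Perplexity of a mean per-token cross-entropy loss (assumed to be in nats)."""
    return math.exp(mean_loss)


for epoch, (loss, reported) in enumerate(zip(VAL_LOSSES, REPORTED_PPL), start=1):
    computed = perplexity(loss)
    print(f"epoch {epoch}: loss={loss:.4f} ppl={computed:.2f} (reported {reported})")
```

Because perplexity here is a monotonic function of the loss, it ranks the epochs exactly as the validation loss does and adds no independent information.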
+ ---
+
+ **Optimizer**: AdamW (learning rate 1e-5)
+ **Loss Function**: Cross-Entropy Loss, ignoring padding tokens (ignore_index=-100)
+ **Use Case**: General, unrestricted conversation. Responses are not filtered, provided the content is non-violent.
+
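To illustrate what "ignoring padding tokens (ignore_index=-100)" means for the loss, here is a minimal pure-Python sketch of a masked cross-entropy. It is not the author's training code: the function `masked_cross_entropy` and the toy inputs are hypothetical. In PyTorch the same behaviour is provided by `torch.nn.CrossEntropyLoss`, whose `ignore_index` parameter defaults to -100.

```python
import math

IGNORE_INDEX = -100  # label value that the card says is skipped by the loss


def masked_cross_entropy(logits, labels, ignore_index=IGNORE_INDEX):
    """Mean cross-entropy over positions whose label is not ignore_index.

    logits: list of per-position score lists (one score per vocabulary item).
    labels: list of target token ids, with ignore_index at padded positions.
    """
    total, count = 0.0, 0
    for scores, label in zip(logits, labels):
        if label == ignore_index:
            continue  # padded position: adds neither loss nor count
        # Numerically stable log-sum-exp of the scores.
        peak = max(scores)
        log_norm = peak + math.log(sum(math.exp(s - peak) for s in scores))
        total += log_norm - scores[label]  # negative log-probability of the target
        count += 1
    return total / count if count else float("nan")


# Toy batch (hypothetical): three positions, 3-token vocabulary, last one is padding.
logits = [[2.0, 0.5, 0.1], [0.2, 1.5, 0.3], [9.0, 9.0, 9.0]]
labels = [0, 1, IGNORE_INDEX]
print(masked_cross_entropy(logits, labels))
```

The padded position changes neither the numerator nor the denominator, so the average is taken over real tokens only and padding length does not dilute the reported loss.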
+ ---
+
+ **Limitations**:
+ - Due to the small fine-tuning dataset size (2.7MB), the model may be prone to **overfitting** and **bias**.
+ - The dataset has been modified to avoid violent language, but the model may still produce strong or explicit responses.
+
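The overfitting concern can be examined using the per-epoch figures already in the card: validation loss is lowest at epoch 2 (0.7672) and highest at epoch 5 (0.8017), while training loss ends at its lowest value. The sketch below finds the best epoch by validation loss and prints the gap between validation and training loss. The helper names `best_epoch` and `generalization_gaps` are illustrative, and selecting a checkpoint this way is a common practice rather than something the card says was done.

```python
# Per-epoch figures from the "Validation Performance" section of the card.
TRAIN_LOSSES = [0.2921, 0.2872, 0.2874, 0.2864, 0.2831]
VAL_LOSSES = [0.7962, 0.7672, 0.7821, 0.7796, 0.8017]


def best_epoch(val_losses):
    """Return the 1-based epoch with the lowest validation loss."""
    return min(range(len(val_losses)), key=val_losses.__getitem__) + 1


def generalization_gaps(train_losses, val_losses):
    """Validation loss minus training loss for each epoch."""
    return [v - t for t, v in zip(train_losses, val_losses)]


print("best epoch by validation loss:", best_epoch(VAL_LOSSES))
for epoch, gap in enumerate(generalization_gaps(TRAIN_LOSSES, VAL_LOSSES), start=1):
    print(f"epoch {epoch}: val - train = {gap:.4f}")
```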
+ **Metrics**:
+ - Loss and perplexity have been tracked. Conversation-oriented metrics (such as BLEU, ROUGE, or human evaluation) have not yet been explored.
+