Kooten committed on
Commit
91f3a38
1 Parent(s): 3f35b83

Create README.md

Files changed (1)
  1. README.md +58 -0
README.md ADDED
@@ -0,0 +1,58 @@
1
+ ---
2
+ license: cc-by-nc-4.0
3
+ tags:
4
+ - not-for-all-audiences
5
+ - nsfw
6
+ ---
7
+ ## Description
8
+
9
+ ExLlamaV2 quant of [Undi95/PsyMedRP-v1-20B](https://huggingface.co/Undi95/PsyMedRP-v1-20B)
10
+
11
+ 3 bits per weight (BPW), head bits set to 8
12
+
13
+
14
+ ## VRAM
15
+ My VRAM usage with 20B models is:
16
+ | Bits per weight | Context | VRAM |
17
+ |--|--|--|
18
+ | 6bpw | 4k | 24 GB |
19
+ | 4bpw | 4k | 18 GB |
20
+ | 4bpw | 8k | 24 GB |
21
+ | 3bpw | 4k | 16 GB |
22
+ | 3bpw | 8k | 21 GB |
23
+ I have rounded up, so these aren't exact numbers. They were measured on a Windows machine.
24
+
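As a rough aid for picking a quant, the table above can be turned into a small lookup. This is only a sketch: the figures are copied from the table as reported (rounded up, measured on Windows), and the helper name `configs_that_fit` is hypothetical. Actual usage on other setups may differ.

```python
# Figures copied from the VRAM table above (rounded up, Windows machine).
# Each row: (bits per weight, context, reported VRAM in GB).
VRAM_TABLE = [
    (6, "4k", 24),
    (4, "4k", 18),
    (4, "8k", 24),
    (3, "4k", 16),
    (3, "8k", 21),
]


def configs_that_fit(available_gb):
    """Return the table rows whose reported VRAM use is within the budget.

    Hypothetical helper for illustration; it only filters the reported
    figures and does not predict usage for untested configurations.
    """
    return [row for row in VRAM_TABLE if row[2] <= available_gb]


if __name__ == "__main__":
    # List the configurations reported to fit on a 21 GB budget.
    for bpw, context, vram in configs_that_fit(21):
        print(f"{bpw}bpw at {context} context: about {vram} GB")
```

Since the numbers are rounded up, a configuration that sits exactly at a card's capacity may still be tight in practice.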
25
+ ## Prompt template
26
+
27
+ [Recommended reading](https://huggingface.co/lemonilia/LimaRP-Llama2-13B-v3-EXPERIMENT)
28
+
29
+ You can follow these instruction format settings in SillyTavern. Replace `tiny` with
30
+ your desired response length:
31
+
32
+ ![settings](https://files.catbox.moe/6lcz0u.png)
33
+
34
+ ### Message length control
35
+ Inspired by the previously named "Roleplay" preset in SillyTavern, starting from this
36
+ version of LimaRP it is possible to append a length modifier to the response instruction
37
+ sequence, like this:
38
+
39
+ ```
40
+ ### Input
41
+ User: {utterance}
42
+
43
+ ### Response: (length = medium)
44
+ Character: {utterance}
45
+ ```
46
+
47
+ This has an immediately noticeable effect on bot responses. The available lengths are:
48
+ `tiny`, `short`, `medium`, `long`, `huge`, `humongous`, `extreme`, `unlimited`. **The
49
+ recommended starting length is `medium`**. Keep in mind that the AI may ramble
50
+ or impersonate the user with very long messages.
51
+
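For anyone building prompts outside SillyTavern, the template and length list above can be assembled with a few lines of Python. This is a minimal sketch, not part of the model or of LimaRP: the function name `build_prompt` and its parameters are hypothetical, and the header lines are copied verbatim from the template shown above.

```python
# Length modifiers listed in the section above.
LENGTHS = (
    "tiny", "short", "medium", "long",
    "huge", "humongous", "extreme", "unlimited",
)


def build_prompt(utterance, length="medium", user="User", character="Character"):
    """Build the input/response part of the prompt with a length modifier.

    Hypothetical helper: it mirrors the template above and leaves the
    character's line open so the model writes the reply.
    """
    if length not in LENGTHS:
        raise ValueError(f"unknown length {length!r}; expected one of {LENGTHS}")
    return (
        "### Input\n"
        f"{user}: {utterance}\n"
        "\n"
        f"### Response: (length = {length})\n"
        f"{character}:"
    )


if __name__ == "__main__":
    print(build_prompt("Hello there!", length="short"))
```

Validating the modifier up front avoids silently sending a length the model was not trained on.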
52
+ The length control effect is reproducible, but messages will not follow the requested
53
+ length precisely. Instead, they fall within certain ranges on average, as seen in this table
54
+ with data from tests made with one reply at the beginning of the conversation:
55
+
56
+ ![lengths](https://files.catbox.moe/dy39bt.png)
57
+
58
+ Response length control also appears to work well deep into the conversation.