mradermacher committed bb67c97 (1 parent: 798ca52)

Update README.md

Files changed: README.md (+7 -0)
@@ -113,6 +113,13 @@ Through a combination of these ingenious tricks:
 The few evaluations I have suggest that this gives good quality, and my current set-up allows me to
 generate imatrix data for most models in fp16, 70B in Q8_0 and almost everything else in Q4_K_S.
 
+The trick to 3 is not actually having patience; the trick is to automate things to the point where you
+normally don't have to wait for things. For example, if all goes well, quantizing a model requires just
+a single command (or less) for static quants, and for imatrix quants I need to select the source gguf
+and then run another command, which handles download/computation/upload. Most of the time, I only have
+to do stuff when things go wrong (which, with llama.cpp being so buggy and hard to use,
+is unfortunately very frequent).
+
 ## Why don't you use gguf-split?
 
 TL;DR: I don't have the hardware/resources for that.
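To make the "one command per model" idea from the added paragraph concrete, here is a minimal sketch of such an automation wrapper. The function name, step list, and log format are all hypothetical illustrations, not mradermacher's actual tooling; a real pipeline would call the Hugging Face CLI and llama.cpp's convert/quantize programs at each stage.

```shell
#!/bin/sh
# Hypothetical sketch of a "one command per model" quant pipeline.
# Names and steps are illustrative only — not the author's real scripts.
quant_job() {
    model="$1"
    # Each stage would invoke real tooling (download the source model,
    # convert it to gguf, quantize, upload the result); here we only
    # log the sequence to show the automation shape.
    for step in download convert quantize upload; do
        printf '%s: %s\n' "$model" "$step"
    done
}

# With a wrapper like this, quantizing a model is a single command:
quant_job "example/model-7B"
```

The point of wrapping everything in one entry point is that waiting disappears: jobs can be queued and left unattended, and manual intervention is only needed when a stage fails.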