mradermacher committed bb67c97 (1 parent: 798ca52)

Update README.md

Files changed: README.md (+7 -0)
@@ -113,6 +113,13 @@ Through a combination of these ingenious tricks:
 The few evaluations I have suggest that this gives good quality, and my current set-up allows me to
 generate imatrix data for most models in fp16, 70B in Q8_0 and almost everything else in Q4_K_S.
 
+The trick to 3 is not actually having patience; the trick is to automate things to the point where you
+normally don't have to wait for things. For example, if all goes well, quantizing a model requires just
+a single command (or less) for static quants, and for imatrix quants I need to select the source gguf
+and then run another command, which handles download/computation/upload. Most of the time, I only have
+to do stuff when things go wrong (which, with llama.cpp being so buggy and hard to use,
+is unfortunately very frequent).
+
 ## Why don't you use gguf-split?
 
 TL;DR: I don't have the hardware/resources for that.
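To make the "one command per model" idea from the added paragraph concrete, here is a minimal sketch of such an automation wrapper. The function name, step list, and log format are all hypothetical illustrations, not mradermacher's actual tooling; a real pipeline would call the Hugging Face CLI and llama.cpp's convert/quantize programs at each stage.

```shell
#!/bin/sh
# Hypothetical sketch of a "one command per model" quant pipeline.
# Names and steps are illustrative only — not the author's real scripts.
quant_job() {
    model="$1"
    # Each stage would invoke real tooling (download the source model,
    # convert it to gguf, quantize, upload the result); here we only
    # log the sequence to show the automation shape.
    for step in download convert quantize upload; do
        printf '%s: %s\n' "$model" "$step"
    done
}

# With a wrapper like this, quantizing a model is a single command:
quant_job "example/model-7B"
```

The point of wrapping everything in one entry point is that waiting disappears: jobs can be queued and left unattended, and manual intervention is only needed when a stage fails.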