Update README.md
README.md
CHANGED
@@ -6,13 +6,17 @@ This repo contains the weights of the Koala 13B model produced at Berkeley. It i
 
 This version has then been quantized to 4-bit using [GPTQ-for-LLaMa](https://github.com/qwopqwop200/GPTQ-for-LLaMa) and then converted to GGML for use with [llama.cpp](https://github.com/ggerganov/llama.cpp).
 
-##
-
-
+## My Koala repos
+I have the following Koala model repositories available:
+**13B models:**
 * [Unquantized 13B model in HF format](https://huggingface.co/TheBloke/koala-13B-HF)
-* [GPTQ quantized 4bit
+* [GPTQ quantized 4bit 13B model in `pt` and `safetensors` formats](https://huggingface.co/TheBloke/koala-13B-GPTQ-4bit-128g)
+* [GPTQ quantized 4bit 13B model in GGML format for `llama.cpp`](https://huggingface.co/TheBloke/koala-13B-GPTQ-4bit-128g-GGML)
+**7B models:**
 * [Unquantized 7B model in HF format](https://huggingface.co/TheBloke/koala-7B-HF)
 * [Unquantized 7B model in GGML format for llama.cpp](https://huggingface.co/TheBloke/koala-7b-ggml-unquantized)
+* [GPTQ quantized 4bit 7B model in `pt` and `safetensors` formats](https://huggingface.co/TheBloke/koala-7B-GPTQ-4bit-128g)
+* [GPTQ quantized 4bit 7B model in GGML format for `llama.cpp`](https://huggingface.co/TheBloke/koala-7B-GPTQ-4bit-128g-GGML)
 
 ## How to run in `llama.cpp`
 