Help regarding snapshot download of Hugging Face

#9
by PoyBoi - opened

OK, so I want to download only the "Wizard-Vicuna-13B-Uncensored.ggmlv3.q4_K_S.bin" model using Hugging Face's snapshot_download. When I run it with the path to that file, it gives me the error "repo_id must be in the form ....". When I give it the repo id instead, it starts downloading every model in the repo, as expected. Is there any way to download only that one model? What would the syntax be?

(PS: I'm using Modal, so I can't exactly get in and manage the files myself.)

Yeah, that is possible. Here's an example I copied from https://github.com/OpenAccess-AI-Collective/servereless-runpod-ggml

repo_file = hf_hub_download(repo_id=os.environ["GGML_REPO"], filename=os.environ["GGML_FILE"], revision=os.environ.get("GGML_REVISION", "main"))

Prior to running that it sets up these env vars:

ENV HF_DATASETS_CACHE="/runpod-volume/huggingface-cache/datasets"
ENV HUGGINGFACE_HUB_CACHE="/runpod-volume/huggingface-cache/hub"
ENV TRANSFORMERS_CACHE="/runpod-volume/huggingface-cache/hub"

So the file gets downloaded into HF's cache, which is stored on /runpod-volume. In Python you download the file with hf_hub_download, which returns the local path to that file (a symlink into the cache, I believe). You can then load that path like any other GGML file.
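If you'd rather not rely on env vars, hf_hub_download also takes a cache_dir argument. Here is a rough sketch of that, plus a quick check of whether the returned path is a symlink. The helper names describe_cached_file and download_to_cache are made up for illustration, and the cache directory is whatever you pass in:

```python
import os


def describe_cached_file(path):
    """Return (is_symlink, real_path) for a local file path.

    Hypothetical helper: handy for checking whether the path returned by
    hf_hub_download is a symlink and where it actually points.
    """
    return os.path.islink(path), os.path.realpath(path)


def download_to_cache(repo_id, filename, cache_dir, revision="main"):
    """Download one file from a Hub repo into an explicit cache directory.

    Hypothetical helper. huggingface_hub is imported lazily so the module
    can be imported even where the package is not installed.
    """
    from huggingface_hub import hf_hub_download

    local_path = hf_hub_download(
        repo_id=repo_id,
        filename=filename,
        revision=revision,
        cache_dir=cache_dir,  # used instead of the default cache location
    )
    return local_path, describe_cached_file(local_path)
```

Calling download_to_cache("TheBloke/Wizard-Vicuna-13B-Uncensored-GGML", "Wizard-Vicuna-13B-Uncensored.ggmlv3.q4_K_S.bin", "/runpod-volume/huggingface-cache/hub") should give you the local path plus a tuple telling you whether it is a symlink and where the real file sits.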

It's not snapshot_download, but you can specify a revision, so I believe the result is the same.
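If you do want to stay with snapshot_download, as the original question asked, it accepts an allow_patterns argument that limits the download to matching files. This is a minimal sketch, assuming a reasonably recent huggingface_hub. The helper names single_file_kwargs and download_single_file are made up for illustration:

```python
import os


def single_file_kwargs(repo_id, filename, revision="main"):
    """Build keyword arguments that restrict snapshot_download to one file.

    Hypothetical helper. allow_patterns entries are glob-style patterns, so
    this assumes the filename has no glob characters such as * ? [ ].
    """
    return {
        "repo_id": repo_id,
        "revision": revision,
        "allow_patterns": [filename],
    }


def download_single_file(repo_id, filename, revision="main"):
    """Download only `filename` via snapshot_download and return its local path.

    Hypothetical helper. huggingface_hub is imported lazily so the module
    can be imported even where the package is not installed.
    """
    from huggingface_hub import snapshot_download

    # snapshot_download returns the local snapshot folder for the repo.
    snapshot_dir = snapshot_download(**single_file_kwargs(repo_id, filename, revision))
    return os.path.join(snapshot_dir, filename)
```

Calling download_single_file("TheBloke/Wizard-Vicuna-13B-Uncensored-GGML", "Wizard-Vicuna-13B-Uncensored.ggmlv3.q4_K_S.bin") should then fetch only that file instead of the whole repo.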

from huggingface_hub import hf_hub_download

repo = "TheBloke/Wizard-Vicuna-13B-Uncensored-GGML"
file_path = "Wizard-Vicuna-13B-Uncensored.ggmlv3.q4_K_S.bin"
hf_hub_download(repo_id=repo, filename=file_path)

This worked for me, thank you so much!

PoyBoi changed discussion status to closed
