Context length

#1
by sseres - opened

The model card states:
max_seq_length = 32 768
Why am I getting this error, then?
Number of tokens (1442) exceeded maximum context length (512).
Part of the code:

llm = AutoModelForCausalLM.from_pretrained(
    "ariel-ml/PULI-LlumiX-32K-instruct-GGUF",
    model_file="PULI-LlumiX-32K-instruct-Q5_K_S.gguf",
    model_type="llama",
    gpu_layers=50,
)

# The user query below is Hungarian for "What does the given company do?"
print(llm("""<|im_start|>system
Context information is below. Given the context information and not prior knowledge, answer the query.
\n---------------------\npage_label: 7\nfile_name: test.pdf
.
.
<|im_end|>
<|im_start|>user
Mivel foglalkozik az adott cég?<|im_end|><|im_start|>assistant<|im_end|>"""))

sseres changed discussion status to closed

It seems like you are using the ctransformers library, whose effective default context length here is 512 tokens; that is why your 1442-token prompt is rejected. You can raise the context_length and max_new_tokens parameters when loading the model.

Documentation:
https://github.com/marella/ctransformers

from ctransformers import AutoModelForCausalLM

llm = AutoModelForCausalLM.from_pretrained(
    "ariel-ml/PULI-LlumiX-32K-instruct-GGUF",
    model_file="PULI-LlumiX-32K-instruct-Q5_K_S.gguf",
    model_type="llama",
    max_new_tokens=2048,   # upper bound on generated tokens
    context_length=2048,   # raise the 512-token default so the prompt fits
    gpu_layers=50,
)
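
Once loaded with the larger window, you can also verify the prompt size and stop generation at the ChatML end tag. A minimal sketch, assuming the llm object from the snippet above; the shortened prompt string here is only a placeholder for your full prompt:

# Placeholder standing in for the full ChatML prompt from the question.
prompt = "<|im_start|>user\nMivel foglalkozik az adott cég?<|im_end|>\n<|im_start|>assistant\n"

# Count the prompt's tokens to confirm it fits within context_length.
print("prompt tokens:", len(llm.tokenize(prompt)))

# Pass a stop sequence so generation ends at the ChatML end tag.
print(llm(prompt, max_new_tokens=512, stop=["<|im_end|>"]))

If the prompt plus max_new_tokens approaches context_length, raise context_length further; per the model card, the model itself supports up to 32 768 tokens.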
