Limited (truncated) response with inference API

#23
by RobertTaylor - opened

I am getting a truncated output when using the Inference API. The same thing happens with the on-page example widget. Is HF's rate limiting per-token?

Hi there, did you set `max_new_tokens`?
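
For anyone hitting the same thing later, here is a minimal sketch of passing `max_new_tokens` through the Inference API with `requests`. The model name, token, and the value 250 below are just placeholders:

```python
import requests

# Placeholder model and token -- substitute your own.
API_URL = "https://api-inference.huggingface.co/models/gpt2"
headers = {"Authorization": "Bearer hf_xxx"}

payload = {
    "inputs": "Once upon a time",
    # Without this, generation stops after the default (short) length.
    "parameters": {"max_new_tokens": 250},
}

response = requests.post(API_URL, headers=headers, json=payload)
print(response.json())
```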

Gosh, thanks for that. Sorry, I'm an idiot.

What are the other parameters?

Hi, maybe this can help: https://huggingface.co/docs/api-inference/detailed_parameters#text-generation-task
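
For quick reference, the other text-generation parameters documented on that page go in the same `parameters` dict. The values below are only illustrative, not recommendations:

```python
payload = {
    "inputs": "Once upon a time",
    "parameters": {
        "max_new_tokens": 250,      # how many new tokens to generate
        "temperature": 0.7,         # sampling temperature
        "top_k": 50,                # top-k sampling
        "top_p": 0.95,              # nucleus sampling
        "repetition_penalty": 1.2,  # penalize repeated tokens
        "do_sample": True,          # sample instead of greedy decoding
        "return_full_text": False,  # return only the generated continuation
    },
}
```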
