Requirements

#52 opened 15 days ago by sneakybeaky

Finetuning llama2

#47 opened 10 months ago by zuhashaik

Any example of batch inference?

#46 opened 11 months ago by PrintScr

How to set max_split_size_mb?

1
#30 opened about 1 year ago by neo-benjamin

max_position_embeddings = 2048?

1
#29 opened about 1 year ago by zzzac

Load into 2 GPUs

3
#28 opened about 1 year ago by sauravm8

Load model into TGI

#27 opened about 1 year ago by schauppi

Perplexity

#22 opened about 1 year ago by gsaivinay

70TB with multiple A5000

6
#21 opened about 1 year ago by nashid

Inference error, tensor shapes.

8
#18 opened about 1 year ago by alejandrofdz

Inference time with TGI

1
#15 opened about 1 year ago by jacktenyx

Can't launch with TGI

6
#14 opened about 1 year ago by yekta

Bloke - add 70B ggml version please

4
#8 opened about 1 year ago by mirek190

text-generation-inference error

7
#5 opened about 1 year ago by msteele

Output always 0 tokens

11
#4 opened about 1 year ago by sterogn