May I use DeepSpeed to accelerate the model inference process?

#5
by 520jefferson - opened

When I use model.chat(), it is too slow to tolerate. Is there any way to accelerate it?

Knowledge Engineering Group (KEG) & Data Mining at Tsinghua University org

See our GitHub repository; try TensorRT (TRT) or another acceleration method.

zRzRzRzRzRzRzR changed discussion status to closed
