Model outputting nonsense?

#1
by mohxgalleries - opened

Hello, I am new to the AI world and have been stuck on this issue for a few days.

I am running a local Node.js environment and have incorporated the model into a chatbot using transformers.js. The setup itself works — I can chat back and forth with the model — but it never responds accurately. It always just repeats my prompt and then appends nonsense text to the end of it (coherent sentences, just rambling).

This seems to be an issue with the processing of my prompt, but I am confused as to why it appends my prompt at the beginning of each response, and also where the rambling is coming from. To my understanding, the model and weights are included in this repo and should import properly when I initialize my generator in my server-side code:

const generator = await pipeline('text-generation', 'onnx-community/Llama-3.2-1B-Instruct');

Any recommendations on how I can get it responding appropriately, like the examples, instead of with nonsense text? Is this related to the parameters I am passing to the generator? Currently I am only using max_new_tokens, but I have also tried adjusting temperature to no avail.

ONNX Community org

Does the following example code work for you?

import { pipeline } from "@huggingface/transformers";

// Create a text generation pipeline
const generator = await pipeline("text-generation", "onnx-community/Llama-3.2-1B-Instruct");

// Define the list of messages
const messages = [
  { role: "system", content: "You are a helpful assistant." },
  { role: "user", content: "Tell me a joke." },
];

// Generate a response
const output = await generator(messages, { max_new_tokens: 128 });
console.log(output[0].generated_text.at(-1).content);
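For context on why the prompt gets echoed back: a text-generation pipeline continues whatever text it is given (and by default returns the prompt plus the continuation), so passing a raw string with no chat formatting makes an instruct model simply ramble onward from your own words. When you pass an array of { role, content } messages as above, transformers.js applies the model's chat template before generating. A simplified sketch of what that template produces for Llama 3 style models (for illustration only — the real template lives in the repo's tokenizer config):

```javascript
// Simplified illustration of the Llama 3 style chat template that
// transformers.js applies automatically when you pass an array of
// { role, content } messages. If you pass a raw string instead, no
// template (and no special tokens) is applied, so the model just
// continues your text — the prompt echo plus rambling you are seeing.
function applyLlama3ChatTemplate(messages) {
  let prompt = "<|begin_of_text|>";
  for (const { role, content } of messages) {
    prompt += `<|start_header_id|>${role}<|end_header_id|>\n\n${content}<|eot_id|>`;
  }
  // Cue the model to produce the assistant's turn next.
  prompt += "<|start_header_id|>assistant<|end_header_id|>\n\n";
  return prompt;
}

const exampleMessages = [
  { role: "system", content: "You are a helpful assistant." },
  { role: "user", content: "Tell me a joke." },
];
console.log(applyLlama3ChatTemplate(exampleMessages));
```

If you do need to generate from a plain string, I believe the pipeline also accepts a return_full_text: false option to strip the prompt from the output, but for a chatbot the messages format above is the intended path.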
