
Addressing Inconsistencies in Model Outputs: Understanding and Solutions

#5
by shivammehta - opened

When experimenting with this model, I've observed occasional discrepancies in its output. Sometimes it provides the correct response and sometimes it doesn't, even when presented with the same or similar questions. I have two questions: why does this occur, and how can we address it?
The agent's output goes into an infinite loop, with the LLM not making any changes to its reasoning, as can be seen in the highlighted block. This block keeps repeating until the agent runs out of iterations, so it never arrives at a final answer.
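One common cause of run-to-run variation is sampling: with temperature > 0 the model draws each token from a probability distribution rather than always taking the argmax, so identical prompts can produce different completions. A minimal sketch of temperature scaling in plain Python (the logits here are made up for illustration):

```python
import math
import random

def sample_token(logits, temperature=1.0, seed=None):
    """Sample an index from logits after temperature scaling.

    temperature -> 0 approaches greedy (argmax) decoding;
    higher temperatures flatten the distribution and add variance.
    """
    if temperature <= 0:
        # Greedy decoding: always pick the highest-scoring token.
        return max(range(len(logits)), key=lambda i: logits[i])
    # Softmax over temperature-scaled logits (max-subtracted for stability).
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Draw from the resulting distribution.
    rng = random.Random(seed)
    r = rng.random()
    cum = 0.0
    for i, p in enumerate(probs):
        cum += p
        if r <= cum:
            return i
    return len(probs) - 1

# With temperature 0 the same prompt always yields the same token,
# which is why lowering temperature reduces output inconsistency.
logits = [2.0, 1.0, 0.5]
print(sample_token(logits, temperature=0))  # always index 0
```

Setting a low temperature (as the code below does with 0.1) narrows this variance but does not eliminate it entirely.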

Code -
from huggingface_hub import hf_hub_download
from langchain.llms import LlamaCpp
from langchain.agents import create_csv_agent

MODEL_ID = "TheBloke/zephyr-7B-beta-GGUF"
MODEL_BASENAME = "zephyr-7b-beta.Q4_K_M.gguf"

CONTEXT_WINDOW_SIZE = 4096
MAX_NEW_TOKENS = 1024

# Download the quantized GGUF weights from the Hub (resumes partial downloads).
model_path = hf_hub_download(
    repo_id=MODEL_ID,
    filename=MODEL_BASENAME,
    resume_download=True,
    cache_dir="./models",
)

llm = LlamaCpp(
    model_path=model_path,
    temperature=0.1,
    n_ctx=CONTEXT_WINDOW_SIZE,
    max_tokens=MAX_NEW_TOKENS,
    n_batch=100,
    top_p=1,
    verbose=True,
    n_gpu_layers=100,
)

agent = create_csv_agent(
    llm,
    ["./Data/Employees.csv", "./Data/Verticals.csv"],
    verbose=True,
)
response = agent.run("Which vertical name has the most number of resignations")
print(response)
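The repeating block suggests the agent keeps emitting the identical Action/Action Input pair on every iteration. One way to catch this is to inspect the agent's accumulated intermediate steps for exact repeats; the helper below is a hedged sketch (the function name and the step format are illustrative, not a LangChain API):

```python
def is_looping(steps, window=3):
    """Return True if the last `window` (action, action_input) pairs are identical.

    `steps` is a list of (action, action_input) tuples, analogous to the
    intermediate steps an agent accumulates while reasoning.
    """
    if len(steps) < window:
        return False
    recent = steps[-window:]
    # All recent steps identical -> the agent is spinning without progress.
    return all(s == recent[0] for s in recent)

steps = [
    ("python_repl_ast", "df.groupby('vertical')"),
    ("python_repl_ast", "df.groupby('vertical')"),
    ("python_repl_ast", "df.groupby('vertical')"),
]
print(is_looping(steps))  # True
```

A check like this could be run between iterations to abort early instead of burning the full iteration budget on the same failing step.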

[Screenshot: the agent's highlighted reasoning block, repeating across iterations]

QUERY:

  1. How can the LLM's reasoning be corrected so that it recognizes its actions are
    being repeated without getting any closer to the right answer? Does the AgentExecutor
    need to be modified in this case? If so, what needs to be done?
