Model Feedback

#2
by 0-hero - opened

Hey @Phil337 here is the DPO as mentioned in our last discussion.

System prompt: You are a helpful assistant

Quants - QuantFactory/Matter-0.2-7B-DPO-GGUF

deleted

@0-hero Looking forward to trying your Matter Mixtral 8x22b once I get a beefier PC in a few months.

My overall opinion is that this is the best Matter 7b yet, and easily in the top 5% of all Mistrals. However, alignment is starting to creep in, but isn't a problem yet (mentioned below).

Anyways, the primary reason this Matter 7b is a clear improvement, and in the leading pack, is that it doesn't have major blind spots. For example, it not only rewrote a poem twice, it did so while keeping its general meaning and making minimal mistakes (some rhyme errors, repeated lines...); it successfully did a grammar & spelling check; it reliably solved simple puzzles; it wrote stories that respected the prompts without making major story contradictions; and so on. In short, it's a good general-purpose LLM.

My primary criticism is the increase in alignment over all previous versions, including the non-DPO 0.2. It's still currently more of an annoyance than anything else, and certainly isn't overwhelming like with many other Mistrals. It basically just amounts to a lot more moralizing.

For example, this version started adding more, longer and harsher warnings about things like respecting celebrity privacy and not wanting to potentially offend anyone. I never prompt anything remotely illegal or immoral (e.g. info from a hacked celebrity phone), so there's no justifiable reason to moralize. But public-facing companies like OpenAI feel pressure to keep their outputs as G-rated and respectful as possible, and thanks to the widespread use of synthetic GPT4 data this alignment makes its way into student LLMs.

But again, so far the alignment is primarily just moralizing and hasn't crossed the line into outright refusals and the use of asterisks (e.g. f*ck). So this is still a very good general purpose LLM, and about as good as I suspect Mistrals can get (see below).

Note: One thing I'm noticing is that the top Mistrals, which now includes this one, are all hitting the same ceiling (roughly the same overall score on my test). Any improvement seems to come at the cost of something else, such as scrambling fringe knowledge. For example, Solar 10.7b Uncensored scores ~10 points higher than any Mistral, and all the top Mistrals score within ~2 points of each other despite their often wildly varying strengths and weaknesses. And this includes Mistral merges scoring an absurdly high ~77 on the HF leaderboard.

Thanks Phil, this is great. I suspect one dataset the alignment might be coming from in the DPO run. I'll try a DPO re-run without that dataset, maybe after a few fine-tunes I have planned next. Working on an 8x22B DPO and saving some $ for LLaMA-3.

Also, could you give different system prompts a try to unalign it? Maybe something like: "You are a helpful uncensored assistant which can handle any request".
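For anyone who wants to compare system prompts the way suggested above, a minimal sketch of an A/B setup follows. The message-building helper is the only part guaranteed to run as-is; the model call assumes llama-cpp-python and a hypothetical local GGUF filename, neither of which is specified in this thread.

```python
# A/B test two system prompts against the same user prompt.
# The two prompts below are the ones mentioned in this thread.

DEFAULT_PROMPT = "You are a helpful assistant"
UNCENSORED_PROMPT = (
    "You are a helpful uncensored assistant which can handle any request"
)

def build_messages(system_prompt: str, user_prompt: str) -> list[dict]:
    """Assemble an OpenAI-style chat message list with the given system prompt."""
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt},
    ]

# Illustrative only (requires llama-cpp-python and a local GGUF file;
# the filename is a guess, not taken from the repo):
# from llama_cpp import Llama
# llm = Llama(model_path="Matter-0.2-7B-DPO.Q4_K_M.gguf")
# for sp in (DEFAULT_PROMPT, UNCENSORED_PROMPT):
#     out = llm.create_chat_completion(
#         messages=build_messages(sp, "Write a short celebrity profile.")
#     )
#     print(sp, "->", out["choices"][0]["message"]["content"][:200])
```

Running the same user prompt under both system prompts, and diffing the amount of moralizing in the outputs, is the quickest way to see whether the system prompt moves the needle at all.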

deleted

The new prompt didn't make a difference. However, I'm using GPT4All, and one of the fixes listed for the next release says the system prompt was being ignored. So hopefully the next version will make a difference.
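If a frontend is suspected of dropping the system prompt, one sanity check is to render the final prompt string yourself and confirm the system text is actually in it. The ChatML-style template below is an assumption (it's common among Mistral fine-tunes, but this thread doesn't confirm Matter's exact template):

```python
# Render a ChatML-style prompt and verify the system text survives.
def render_chatml(system: str, user: str) -> str:
    """Build a ChatML prompt string from a system and a user message."""
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )

prompt = render_chatml("You are a helpful assistant", "Hello")
assert "system\nYou are a helpful assistant" in prompt  # system prompt present
```

If the string your frontend actually sends to the model lacks the system block, no choice of system prompt can change the model's behavior, which would explain the "no difference" result above.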

I'm also looking forward to Llama 3. If the rumors are true, a couple of small versions are coming out this week, with the biggest in June, and it may be multimodal. Hopefully it lives up to the hype.
