Feedback

#1
by bobp - opened

First, let me say that Marlin v5 has been one of my favorite Nemo finetunes. It's like an improved version of Nemo-Instruct without going too crazy with an innate writing style. It can write about more topics in more interesting styles than the official instruct tune while keeping the rather neutral tone. In roleplays it follows the context well without assuming too much or jumping the gun to developments the user did not intend. Personally, I think it flew under a lot of people's radar as they were focused on the more well-known finetunes like the magnum series. I never tried v6 because there was no Q5/Q6 quant available for me, and v7 supposedly was not that good, so I'm comparing v8 to v5.

For v8 I'm using mradermacher's Q5_K_M. The overall writing style and verbosity are great. Compared to v5 it's less clinical and less dry and writes longer passages by default. Using this model in Tavern is very fun. However, the cost of being more fun is being slightly less neutral. Situations can escalate fast: simply touching another character's body might lead to a romantic or erotic development even though no feelings between the two characters have been implied yet. It's not too extreme though (especially compared to some other finetunes I've tried), so overall I would not consider it much of an issue.

What I do consider to be somewhat of an issue in v8 is that there seem to be some refusals/censorship that were not present at all in v5. Every time you prompt for something explicit there is a chance of refusal. It's not really an issue in Tavern use, but it is an issue in instruct use. For example, this extremely generic prompt for sexual content got me a refusal:

image.png

If we check the token probabilities, there's an 11.26% chance the reply will start with "I" (minP 0.05, temp 0.5), and if that happens there is a high chance it will continue into a refusal.

image.png
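For anyone who wants to run the same kind of check, here is a minimal sketch of how min-p filtering and temperature turn raw logits into the first-token probabilities a frontend displays. The logits are hypothetical values made up for illustration, not numbers from this model, and the sampler order (min-p first, then temperature) is an assumption, since backends differ on this.

```python
import math

def first_token_probs(logits, temperature=0.5, min_p=0.05):
    """Return post-sampler probabilities for each candidate first token.

    Assumption: min-p is applied to the untempered softmax first, then
    temperature is applied to the surviving logits. Some backends apply
    these in the opposite order, which changes the numbers.
    """
    # Softmax over the raw logits (shifted by the max for stability).
    top = max(logits.values())
    weights = {tok: math.exp(l - top) for tok, l in logits.items()}
    total = sum(weights.values())
    probs = {tok: w / total for tok, w in weights.items()}

    # min-p: drop tokens whose probability is below min_p * (top probability).
    threshold = min_p * max(probs.values())
    kept = {tok: logits[tok] for tok, p in probs.items() if p >= threshold}

    # Temperature: rescale the surviving logits, then renormalize.
    top = max(kept.values())
    weights = {tok: math.exp((l - top) / temperature) for tok, l in kept.items()}
    total = sum(weights.values())
    return {tok: w / total for tok, w in weights.items()}

# Hypothetical first-token logits, for illustration only.
logits = {"The": 2.0, "I": 1.0, "She": 0.5, "Sorry": -3.0}
probs = first_token_probs(logits)
for tok, p in sorted(probs.items(), key=lambda kv: -kv[1]):
    print(f"{tok!r}: {p:.2%}")
```

Whatever probability ends up on a token like "I" is roughly the share of generations that begin down that branch, which is why a refusal can keep showing up on rerolls even when the most likely token is fine.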

If the wording or the requested content is more explicit the chance of refusal can be a lot higher as well.
I think in general the problem is having this assistant-like replying trained in. Instead of simply giving you the content like v5 does, it will preface the answer.
If I prompt for something, I think the ideal reply just gives me the content I prompted for, without first saying "Okay, here is your content:" or adding something like "How was that, did you like what I wrote?" at the end.

Other than the refusals I think this is a really, really fun model and I think you are doing something really cool with the Marlin series overall. I wish you luck in your fine-tuning endeavors.

Thank you for the feedback! I really appreciate it.

> Compared to v5 it's less clinical and less dry and writes longer passages by default.

That'll be the difference between Nemo-Instruct, used for v5, and Nemo-Base, used for v8.

> I think in general the problem is having this assistant-like replying trained in

I've also seen more refusals in v8 - I suspect you're onto something and it's from the larger selection of Claude prompts in the training data. I'll take a pass thru them and see what I can dig up.
