ChuckMcSneed
/

DoubleGold-v0.5-123b-32k

	---
	license: llama2
	tags:
	- merge
	- mergekit
	---
	I merged [Aurelian](https://huggingface.co/grimulkan/aurelian-v0.5-70b-rope8-32K-fp16) with itself using [mergekit](https://github.com/cg123/mergekit), creating this EXTENDED LENGTH FRANKENSTEIN.
	# Does it work
	Yes, at 17k tokens it stays coherent, but starts to lose minor details of the story. Not sure how well it performs at 32k though. Quants have a significant impact on quality for this model: going from Q6_K to Q5_K caused a noticeable drop in quality.
	# Is it worth it
	Maybe? Depends? Do you hate mixtral? Do you have good hardware/patience? Do you need a somewhat smart model with 32k context?
	# Known issues
	VERY strict adherence to prompt format, forgetfulness, strong roleplay bias.
	# Personal opinion
	Dumber than Goliath, but has far fewer GPTisms. If you want a 32k Goliath, maybe try [Goliath-longLORA-120b-rope8-32k-fp16](https://huggingface.co/grimulkan/Goliath-longLORA-120b-rope8-32k-fp16).
	# Prompt format
	Same as [Aurelian 0.5](https://huggingface.co/grimulkan/aurelian-v0.5-70b-rope8-32K-fp16).
	```
	[INST] <<SYS>>
	System prompt, default is: An interaction between a user providing instructions, and an imaginative assistant providing responses.
	<</SYS>>
	</s><s>[INST] {Put your input text here.}
	[/INST] {Model output}
	```
	This model doesn't like it too much when you change the prompt, so even keeping that ```</s><s>``` is important.
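Since the format is this strict, it may be easier to build the prompt in code than to type it by hand. Below is a minimal sketch that fills in the template above literally. The names `build_prompt` and `DEFAULT_SYSTEM_PROMPT` are made up for illustration, and ending the string at `[/INST]` with no trailing space is an assumption; this card does not say whether your backend adds its own BOS token or expects a trailing space.

```python
# Default system prompt quoted from the template above.
DEFAULT_SYSTEM_PROMPT = (
    "An interaction between a user providing instructions, "
    "and an imaginative assistant providing responses."
)


def build_prompt(user_input: str, system_prompt: str = DEFAULT_SYSTEM_PROMPT) -> str:
    """Fill the template literally, keeping the `</s><s>` between the two blocks."""
    return (
        "[INST] <<SYS>>\n"
        f"{system_prompt}\n"
        "<</SYS>>\n"
        f"</s><s>[INST] {user_input}\n"
        "[/INST]"  # assumption: stop here and let the model write its output
    )


if __name__ == "__main__":
    # Hypothetical input text, only to show the assembled prompt.
    print(build_prompt("Continue the story from the last scene."))
```

Pass the returned string to whatever backend you use as a raw prompt, and check that the backend does not wrap it in a second chat template.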
	# Benchmarks
	### NeoEvalPlusN_benchmark
	[My meme benchmark.](https://huggingface.co/datasets/ChuckMcSneed/NeoEvalPlusN_benchmark)
	| Test name | Aurelian | DoubleGold |
	| --------- | -------- | ---------- |
	| B | 1 | 1 |
	| C | 1 | 1 |
	| D | 0 | 2 |
	| S | 2.5 | 3.25 |
	| P | 2.25 | 1.5 |
	| Total | 6.75 | 8.75 |