
Llama3 8B Wordcel

Wordcel is a Llama3 fine-tune intended to be used as a mid-training checkpoint for more specific RP/storywriting/creative applications.

It has been trained from Llama3 8B Base on a composite dataset of ~100M tokens that highlights reasoning, (uncensored) stories, classic literature, and assorted interpersonal intelligence tasks.

Components of the composite dataset include OpenHermes-2.5 and Grimulkan's Theory of Mind and Physical Reasoning datasets.

It was trained at a context length of 32k tokens, using linear RoPE scaling with a factor of 4.0. Derivative models should therefore be able to generalize to 32k-token contexts.
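As a minimal sketch, the checkpoint can be loaded with the Hugging Face transformers library and the linear RoPE scaling described above; the explicit rope_scaling override and the generation settings here are illustrative and should be checked against the repository's config.json.

```python
# Illustrative only: load jspr/llama3-wordcel with linear RoPE scaling (factor 4.0)
# as described in the card. Verify values against the checkpoint's config.json.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "jspr/llama3-wordcel"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,                      # weights are stored in BF16
    rope_scaling={"type": "linear", "factor": 4.0},  # linear RoPE scaling per the card
)

prompt = "Once upon a time,"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```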

If you train a model using this checkpoint, please give clear attribution! The Llama 3 base license likely applies.
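For downstream training from this checkpoint, a hedged sketch using a LoRA adapter via the peft library is shown below; the adapter hyperparameters are placeholders for illustration, not values used to train Wordcel.

```python
# Illustrative only: continue fine-tuning from this checkpoint with a LoRA adapter.
# The hyperparameters below are placeholders, not recommendations from this card.
import torch
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained(
    "jspr/llama3-wordcel",
    torch_dtype=torch.bfloat16,
)
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # sanity-check how many weights the adapter trains
```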

