# jayson
This model is a fine-tuned version of TheBloke/Mistral-7B-Instruct-v0.2-GPTQ on the saanvi-bot/test-json-data dataset.
It achieves the following results on the evaluation set:
- Loss: 0.6283
## Model description
This model builds on the Mistral-7B-Instruct architecture and is fine-tuned to convert unstructured text inputs into structured JSON output. The fine-tuning targets accurate, well-formed JSON for use cases in the shipping industry.
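To illustrate the task, here is a rough, hypothetical input/output pair; the shipment details and field names are invented for illustration and do not come from the training dataset:

```python
# Hypothetical illustration of the text-to-JSON task.
# The input text and field names are invented; actual dataset fields may differ.
text = "Shipment 4512 departed Rotterdam on 2024-05-01 with 20 containers for Acme Corp."

expected_output = {
    "shipment_id": "4512",
    "origin": "Rotterdam",
    "departure_date": "2024-05-01",
    "container_count": 20,
    "consignee": "Acme Corp",
}
```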
## How to use this model

Note: the snippet below runs as-is on Google Colab with a GPU runtime; you may also need the `auto-gptq` and `optimum` packages installed to load the GPTQ base model.
```python
from peft import PeftModel, PeftConfig
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the GPTQ-quantized base model
model_name = "TheBloke/Mistral-7B-Instruct-v0.2-GPTQ"
model = AutoModelForCausalLM.from_pretrained(model_name,
                                             device_map="auto",
                                             trust_remote_code=False,
                                             revision="main")

# Attach the fine-tuned PEFT (LoRA) adapter
config = PeftConfig.from_pretrained("saanvi-bot/jayson")
model = PeftModel.from_pretrained(model, "saanvi-bot/jayson")

# Load the tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_name, use_fast=True)

# Build the prompt in Mistral's [INST] ... [/INST] instruction format
comment = " <your text data> "
instructions_string = "convert into json format \n"
prompt_template = lambda comment: f'''[INST] {instructions_string} \n{comment} \n[/INST]'''
prompt = prompt_template(comment)

# Generate
model.eval()
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(input_ids=inputs["input_ids"].to("cuda"), max_new_tokens=512)
print(tokenizer.batch_decode(outputs)[0])
```
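Since the decoded generation echoes the prompt, you usually want only the text after `[/INST]`. A minimal post-processing sketch, assuming the model emits valid JSON (real outputs may occasionally need more robust extraction):

```python
import json

decoded = tokenizer.batch_decode(outputs, skip_special_tokens=True)[0]

# The generation echoes the prompt; keep only the text after [/INST]
response = decoded.split("[/INST]")[-1].strip()

try:
    data = json.loads(response)  # parsed shipping record as a Python dict
except json.JSONDecodeError:
    data = None                  # the model produced malformed JSON
print(data)
```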
## Intended uses & limitations

Intended uses:
- Automated data formatting: convert unstructured shipping text into structured JSON.
- Data integration: feed shipping data into databases and applications that expect JSON-formatted input.
- API development: build APIs that return JSON-formatted shipping data from textual descriptions (a minimal serving sketch follows this list).
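As a rough illustration of the API use case, the sketch below wraps the model loaded in the usage section in a FastAPI endpoint. The route, request schema, and server setup are hypothetical and not part of this model card:

```python
# Hypothetical FastAPI wrapper; reuses model, tokenizer, and prompt_template
# from the usage snippet above. Route and schema names are illustrative.
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class ShippingText(BaseModel):
    text: str

@app.post("/to-json")
def to_json(payload: ShippingText):
    prompt = prompt_template(payload.text)
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(input_ids=inputs["input_ids"].to("cuda"),
                             max_new_tokens=512)
    decoded = tokenizer.batch_decode(outputs, skip_special_tokens=True)[0]
    return {"result": decoded.split("[/INST]")[-1].strip()}
```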
## Training and evaluation data

The model was trained and evaluated on saanvi-bot/test-json-data, a user-provided text-to-JSON dataset from a shipping company. It pairs text descriptions of shipping activities with the JSON records the model learns to produce.
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0002
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 16 (train_batch_size × gradient_accumulation_steps)
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 2
- num_epochs: 10
- mixed_precision_training: Native AMP
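For reference, here is a minimal sketch of how these hyperparameters map onto `transformers.TrainingArguments`. The output directory is a placeholder, and the LoRA/PEFT adapter configuration is omitted because the card does not specify it:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./jayson-out",       # placeholder path
    learning_rate=2e-4,
    per_device_train_batch_size=4,   # train_batch_size
    per_device_eval_batch_size=4,    # eval_batch_size
    gradient_accumulation_steps=4,   # effective train batch size: 4 * 4 = 16
    num_train_epochs=10,
    lr_scheduler_type="linear",
    warmup_steps=2,
    seed=42,
    fp16=True,                       # Native AMP mixed precision
    optim="adamw_torch",             # Adam, betas=(0.9, 0.999), eps=1e-8
)
```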
### Training results
| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 0.9042        | 0.9231 | 3    | 0.7967          |
| 0.8196        | 1.8462 | 6    | 0.7345          |
| 0.7337        | 2.7692 | 9    | 0.6936          |
| 0.5122        | 4.0    | 13   | 0.6707          |
| 0.6736        | 4.9231 | 16   | 0.6567          |
| 0.6092        | 5.8462 | 19   | 0.6438          |
| 0.6016        | 6.7692 | 22   | 0.6356          |
| 0.4497        | 8.0    | 26   | 0.6303          |
| 0.5784        | 8.9231 | 29   | 0.6285          |
| 0.3899        | 9.2308 | 30   | 0.6283          |
### Framework versions
- PEFT 0.11.1
- Transformers 4.41.2
- Pytorch 2.3.0+cu121
- Datasets 2.20.0
- Tokenizers 0.19.1