
jayson

This model is a fine-tuned version of TheBloke/Mistral-7B-Instruct-v0.2-GPTQ on the saanvi-bot/test-json-data dataset.

It achieves the following results on the evaluation set:

  • Loss: 0.6283

Model description

This model builds on the Mistral 7B architecture and is fine-tuned to generate structured JSON output from unstructured text input. The fine-tuning targets use cases in the shipping industry: given relevant input text, the model produces the corresponding JSON data.

How to use this model

Note: This code runs on Google Colab.

from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the GPTQ-quantized base model
model_name = "TheBloke/Mistral-7B-Instruct-v0.2-GPTQ"
model = AutoModelForCausalLM.from_pretrained(model_name,
                                             device_map="auto",
                                             trust_remote_code=False,
                                             revision="main")

# Attach the fine-tuned adapter weights to the base model
model = PeftModel.from_pretrained(model, "saanvi-bot/jayson")

# Load the tokenizer of the base model
tokenizer = AutoTokenizer.from_pretrained(model_name, use_fast=True)

comment = " <your text data> "

# Build the prompt in the Mistral instruction format
instructions_string = "convert into json format \n"
prompt = f"[INST] {instructions_string} \n{comment} \n[/INST]"

model.eval()

# Tokenize and move the tensors to the device the model was loaded on
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)

# The decoded text contains the prompt followed by the generated output
print(tokenizer.batch_decode(outputs)[0])
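
Because the decoded text contains the prompt as well as the generated output, downstream code usually needs to isolate and validate the JSON part. The following is a minimal sketch of one way to do that, not part of the original model card: the helper name extract_json is hypothetical, and it assumes the completion follows the [/INST] marker and may end with the </s> token.

```python
import json


def extract_json(decoded: str):
    """Return the first JSON value found after the [/INST] marker, or None.

    Hypothetical helper: assumes the generated text follows "[/INST]".
    """
    # Keep only the text after the prompt, if the marker is present.
    completion = decoded.split("[/INST]", 1)[-1]
    # Drop the end-of-sequence token if the tokenizer left it in the text.
    completion = completion.replace("</s>", "").strip()
    decoder = json.JSONDecoder()
    # Try each "{" or "[" as a possible start of a JSON value.
    for start, char in enumerate(completion):
        if char in "{[":
            try:
                value, _ = decoder.raw_decode(completion[start:])
                return value
            except json.JSONDecodeError:
                continue
    return None
```

With the snippet above, this could be called as extract_json(tokenizer.batch_decode(outputs)[0]). A return value of None signals that no parseable JSON was found, which the caller should handle explicitly.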

Intended uses & limitations

Intended Uses:

  • Automated Data Formatting: Convert unstructured shipping data into structured JSON format.
  • Data Integration: Facilitate the integration of shipping data into databases and applications that require JSON-formatted data.
  • API Development: Use the model to build APIs that return JSON-formatted shipping data from textual descriptions.

Training and evaluation data

The model was trained and evaluated on a dataset containing text-to-JSON data of a shipping company, provided by the user. The dataset includes various text descriptions related to shipping activities, which the model learns to convert into JSON format.

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0002
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 16
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 2
  • num_epochs: 10
  • mixed_precision_training: Native AMP
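
Two of the values above follow from the others, and a short sketch makes the relationship explicit. It assumes a single device (consistent with 4 × 4 = 16) and a total of 30 optimizer steps, taken from the last row of the training results; the function names are illustrative and the schedule is a plain re-implementation of linear warmup followed by linear decay, not the training script itself.

```python
def effective_batch_size(per_device_batch, accumulation_steps, num_devices=1):
    """Number of examples contributing to one optimizer step."""
    return per_device_batch * accumulation_steps * num_devices


def linear_lr(step, base_lr=2e-4, warmup_steps=2, total_steps=30):
    """Linear warmup to base_lr, then linear decay to zero.

    total_steps=30 is an assumption read off the training results table.
    """
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    remaining = max(0, total_steps - step)
    return base_lr * remaining / (total_steps - warmup_steps)


if __name__ == "__main__":
    print("effective batch size:", effective_batch_size(4, 4))
    for step in (0, 1, 2, 16, 30):
        print(step, linear_lr(step))
```

Under these assumptions the learning rate peaks at 0.0002 after the 2 warmup steps and then falls linearly to zero at step 30.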

Training results

Training Loss   Epoch    Step   Validation Loss
0.9042          0.9231   3      0.7967
0.8196          1.8462   6      0.7345
0.7337          2.7692   9      0.6936
0.5122          4.0      13     0.6707
0.6736          4.9231   16     0.6567
0.6092          5.8462   19     0.6438
0.6016          6.7692   22     0.6356
0.4497          8.0      26     0.6303
0.5784          8.9231   29     0.6285
0.3899          9.2308   30     0.6283
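
The table can be checked with a few lines of code. The sketch below copies the rows as given and derives the best checkpoint, the step-to-step change in validation loss, and an estimate of the training set size. The size estimate is an inference, not a reported figure: it assumes the effective batch size of 16 listed under the hyperparameters.

```python
# (training_loss, epoch, step, validation_loss), copied from the results table.
rows = [
    (0.9042, 0.9231, 3, 0.7967),
    (0.8196, 1.8462, 6, 0.7345),
    (0.7337, 2.7692, 9, 0.6936),
    (0.5122, 4.0, 13, 0.6707),
    (0.6736, 4.9231, 16, 0.6567),
    (0.6092, 5.8462, 19, 0.6438),
    (0.6016, 6.7692, 22, 0.6356),
    (0.4497, 8.0, 26, 0.6303),
    (0.5784, 8.9231, 29, 0.6285),
    (0.3899, 9.2308, 30, 0.6283),
]

# Row with the lowest validation loss.
best = min(rows, key=lambda r: r[3])

# Drop in validation loss between consecutive evaluations.
improvements = [prev[3] - cur[3] for prev, cur in zip(rows, rows[1:])]

# Optimizer steps per epoch, from the final row (30 / 9.2308 ≈ 3.25).
steps_per_epoch = rows[-1][2] / rows[-1][1]

# Rough training set size, ASSUMING 16 examples per optimizer step.
approx_examples = round(steps_per_epoch * 16)

if __name__ == "__main__":
    print("best step:", best[2], "validation loss:", best[3])
    print("validation loss always decreasing:", all(d > 0 for d in improvements))
    print("approx. training examples per epoch:", approx_examples)
```

The validation loss falls at every evaluation, but the gains shrink from about 0.06 between the first two evaluations to about 0.0002 between the last two, which suggests the run had largely converged by step 30.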

Framework versions

  • PEFT 0.11.1
  • Transformers 4.41.2
  • Pytorch 2.3.0+cu121
  • Datasets 2.20.0
  • Tokenizers 0.19.1