
Model Card for Llama 3 Instruct Fine-Tuned on Deutsche Bahn FAQ

Model Overview

Model Name: islam-hajosman/llama3_instruct_fine_tuned_bahn_1k_v1_model
Architecture: Llama 3 Instruct
Quantization: 4-bit NF4 with double quantization
Domain-Specific Fine-Tuning Dataset: islam-hajosman/deutsche_bahn_faq_1k

This model has been fine-tuned to answer frequently asked questions (FAQ) from the Deutsche Bahn website. It was created as part of a Master's thesis project aimed at improving the model's domain-specific capabilities.

Fine-Tuning Configuration

Quantization Configuration

import torch
from transformers import BitsAndBytesConfig

# 4-bit NF4 quantization with nested (double) quantization; compute in bfloat16
quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16
)
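
For completeness, this is how the quantized base model would be loaded with the configuration above; the base checkpoint name below is an assumption, as the card only states "Llama 3 Instruct" with roughly 8B parameters.

from transformers import AutoModelForCausalLM, AutoTokenizer

base_model_id = "meta-llama/Meta-Llama-3-8B-Instruct"  # assumed base checkpoint

tokenizer = AutoTokenizer.from_pretrained(base_model_id)
model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    quantization_config=quantization_config,
    device_map="auto",
)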

LoRA Configuration

from peft import LoraConfig, get_peft_model

# LoRA adapters on all attention and MLP projection matrices
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.0,
    bias="none",
    target_modules=['q_proj', 'k_proj', 'v_proj', 'o_proj', 'gate_proj', 'up_proj', 'down_proj'],
    task_type="CAUSAL_LM"
)
model = get_peft_model(model, lora_config)
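
As a quick sanity check, PEFT can report the trainable-parameter share, which should roughly match the 0.915% figure in the training summary below.

# Prints trainable vs. total parameters, e.g. "trainable%: ~0.9"
model.print_trainable_parameters()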

Training Arguments

from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./llama3_instruct_fine_tuned_bahn_1k_v1_output",
    dataloader_drop_last=False,
    save_strategy="epoch",
    logging_strategy="steps",
    num_train_epochs=30,
    logging_steps=1,
    per_device_train_batch_size=8,
    gradient_accumulation_steps=8,  # effective batch size: 8 x 8 = 64
    optim="adamw_8bit",
    learning_rate=1e-4,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    bf16=True,
    weight_decay=0.0,
    run_name="llama3_instruct_fine_tuned_bahn_1k_v1_report",
    report_to="wandb"
)

Sequence Length Configuration

  • max_seq_length was set to 512, which covers 99.3% of the data; the 7 entries exceeding this length were truncated (see the sketch below).
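
A minimal training-loop sketch, assuming trl's SFTTrainer was used (the card does not name the trainer) and that the dataset exposes a "text" column:

from datasets import load_dataset
from trl import SFTTrainer

dataset = load_dataset("islam-hajosman/deutsche_bahn_faq_1k", split="train")

# Depending on the trl version, max_seq_length and dataset_text_field are
# passed to SFTTrainer directly (older releases) or via SFTConfig (newer).
trainer = SFTTrainer(
    model=model,
    args=training_args,
    train_dataset=dataset,
    tokenizer=tokenizer,
    max_seq_length=512,
    dataset_text_field="text",  # assumed column name
)
trainer.train()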

Hardware Used

  • GPU: 1x H100 (80 GB PCIe)
  • CPU: 26 cores
  • RAM: 205.4 GB
  • Storage: 1.1 TB SSD
  • Cost: $2.50 per hour

Training Summary

  • Trainable Parameters: 0.915% of the 8B base parameters
  • LoRA Adapter Size: 4.37 GB
  • Training Time and Cost: about 50 minutes, roughly $2
  • Steps per Epoch: 16 (1,024 samples, batch size 8, gradient accumulation 8; see the check below)
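
These figures can be cross-checked with a few lines of arithmetic, using the hourly rate above and the runtime from the training output below:

import math

samples, batch_size, grad_accum, epochs = 1024, 8, 8, 30
steps_per_epoch = math.ceil(samples / (batch_size * grad_accum))  # 16
total_steps = steps_per_epoch * epochs                            # 480, matches global_step below
cost = 3012.8 / 3600 * 2.50                                       # ~ $2.09 at $2.50/hour
print(steps_per_epoch, total_steps, round(cost, 2))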

Performance Metrics

  • Training Completed:
    • Global steps: 480 (30 epochs)
    • Final training loss: 0.2841
    • Train runtime: 3,012.8 s (≈ 50 minutes)
    • Throughput: 10.2 samples/s, 0.159 steps/s
    • Total FLOPs: 3.87 × 10^17

Weights & Biases Tracking

Training metrics were logged to Weights & Biases (report_to="wandb") under the run name llama3_instruct_fine_tuned_bahn_1k_v1_report.

Usage

To use this model, load it from the Hugging Face Hub under the name islam-hajosman/llama3_instruct_fine_tuned_bahn_1k_v1_model. It is optimized for answering Deutsche Bahn FAQ questions.

from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "islam-hajosman/llama3_instruct_fine_tuned_bahn_1k_v1_model"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

input_text = "Ihre Frage hier"  # your question here
inputs = tokenizer(input_text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=256)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)

print(response)
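
Since the base model is Llama 3 Instruct, applying the tokenizer's chat template usually gives better results than feeding raw text; a minimal sketch (the example question is illustrative):

# Build a chat-formatted prompt with the tokenizer's built-in template
messages = [
    {"role": "user", "content": "Wie kann ich mein Ticket stornieren?"}  # "How can I cancel my ticket?"
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)
outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))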