Uploaded model

  • Developed by: Enzo Palmisano
  • License: MIT
  • Finetuned from model: microsoft/Phi-3-mini-4k-instruct

Evaluation

For a detailed comparison of model performance, check out the Leaderboard for Italian Language Models.

Here's a breakdown of the performance metrics:

| Metric | hellaswag_it acc_norm | arc_it acc_norm | m_mmlu_it 5-shot acc | Average |
|---|---|---|---|---|
| Accuracy Normalized | 0.6088 | 0.4440 | 0.5667 | 0.5398 |
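
The Average column appears to be the unweighted mean of the three benchmark scores. A quick check, using only the values reported above (the variable names are illustrative):

```python
# Scores copied from the reported metrics.
scores = {
    "hellaswag_it acc_norm": 0.6088,
    "arc_it acc_norm": 0.4440,
    "m_mmlu_it 5-shot acc": 0.5667,
}

# Unweighted mean of the three benchmarks, rounded to four decimals.
average = round(sum(scores.values()) / len(scores), 4)
print(average)  # 0.5398
```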

How to Use

from transformers import AutoTokenizer, AutoModelForCausalLM, GenerationConfig
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

tokenizer = AutoTokenizer.from_pretrained("e-palmisano/Phi3-ITA-mini-4k-instruct", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("e-palmisano/Phi3-ITA-mini-4k-instruct", trust_remote_code=True)
model.to(device)


generation_config = GenerationConfig(
    penalty_alpha=0.6,  # Balances model confidence against the degeneration penalty; only used by contrastive search (do_sample=False).
    do_sample=True,  # Use sampling; greedy decoding is used otherwise.
    top_k=5,  # Number of highest-probability vocabulary tokens kept for top-k filtering.
    temperature=0.001,  # Modulates the next-token probabilities; a value this low makes sampling nearly deterministic.
    repetition_penalty=1.7,  # Repetition penalty; 1.0 means no penalty.
    max_new_tokens=64,  # Maximum number of tokens to generate, not counting the prompt.
    eos_token_id=tokenizer.eos_token_id,  # Id of the end-of-sequence token.
    pad_token_id=tokenizer.eos_token_id,  # Reuse the end-of-sequence id as the padding token id.
)


def generate_answer(question):
    messages = [
        {"role": "user", "content": question},
    ]
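    # add_generation_prompt=True appends the assistant turn marker so the model answers the question.
    model_inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(device)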
    outputs = model.generate(model_inputs, generation_config=generation_config)
    result = tokenizer.batch_decode(outputs, skip_special_tokens=True)[0]
    return result


# Italian prompt: "What is the most famous tower in Paris?"
question = """Quale è la torre più famosa di Parigi?"""
answer = generate_answer(question)
print(answer)
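
Because `model.generate` returns the prompt tokens followed by the newly generated ones, the string returned by `generate_answer` contains the question as well as the reply. A minimal sketch of trimming the prompt before decoding is shown below; `strip_prompt_tokens` is a hypothetical helper, not part of transformers, and the token ids are made up for illustration.

```python
def strip_prompt_tokens(output_ids, prompt_length):
    """Drop the first `prompt_length` token ids from each generated sequence."""
    return [list(sequence[prompt_length:]) for sequence in output_ids]


# Toy example with made-up token ids: 3 prompt tokens followed by 2 new tokens.
prompt_ids = [[101, 102, 103]]
generated = [[101, 102, 103, 7, 8]]

new_tokens = strip_prompt_tokens(generated, len(prompt_ids[0]))
print(new_tokens)  # [[7, 8]]
```

With the tensors from the example above, the same effect can be had by slicing directly, for instance `outputs[:, model_inputs.shape[1]:]`, before calling `tokenizer.batch_decode`.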

Model size: 3.82B params
Tensor type: BF16 (Safetensors)