
This model is trained from scratch (starting from an empty, untrained model) exclusively on Italian-language datasets (currently the Italian (it) portion of RedPajama 2023-14).

Training is ongoing and will be extended to new datasets.

More precise versions will be published shortly.

The model is trained on my own server. I studied and adapted the model starting from the repository https://github.com/karpathy/llama2.c

  • Llama model parameters (values for the 7B model in parentheses):
    • max_seq_len: (7b = 2048) The maximum sequence length for input data
    • dim: (7b = 4096) The dimensionality of the model (size of the embeddings and hidden states)
    • n_layers: (7b = 32) The number of transformer layers
    • n_heads: (7b = 32) The number of attention (query) heads
    • n_kv_heads: (7b = 32) The number of key and value heads
    • multiple_of: (7b = 256) A value used to round the SwiGLU hidden layer size up to a multiple of a large power of 2
  • Parameters of this model:
    • max_seq_len = 1024
    • dim = 768
    • n_layers = 32
    • n_heads = 32
    • n_kv_heads = 32
    • multiple_of = 32
  • Parameter counts (from the training log):
    • num decayed parameter tensors: 225, with 251,068,416 parameters
    • num non-decayed parameter tensors: 65, with 49,920 parameters
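
The parameter counts above can be reproduced from the configuration with a little arithmetic. The following is a minimal sketch, not the training code itself. It assumes a vocabulary of 32,000 tokens (the usual Llama 2 tokenizer size, which is not stated in this card), input and output embeddings that share one weight matrix, and n_kv_heads equal to n_heads. The function names `swiglu_hidden_dim` and `count_params` are hypothetical, and the hidden-size rounding rule follows what llama2.c appears to use.

```python
def swiglu_hidden_dim(dim: int, multiple_of: int) -> int:
    """Feed-forward hidden size: 2/3 of 4*dim, rounded up to a multiple of `multiple_of`."""
    hidden = int(2 * (4 * dim) / 3)
    return multiple_of * ((hidden + multiple_of - 1) // multiple_of)


def count_params(dim: int, n_layers: int, multiple_of: int, vocab_size: int):
    """Return (decayed, non_decayed) parameter counts.

    Assumes n_kv_heads == n_heads and tied input/output embeddings.
    """
    hidden = swiglu_hidden_dim(dim, multiple_of)
    attention = 4 * dim * dim        # wq, wk, wv, wo projections
    feed_forward = 3 * dim * hidden  # the three SwiGLU projections
    embeddings = vocab_size * dim    # token embeddings (shared with the output layer)
    decayed = n_layers * (attention + feed_forward) + embeddings
    # Two normalization weight vectors per layer plus one final normalization
    non_decayed = (2 * n_layers + 1) * dim
    return decayed, non_decayed


# vocab_size=32000 is an assumption, see the note above
decayed, non_decayed = count_params(dim=768, n_layers=32, multiple_of=32, vocab_size=32000)
print(f"decayed: {decayed:,}")          # decayed: 251,068,416
print(f"non-decayed: {non_decayed:,}")  # non-decayed: 49,920
```

Under these assumptions the result matches the counts listed above: 7 weight matrices per layer plus the embedding give 225 decayed tensors, and 2 normalization vectors per layer plus the final one give 65 non-decayed tensors.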

To just use the model, you can run:

  
  import torch
  from transformers import AutoTokenizer, AutoModelForCausalLM

  # Use the GPU when one is available, otherwise fall back to the CPU
  device = "cuda" if torch.cuda.is_available() else "cpu"

  # Load the model and tokenizer
  tokenizer_model = AutoTokenizer.from_pretrained("peruginia/Llama-2-Small")
  model = AutoModelForCausalLM.from_pretrained("peruginia/Llama-2-Small")
  model.to(device)

  # Define the prompt (Italian: "Alessandro is a boy who designs window frames")
  prompt = "Alessandro è un ragazzo che progetta Infissi"

  # Tokenize the prompt and move the tensors to the same device as the model
  inputs = tokenizer_model(prompt, return_tensors="pt").to(device)

  # Generate text by sampling
  output = model.generate(
      **inputs,
      do_sample=True,
      max_new_tokens=100,
      top_k=300,
      top_p=0.85,
      temperature=1.0,
      num_return_sequences=1,
  )

  # Decode and print the generated text
  generated_text = tokenizer_model.decode(output[0], skip_special_tokens=True)

  print(generated_text)