
Model details

GreekLlama-1.1B is a small base model with a custom Llama-like architecture and 1,100,048,384 (1.1B) parameters. It was pre-trained on a Wikipedia corpus of Greek and English text, mixed at a ratio of roughly 60/40 in favor of English. Training covered only about 1B tokens, well below the compute-optimal point for a model of this size, so its performance is limited; it is intended mainly for experimental purposes.

The model supports both Greek and English.
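To sanity-check the architecture and the exact parameter count quoted above, here is a minimal sketch (not part of the original card) using the standard transformers config API; instantiating from the config alone avoids downloading the weights:

# Sketch: verify architecture type and parameter count from the config
from transformers import AutoConfig, AutoModelForCausalLM

config = AutoConfig.from_pretrained("gsar78/GreekLlama-1.1B-base")
print(config.model_type)  # expected: a Llama-style architecture

# Instantiate with random weights just to count parameters (no download)
model = AutoModelForCausalLM.from_config(config)
print(f"{sum(p.numel() for p in model.parameters()):,}")  # expected 1,100,048,384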

Usage:

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("gsar78/GreekLlama-1.1B-base")
model = AutoModelForCausalLM.from_pretrained("gsar78/GreekLlama-1.1B-base")

# Generate text
input_text = "Η Ελλάδα είναι "
inputs = tokenizer(input_text, return_tensors="pt")

# Ensure the inputs are moved to the same device as the model
inputs = {key: value.to(model.device) for key, value in inputs.items()}

# Generate output
outputs = model.generate(**inputs, max_length=500)

# Decode the generated text
generated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(generated_text)
# Output:
Η Ελλάδα είναι μια χώρα της Ευρώπης που βρίσκεται στην κεντρική Ευρώπη. Έχει έκταση 9.700 τ.χλμ. και πληθυσμό 10.500.000 κατοίκους (2011). Πρωτεύουσα της χώρας είναι η Αθήνα. Η Ελλάδα είναι μια πολυεθνική χώρα με πληθυσμό που περιλαμβάνει περίπου 10 εκατομμύρια κατοίκους σε περισσότερες από 100 χώρες. Η Ελλάδα είναι μια πολυμορφική χώρα με πολλές διαφορετικές γλώσσες και πολλές πολιτιστικές παραδόσεις. Η Ελλάδα είναι μια πολυπολιτισμική χώρα με πολλές πολιτιστικές παραδόσεις και πολλές γλώσσες. Η

# English translation of the Greek sample output (the repetition and factual
# errors are typical of an undertrained model of this size):
# "Greece is a country of Europe located in central Europe. It has an area of
# 9,700 sq. km and a population of 10,500,000 inhabitants (2011). The capital
# of the country is Athens. Greece is a multiethnic country with a population
# of about 10 million inhabitants in more than 100 countries. Greece is a
# polymorphic country with many different languages and many cultural
# traditions. Greece is a multicultural country with many cultural traditions
# and many languages. The"
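If a GPU is available, a hedged variation of the snippet above moves the model to the GPU and samples instead of greedy decoding, which tends to reduce the repetition visible in the sample output. The generation settings here are illustrative assumptions, not recommendations from the card; model and tokenizer are the objects loaded above:

# Sketch: GPU placement and sampled generation (settings are illustrative)
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
model = model.to(device)

inputs = tokenizer("Η Ελλάδα είναι ", return_tensors="pt").to(device)
outputs = model.generate(
    **inputs,
    max_new_tokens=200,  # counts only generated tokens, unlike max_length
    do_sample=True,      # sampling instead of greedy decoding
    temperature=0.7,
    top_p=0.9,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))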

Alternatively, run this Colab notebook: https://colab.research.google.com/drive/1fEqY_vzPuvQG2SG3W_SgaP4B8zcJvYwX?usp=sharing

Model weights: Safetensors format, 1.1B parameters, BF16 tensor type.