llm.c checkpoint: GPT-2 774M

This is a HF/safetensors conversion of the llm.c checkpoint of a 774M parameter run on 150B tokens from FineWeb.

Training was conducted on a single 8xA100 80GB SXM node for ~6 days.

See discussion on GitHub for more information.

Downloads last month: 87

Safetensors

Model size

774M params

Tensor type

BF16

Inference Examples

Text Generation

This model does not have enough activity to be deployed to Inference API (serverless) yet. Increase its social visibility and check back later, or deploy to Inference Endpoints (dedicated) instead.

Dataset used to train mdouglas/llmc-gpt2-774M-150B

Collection including mdouglas/llmc-gpt2-774M-150B

llm.c

Collection

Models trained with llm.c • 2 items • Updated Jun 12