---
base_model: unsloth/Meta-Llama-3.1-8B-bnb-4bit
library_name: peft
license: apache-2.0
datasets:
- Respair/sharegpt_chatml_compressed
- diwank/llmlingua-compressed-text
- AlexMaclean/wikipedia-deletion-compressions
- AlexMaclean/all-deletion-compressions
- sentence-transformers/sentence-compression
language:
- en
tags:
- compression
---

# Model Card for Memories - Token Compressor for Long-Range Dependency Conversations

## Model Details

### Model Description

This model is a fine-tuned version of the Llama 3.1 8B 4-bit model, trained specifically for token compression tasks. It uses LoRA (Low-Rank Adaptation) for parameter-efficient fine-tuning while preserving the base model's general capabilities.

- **Developed by:** Alosh Denny
- **Funded by:** None
- **Shared by:** None
- **Model type:** Token Compressor for Memories
- **Language(s) (NLP):** English
- **License:** apache-2.0

## Uses

### Direct Use

This model is designed for token compression: given an input text, it generates a more concise version while preserving the essential meaning.

### Downstream Use

The compressed outputs can be used in NLP applications where text length is a constraint, such as summarization pipelines, efficient text storage, or as input to other language models with limited context windows.

### Out-of-Scope Use

This model should not be used for tasks that require the original text to be preserved verbatim or where nuanced details are critical. It is not suitable for legal, medical, or other domains where precise wording is essential.

## Bias, Risks, and Limitations

- The model may inadvertently remove important context or nuance during compression.
- Biases may be inherited from the base Llama 3.1 model or introduced during fine-tuning.
- Performance may vary with the domain and complexity of the input text.

### Recommendations

- Review compressed outputs for accuracy and appropriateness before use in critical applications.
- Test the model on a diverse range of inputs to understand its behavior across different text types and domains.

## How to Get Started with the Model

Use the code below to get started with the model. An illustrative loading sketch is provided after the Preprocessing subsection under Training Details.

## Training Details

### Training Data

The model was trained on a dataset compiled from the following sources:

- Respair/sharegpt_chatml_compressed
- diwank/llmlingua-compressed-text
- AlexMaclean/wikipedia-deletion-compressions
- AlexMaclean/all-deletion-compressions
- sentence-transformers/sentence-compression

### Training Procedure

#### Preprocessing

Prompt-response pairs were extracted from the datasets above and compiled into a single dataset (available at https://huggingface.co/datasets/aoxo/token_compressor). Unwanted characters, trailing whitespace, and quotation marks were removed.
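The cleaning pass might look like the sketch below. This is an illustrative reconstruction, not the published preprocessing script; the `prompt`/`response` column names are assumptions.

```python
import re

from datasets import load_dataset

# Load the compiled compression dataset linked above.
dataset = load_dataset("aoxo/token_compressor", split="train")

def clean_text(text: str) -> str:
    """Strip quotation marks, other unwanted characters, and trailing whitespace."""
    text = re.sub(r'["\u2018\u2019\u201c\u201d]', "", text)  # remove straight and curly quotes
    text = re.sub(r"[^\x20-\x7E\n]", "", text)               # drop non-printable characters
    return text.rstrip()                                      # trim trailing whitespace

# Column names ("prompt", "response") are assumed here.
dataset = dataset.map(
    lambda ex: {
        "prompt": clean_text(ex["prompt"]),
        "response": clean_text(ex["response"]),
    }
)
```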
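As referenced in "How to Get Started with the Model" above, the following is a minimal loading sketch using `transformers` and `peft`. The adapter repository ID and the prompt format are placeholders, not confirmed by the card; substitute this repository's actual ID.

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "unsloth/Meta-Llama-3.1-8B-bnb-4bit"
adapter_id = "aoxo/token-compressor"  # placeholder: use this repository's actual ID

tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(base_id, device_map="auto")
model = PeftModel.from_pretrained(model, adapter_id)  # attach the LoRA adapter

text = "Long conversational passage to be compressed..."
prompt = f"Compress the following text while preserving its meaning:\n{text}"  # assumed prompt format
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```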
#### Training Hyperparameters

- **Training regime:** bf16 mixed precision
- **Optimizer:** paged_adamw_8bit
- **Learning rate:** 2e-4
- **LR scheduler:** cosine
- **Batch size:** 4 per device
- **Gradient accumulation steps:** 16
- **Number of epochs:** 10
- **Max steps:** 700,472

#### LoRA Configuration

- **r:** 8
- **lora_alpha:** 16
- **lora_dropout:** 0.05
- **bias:** none
- **task_type:** CAUSAL_LM

An illustrative `LoraConfig`/`TrainingArguments` sketch matching these values is provided at the end of this card.

#### Speeds, Sizes, Times

- **Total training compute throughput:** 8.62 PFLOPS
- **Total logged training time:** 1,422.31 hours
- **Start time:** 2024-07-21 02:02:32
- **End time:** 2024-09-18 08:21:08
- **Checkpoint size (adapter):** 13,648,432 bytes (~13.6 MB)

## Evaluation

### Testing Data, Factors & Metrics

#### Testing Data

[More Information Needed]

#### Factors

[More Information Needed]

#### Metrics

[More Information Needed]

### Results

[More Information Needed]

#### Summary

[More Information Needed]

## Model Examination

[More Information Needed]

## Environmental Impact

Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).

- **Hardware Type:** [More Information Needed]
- **Hours used:** ~1,422 (total logged training time; see above)
- **Cloud Provider:** [More Information Needed]
- **Compute Region:** [More Information Needed]
- **Carbon Emitted:** [More Information Needed]

## Technical Specifications

### Model Architecture and Objective

[More Information Needed]

### Compute Infrastructure

[More Information Needed]

#### Hardware

[More Information Needed]

#### Software

[More Information Needed]

## Citation

**BibTeX:**

[More Information Needed]

**APA:**

[More Information Needed]

## Glossary

[More Information Needed]

## More Information

[More Information Needed]

## Model Card Authors

[More Information Needed]

## Model Card Contact

[More Information Needed]

### Framework versions

- PEFT 0.12.0
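### Illustrative Training Configuration

As referenced in the LoRA Configuration section, the hyperparameters listed under Training Details might be expressed in code as follows. This is a sketch for reference only, not the authors' published training script; the `output_dir` value is an assumption.

```python
from peft import LoraConfig
from transformers import TrainingArguments

# LoRA settings, matching the "LoRA Configuration" section above.
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
)

# Optimization settings, matching "Training Hyperparameters" above.
training_args = TrainingArguments(
    output_dir="token-compressor",  # assumed output path
    per_device_train_batch_size=4,
    gradient_accumulation_steps=16,
    learning_rate=2e-4,
    lr_scheduler_type="cosine",
    num_train_epochs=10,
    max_steps=700_472,              # overrides num_train_epochs when set
    bf16=True,                      # bf16 mixed precision
    optim="paged_adamw_8bit",
)
```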