---
base_model:
- yuvraj17/EvolCodeLlama-3.1-8B-Instruct
- yzhuang/Meta-Llama-3-8B-Instruct_fictional_gsm8k_English_v1
tags:
- merge
- mergekit
- lazymergekit
- yuvraj17/EvolCodeLlama-3.1-8B-Instruct
- yzhuang/Meta-Llama-3-8B-Instruct_fictional_gsm8k_English_v1
---

# Llama3-8B-Instruct-Slerp

Llama3-8B-Instruct-Slerp is a SLERP merge of the following models, created with [LazyMergekit](https://colab.research.google.com/drive/1obulZ1ROXHjYLn6PPZJwRR6GzgQogxxb?usp=sharing):
* [yuvraj17/EvolCodeLlama-3.1-8B-Instruct](https://huggingface.co/yuvraj17/EvolCodeLlama-3.1-8B-Instruct)
* [yzhuang/Meta-Llama-3-8B-Instruct_fictional_gsm8k_English_v1](https://huggingface.co/yzhuang/Meta-Llama-3-8B-Instruct_fictional_gsm8k_English_v1)

## 🧩 Configuration

```yaml
slices:
  - sources:
      - model: yuvraj17/EvolCodeLlama-3.1-8B-Instruct
        layer_range: [0, 32]
      - model: yzhuang/Meta-Llama-3-8B-Instruct_fictional_gsm8k_English_v1
        layer_range: [0, 32]
merge_method: slerp
base_model: yuvraj17/EvolCodeLlama-3.1-8B-Instruct
parameters:
  t:
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 1]
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0]
    - value: 0.5
dtype: float16
```
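
In this configuration, `t` controls the interpolation between the two models (roughly, `t = 0` keeps the base model's weights and `t = 1` the other model's), with separate schedules for the self-attention and MLP sub-layers. As a rough illustration of the mechanics, and not mergekit's exact implementation, the sketch below shows SLERP between two weight tensors plus the piecewise-linear schedule that spreads the anchor values in `t` across the 32 layers (the helper names are hypothetical):

```python
# Illustrative sketch only: mergekit's real SLERP handles normalization and
# degenerate cases differently. `slerp` and `t_for_layer` are hypothetical helpers.
import numpy as np

def slerp(t: float, v0: np.ndarray, v1: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Spherically interpolate between two weight tensors at fraction t."""
    a, b = v0.ravel(), v1.ravel()
    cos_omega = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + eps)
    omega = np.arccos(np.clip(cos_omega, -1.0, 1.0))  # angle between the tensors
    if omega < eps:
        # Nearly parallel tensors: plain linear interpolation is numerically safer.
        return (1.0 - t) * v0 + t * v1
    s = np.sin(omega)
    return (np.sin((1.0 - t) * omega) / s) * v0 + (np.sin(t * omega) / s) * v1

def t_for_layer(layer: int, anchors: list[float], num_layers: int = 32) -> float:
    """Spread the anchor values evenly over the layer range, then interpolate."""
    positions = np.linspace(0, num_layers - 1, num=len(anchors))
    return float(np.interp(layer, positions, anchors))

# Self-attention blend factor at layer 8 under the schedule above:
print(t_for_layer(8, [0, 0.5, 0.3, 0.7, 1]))  # ~0.49: a near-equal blend
```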

## 💻 Usage

```python
# pip install -qU transformers accelerate

from transformers import AutoTokenizer
import transformers
import torch

model = "yuvraj17/Llama3-8B-Instruct-Slerp"
messages = [{"role": "user", "content": "What is a large language model?"}]

# Format the conversation with the model's chat template before generating.
tokenizer = AutoTokenizer.from_pretrained(model)
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

# Load the merged model in float16 and shard it across available devices.
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    torch_dtype=torch.float16,
    device_map="auto",
)

outputs = pipeline(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print(outputs[0]["generated_text"])
```
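
Because `do_sample=True`, generations vary between runs; lower the `temperature` or set `do_sample=False` for more deterministic output. Note that the pipeline's `generated_text` includes the prompt by default; pass `return_full_text=False` to receive only the completion.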

## 🏆 Evaluation Scores

### Nous

|Model|AGIEval|TruthfulQA|BigBench|
|---|---:|---:|---:|
|[yuvraj17/Llama3-8B-Instruct-Slerp](https://huggingface.co/yuvraj17/Llama3-8B-Instruct-Slerp)|38.32|57.15|43.91|


#### AGIEval
|             Task             |Version| Metric  | Value |   | Stderr |
|------------------------------|------:|---------|------:|---|-------:|
| agieval_aqua_rat              |      0| acc     | 23.62 |±  |  2.67  |
|                              |       | acc_norm| 22.05 |±  |  2.61  |
| agieval_logiqa_en             |      0| acc     | 27.50 |±  |  1.75  |
|                              |       | acc_norm| 31.80 |±  |  1.83  |
| agieval_lsat_ar               |      0| acc     | 21.30 |±  |  2.71  |
|                              |       | acc_norm| 20.87 |±  |  2.69  |
| agieval_lsat_lr               |      0| acc     | 35.29 |±  |  2.12  |
|                              |       | acc_norm| 37.65 |±  |  2.15  |
| agieval_lsat_rc               |      0| acc     | 42.01 |±  |  3.01  |
|                              |       | acc_norm| 39.78 |±  |  2.99  |
| agieval_sat_en                |      0| acc     | 55.83 |±  |  3.47  |
|                              |       | acc_norm| 50.49 |±  |  3.49  |
| agieval_sat_en_without_passage|      0| acc     | 36.89 |±  |  3.37  |
|                              |       | acc_norm| 34.95 |±  |  3.33  |
| agieval_sat_math              |      0| acc     | 29.55 |±  |  3.08  |
|                              |       | acc_norm| 28.64 |±  |  3.05  |

**Average score**: 33.28% (mean of the `acc_norm` values above)

#### TruthfulQA

|        Task         |Version| Metric | Value |   | Stderr |
|---------------------|------:|--------|------:|---|-------:|
| truthfulqa_mc       |      1| mc1    | 33.54 |±  |  1.65  |
|                     |       | mc2    | 49.78 |±  |  1.53  |

**Average score**: 49.78%

#### BigBench

|                Task                |Version|        Metric         | Value |   | Stderr |
|------------------------------------|------:|-----------------------|------:|---|-------:|
| bigbench_causal_judgement          |      0| multiple_choice_grade  | 47.89 |±  |  3.63  |
| bigbench_date_understanding        |      0| multiple_choice_grade  | 39.02 |±  |  2.54  |
| bigbench_disambiguation_qa         |      0| multiple_choice_grade  | 33.72 |±  |  2.95  |
| bigbench_geometric_shapes          |      0| multiple_choice_grade  | 20.61 |±  |  2.14  |
| bigbench_logical_deduction_five_objects|  0| multiple_choice_grade  | 31.40 |±  |  2.08  |
| bigbench_logical_deduction_seven_objects| 0| multiple_choice_grade  | 23.71 |±  |  1.61  |
| bigbench_logical_deduction_three_objects| 0| multiple_choice_grade  | 47.00 |±  |  2.89  |
| bigbench_movie_recommendation      |      0| multiple_choice_grade  | 27.40 |±  |  1.99  |
| bigbench_navigate                  |      0| multiple_choice_grade  | 50.10 |±  |  1.58  |
| bigbench_reasoning_about_colored_objects| 0| multiple_choice_grade  | 38.40 |±  |  1.09  |
| bigbench_ruin_names                |      0| multiple_choice_grade  | 27.23 |±  |  2.11  |
| bigbench_salient_translation_error_detection| 0| multiple_choice_grade  | 25.45 |±  |  1.38  |
| bigbench_snarks                    |      0| multiple_choice_grade  | 46.41 |±  |  3.72  |
| bigbench_sports_understanding      |      0| multiple_choice_grade  | 50.30 |±  |  1.59  |
| bigbench_temporal_sequences        |      0| multiple_choice_grade  | 37.30 |±  |  1.53  |
| bigbench_tracking_shuffled_objects_five_objects| 0| multiple_choice_grade  | 21.36 |±  |  1.16  |
| bigbench_tracking_shuffled_objects_seven_objects| 0| multiple_choice_grade  | 17.14 |±  |  0.90  |
| bigbench_tracking_shuffled_objects_three_objects| 0| multiple_choice_grade  | 47.00 |±  |  2.89  |

**Average score**: 35.38%
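
The tables above are in lm-evaluation-harness output format (the Nous benchmark suite). For reference, here is a rough sketch of re-scoring one of these tasks with the harness's Python API, assuming a recent `lm-eval` (v0.4 or later). The tables use an older fork's task names (e.g. `truthfulqa_mc`), so task names and scores may not line up exactly with current releases:

```python
# Hedged reproduction sketch: harness versions differ in task names and
# scoring details, so results may not match the tables above exactly.
# pip install -qU lm-eval

import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=yuvraj17/Llama3-8B-Instruct-Slerp,dtype=float16",
    tasks=["truthfulqa_mc2"],  # the older harness reported this as truthfulqa_mc / mc2
    batch_size=8,
)
print(results["results"]["truthfulqa_mc2"])
```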