---
library_name: transformers
tags:
- orpo
- llama 3
license: other
datasets:
- mlabonne/orpo-dpo-mix-40k
language:
- en
---

# OrpoLlama-3-8B

![](https://i.imgur.com/ZHwzQvI.png)

This is a quick ORPO fine-tune of [meta-llama/Meta-Llama-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B) on 1,000 samples of [mlabonne/orpo-dpo-mix-40k](https://huggingface.co/datasets/mlabonne/orpo-dpo-mix-40k), created for [this article](https://huggingface.co/blog/mlabonne/orpo-llama-3).
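
The exact training script is in the article linked above; as a rough sketch, a comparable run with TRL's `ORPOTrainer` might look like the following (hyperparameters here are illustrative placeholders, not the article's exact values):

```python
# Illustrative sketch only: hyperparameters are placeholders, not the exact
# values used to train OrpoLlama-3-8B (see the article for the real script).
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import ORPOConfig, ORPOTrainer

model_id = "meta-llama/Meta-Llama-3-8B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
tokenizer.pad_token = tokenizer.eos_token  # Llama 3 ships without a pad token
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

# 1k samples of the preference mix, as in the article
dataset = load_dataset("mlabonne/orpo-dpo-mix-40k", split="train")
dataset = dataset.shuffle(seed=42).select(range(1000))

# NOTE: ORPOTrainer expects prompt/chosen/rejected pairs; depending on your
# TRL version, the raw dataset may need chat-template formatting first.
config = ORPOConfig(
    output_dir="OrpoLlama-3-8B",
    beta=0.1,                     # weight of the odds-ratio penalty (placeholder)
    learning_rate=8e-6,           # placeholder
    max_length=1024,
    max_prompt_length=512,
    per_device_train_batch_size=2,
    num_train_epochs=1,
)

trainer = ORPOTrainer(
    model=model,
    args=config,
    train_dataset=dataset,
    tokenizer=tokenizer,
)
trainer.train()
```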

## πŸ† Evaluation

### Nous

Evaluation performed with [LLM AutoEval](https://github.com/mlabonne/llm-autoeval); see the entire leaderboard [here](https://huggingface.co/spaces/mlabonne/Yet_Another_LLM_Leaderboard).

| Model                                                                                                                                                                     |   Average |   AGIEval |   GPT4All | TruthfulQA |  Bigbench |
| ------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | --------: | --------: | --------: | ---------: | --------: |
| [teknium/OpenHermes-2.5-Mistral-7B](https://huggingface.co/teknium/OpenHermes-2.5-Mistral-7B) [πŸ“„](https://gist.github.com/mlabonne/88b21dd9698ffed75d6163ebdc2f6cc8)     |     52.42 |     42.75 |     72.99 |      52.99 |     40.94 |
| [meta-llama/Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) [πŸ“„](https://gist.github.com/mlabonne/8329284d86035e6019edb11eb0933628) |     51.34 |     41.22 |     69.86 |      51.65 |     42.64 |
| [mistralai/Mistral-7B-Instruct-v0.1](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.1) [πŸ“„](https://gist.github.com/mlabonne/7a0446c3d30dfce72834ef780491c4b2)   |     49.15 |     33.36 |     67.87 |      55.89 |     39.48 |
| [**mlabonne/OrpoLlama-3-8B**](https://huggingface.co/mlabonne/OrpoLlama-3-8B) [πŸ“„](https://gist.github.com/mlabonne/f41dad371d1781d0434a4672fd6f0b82)                     | **46.76** | **31.56** | **70.19** |  **48.11** | **37.17** |
| [meta-llama/Meta-Llama-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B) [πŸ“„](https://gist.github.com/mlabonne/616b6245137a9cfc4ea80e4c6e55d847)                   |     45.42 |      31.1 |     69.95 |      43.91 |      36.7 |
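
For reference, the Average column is the plain mean of the four benchmark scores. For the OrpoLlama-3-8B row:

```python
# Sanity check: the Nous "Average" is the mean of the four benchmarks
scores = [31.56, 70.19, 48.11, 37.17]  # AGIEval, GPT4All, TruthfulQA, Bigbench
print(f"{sum(scores) / len(scores):.4f}")  # 46.7575, reported as 46.76
```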

## πŸ“ˆ Training curves

![](https://i.imgur.com/r78hGrl.png)

## πŸ’» Usage

```python
# Install dependencies first: pip install -qU transformers accelerate

from transformers import AutoTokenizer
import transformers
import torch

model = "mlabonne/OrpoLlama-3-8B"
messages = [{"role": "user", "content": "What is a large language model?"}]

# Format the conversation with Llama 3's chat template
tokenizer = AutoTokenizer.from_pretrained(model)
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

# Load the model in float16 and shard it across available devices
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    torch_dtype=torch.float16,
    device_map="auto",
)

# Sample up to 256 new tokens
outputs = pipeline(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print(outputs[0]["generated_text"])
```
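
If you are short on VRAM, a 4-bit variant of the same pipeline is sketched below (this assumes `bitsandbytes` is installed and is not part of the original setup):

```python
# Optional: load the model in 4-bit to reduce VRAM usage
# (requires: pip install -qU bitsandbytes)
import torch
import transformers
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "mlabonne/OrpoLlama-3-8B"
bnb_config = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dtype=torch.float16)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)

pipeline = transformers.pipeline("text-generation", model=model, tokenizer=tokenizer)
```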