File size: 9,826 Bytes
b4b58f0
1bd57a0
 
e22d508
 
1bd57a0
 
 
 
 
e22d508
 
 
 
 
1bd57a0
 
 
e22d508
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
42cd355
 
 
 
 
cf59add
941f7de
3dc4614
 
8c9aac4
941f7de
cf59add
941f7de
42cd355
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
941f7de
42cd355
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
dceb9d4
42cd355
941f7de
 
 
 
 
 
 
 
 
 
 
d359a0b
941f7de
 
 
 
 
 
 
 
 
 
 
 
42cd355
 
 
 
 
 
 
 
 
 
dceb9d4
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
941f7de
dceb9d4
7988deb
dceb9d4
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
e22d508
 
 
 
 
 
 
 
 
 
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
---
language:
- en
license: mit
library_name: transformers
tags:
- orpo
- qwen2
- sft
- chatml
base_model:
- MaziyarPanahi/calme-2.4-rys-78b
datasets:
- mlabonne/orpo-dpo-mix-40k
pipeline_tag: text-generation
inference: false
model_creator: dfurman
quantized_by: dfurman
model-index:
- name: CalmeRys-78B-Orpo-v0.1
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: IFEval (0-Shot)
      type: HuggingFaceH4/ifeval
      args:
        num_few_shot: 0
    metrics:
    - type: inst_level_strict_acc and prompt_level_strict_acc
      value: 81.63
      name: strict accuracy
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=dfurman/CalmeRys-78B-Orpo-v0.1
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: BBH (3-Shot)
      type: BBH
      args:
        num_few_shot: 3
    metrics:
    - type: acc_norm
      value: 61.92
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=dfurman/CalmeRys-78B-Orpo-v0.1
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MATH Lvl 5 (4-Shot)
      type: hendrycks/competition_math
      args:
        num_few_shot: 4
    metrics:
    - type: exact_match
      value: 37.92
      name: exact match
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=dfurman/CalmeRys-78B-Orpo-v0.1
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GPQA (0-shot)
      type: Idavidrein/gpqa
      args:
        num_few_shot: 0
    metrics:
    - type: acc_norm
      value: 20.02
      name: acc_norm
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=dfurman/CalmeRys-78B-Orpo-v0.1
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MuSR (0-shot)
      type: TAUR-Lab/MuSR
      args:
        num_few_shot: 0
    metrics:
    - type: acc_norm
      value: 36.37
      name: acc_norm
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=dfurman/CalmeRys-78B-Orpo-v0.1
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU-PRO (5-shot)
      type: TIGER-Lab/MMLU-Pro
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 66.8
      name: accuracy
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=dfurman/CalmeRys-78B-Orpo-v0.1
      name: Open LLM Leaderboard
---


# dfurman/CalmeRys-78B-Orpo-v0.1

This model is a finetune of `MaziyarPanahi/calme-2.4-rys-78b` on 1.5k rows of the `mlabonne/orpo-dpo-mix-40k` dataset. It was trained as a generalist language model for a variety of text generation use cases, including support of agentic capabilities, roleplaying, reasoning, multi-turn conversations, long context coherence, and more.

Thanks go out to [mlabonne](https://huggingface.co/mlabonne), [MaziyarPanahi](https://huggingface.com/MaziyarPanahi), et al. for the source dataset and base model.

## 🦾 Training

You can find the experiment on W&B at this [link](https://wandb.ai/dryanfurman/huggingface/runs/1w50nu70?nw=nwuserdryanfurman). Here are a few visualizations:

![image/png](https://cdn-uploads.huggingface.co/production/uploads/62afc20ca5bd7cef3e1ab3f4/NG5WGL0ljzLsNhSBRVqnD.png)

![image/png](https://cdn-uploads.huggingface.co/production/uploads/62afc20ca5bd7cef3e1ab3f4/Zhk5Bpr1I2NrzX98Bhtp8.png)

![image/png](https://cdn-uploads.huggingface.co/production/uploads/62afc20ca5bd7cef3e1ab3f4/WgnKQnYIFWkCRSW3JPVAb.png)


## 💻 Usage

<details>

<summary>Setup</summary>

```python
!pip install -qU transformers accelerate bitsandbytes
!huggingface-cli download dfurman/CalmeRys-78B-Orpo-v0.1
```

```python
from transformers import AutoTokenizer, BitsAndBytesConfig
import transformers
import torch


if torch.cuda.get_device_capability()[0] >= 8:
    !pip install -qqq flash-attn
    attn_implementation = "flash_attention_2"
    torch_dtype = torch.bfloat16
else:
    attn_implementation = "eager"
    torch_dtype = torch.float16

# # quantize if necessary
# bnb_config = BitsAndBytesConfig(
#    load_in_4bit=True,
#    bnb_4bit_quant_type="nf4",
#    bnb_4bit_compute_dtype=torch_dtype,
#    bnb_4bit_use_double_quant=True,
# )

model = "dfurman/CalmeRys-78B-Orpo-v0.1"

tokenizer = AutoTokenizer.from_pretrained(model)
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    model_kwargs={
        "torch_dtype": torch_dtype,
        # "quantization_config": bnb_config,
        "device_map": "auto",
        "attn_implementation": attn_implementation,
    }
)
```

</details>

### Example 1

```python
question = "Is the number 9.11 larger than 9.9?"

messages = [
    {"role": "system", "content": "You are a helpful assistant that thinks step by step."},
    {"role": "user", "content": question},
]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
# print("***Prompt:\n", prompt)

outputs = pipeline(
    prompt, max_new_tokens=1000, do_sample=True, temperature=0.7, top_k=50, top_p=0.95
)
print("***Generation:")
print(outputs[0]["generated_text"][len(prompt) :])
```

```
***Generation:
To compare these two numbers, it's important to look at their decimal places after the whole number part, which is 9 in both cases. Comparing the tenths place, 9.11 has a '1' and 9.9 has a '9'. Since '9' is greater than '1', 9.9 is larger than 9.11.
```

### Example 2

```python
question = """The bakers at the Beverly Hills Bakery baked 200 loaves of bread on Monday morning. 
They sold 93 loaves in the morning and 39 loaves in the afternoon. 
A grocery store then returned 6 unsold loaves back to the bakery. 
How many loaves of bread did the bakery have left?
Respond as succinctly as possible. Format the response as a completion of this table:
|step|subquestion|procedure|result|
|:---|:----------|:--------|:-----:|"""


messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": question},
]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
# print("***Prompt:\n", prompt)

outputs = pipeline(prompt, max_new_tokens=1000, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print("***Generation:")
print(outputs[0]["generated_text"][len(prompt):])

```

```
***Generation:
|1|Calculate total sold|Add morning and afternoon sales|132|
|2|Subtract sold from total|200 - 132|68|
|3|Adjust for returns|Add returned loaves to remaining|74|
```

### Example 3

```python
question = "What's a good recipe for a spicy margarita?"

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": question},
]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
# print("***Prompt:\n", prompt)

outputs = pipeline(prompt, max_new_tokens=1000, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print("***Generation:")
print(outputs[0]["generated_text"][len(prompt):])
```

```
***Generation:
To make a Spicy Margarita, you'll need to incorporate a chili or pepper element into your classic margarita recipe. Here’s a simple way to do it:

### Ingredients:
- 2 oz tequila (blanco or reposado)
- 1 oz fresh lime juice
- 1/2 oz triple sec (Cointreau or Grand Marnier)
- 1/2 oz agave syrup or simple syrup
- 1-2 slices of jalapeño (or more depending on how spicy you like it)
- Salt and/or chili powder for rimming the glass
- Ice
- Lime wheel for garnish

### Instructions:
1. **Muddle Jalapeño**: In a shaker, muddle the jalapeño slices slightly. This will release the oils and heat from the peppers.
2. **Add Remaining Ingredients**: Add the tequila, lime juice, triple sec, and agave syrup or simple syrup. 
3. **Shake and Strain**: Fill the shaker with ice and shake vigorously until cold. Strain into a salt and/or chili powder rimmed glass filled with ice.
4. **Garnish and Serve**: Garnish with a lime wheel and enjoy.

If you prefer a smoother spiciness that doesn't overpower the drink, you could also consider making a jalapeño-infused tequila by leaving the jalapeño slices in the bottle of tequila for several hours to a couple of days, adjusting the time based on desired level of spiciness. Then use this infused tequila instead of regular tequila in the recipe above. 

Another variation is to use a spicy syrup. To make this, combine equal parts water and sugar with a few sliced jalapeños in a saucepan. Bring to a boil, stirring occasionally to dissolve the sugar. Reduce heat and simmer for about 5 minutes. Let cool, strain out the jalapeños, then store in a sealed container in the refrigerator until ready to use. Use this spicy syrup instead of regular syrup in the recipe. 

As always, adjust the quantity of jalapeño or the type of chili used to suit your taste. Enjoy responsibly!
```


# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)
Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_dfurman__CalmeRys-78B-Orpo-v0.1)

|      Metric       |Value|
|-------------------|----:|
|Avg.               |50.78|
|IFEval (0-Shot)    |81.63|
|BBH (3-Shot)       |61.92|
|MATH Lvl 5 (4-Shot)|37.92|
|GPQA (0-shot)      |20.02|
|MuSR (0-shot)      |36.37|
|MMLU-PRO (5-shot)  |66.80|