dnhkng committed
Commit b6fc80a
1 Parent(s): 3707c0e

Update README.md

Files changed (1)
  1. README.md +5 -5
README.md CHANGED
@@ -99,15 +99,15 @@ model-index:
 
 This is a new kind of model optimization. It is based on a new method for the analysis of the functional role of layers within the transformer stack, and on layer duplication (self-merging) to increase intelligence.
 
-*No Weights were modified in this process!*
+### No weights were modified in this process!
 
-### Model improvement (%) with layer duplication:
+### Model improvement with layer duplication:
 | | Average | IFEval | BBH | MATH Lvl 5 | GPQA | MUSR | MMLU-PRO |
 |-----------------|---------|--------|------|------------|------|-------|----------|
-| RYS Improvement | 2.61 | -2.05 | 2.51 | 8.16 | 2.58 | 17.72 | 0.31 |
+| RYS Improvement | 2.61% | -2.05% | 2.51% | 8.16% | 2.58% | 17.72% | 0.31% |
 
 
-This model is based on MaziyarPanahi/calme-2.1-qwen2-72b, which was tuned from Qwen2-72B. As this method is orthogonal to fine-tuning, the further finetune from MaziyarPanahi now has the top position:
+This model is based on MaziyarPanahi/calme-2.1-qwen2-72b, which in turn was tuned from Qwen2-72B. As this method is orthogonal to fine-tuning, the further fine-tune from MaziyarPanahi now holds the top position:
 https://huggingface.co/MaziyarPanahi/calme-2.4-rys-78b
 
 
@@ -117,7 +117,7 @@ This research was supported with hardware from the [appliedAI Institute](https:/
 
 ## Quickstart
 
-Here is a code snippet with `apply_chat_template` to show you how to load the tokenizer and model and how to generate contents.
+Here is a code snippet with `apply_chat_template` that shows how to load the tokenizer and model and generate content.
 
 ```python
 from transformers import AutoModelForCausalLM, AutoTokenizer
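The layer duplication (self-merging) the commit describes can be sketched in a few lines: a new, deeper stack is built by concatenating overlapping ranges of the original layers, so the shared range appears twice. The layer labels and the ranges below are hypothetical illustrations, not the actual RYS recipe:

```python
def self_merge(layers, ranges):
    """Concatenate slices of a layer stack; overlapping ranges
    duplicate the shared layers, deepening the model."""
    merged = []
    for start, end in ranges:
        merged.extend(layers[start:end])
    return merged

# Stand-in labels for the 80 transformer blocks of a Qwen2-72B-sized model.
layers = [f"layer_{i:02d}" for i in range(80)]

# Hypothetical overlapping ranges (not the published RYS configuration):
# layers 40-49 appear twice in the merged stack.
merged = self_merge(layers, [(0, 50), (40, 80)])
print(len(merged))               # 90
print(merged.count("layer_45"))  # 2
```

Each duplicated block reuses the original weights unchanged; only the depth of the stack grows, which is consistent with the claim that no weights were modified.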