Rallio67 committed on
Commit
dcc17ee
1 Parent(s): c018680

Update README.md

Files changed (1)
  1. README.md +1 -2
README.md CHANGED
@@ -16,8 +16,7 @@ https://huggingface.co/alpindale/magnum-72b-v1
 * <h3 style="display: inline;">Release Date:</h3> June 25, 2024
 
 Magnum-72B-v1 quantized to FP8 weights and activations using per-tensor quantization through the [AutoFP8 repository](https://github.com/neuralmagic/AutoFP8), ready for inference with vLLM >= 0.5.0.
-Calibrated with 512 UltraChat samples to achieve 100% performance recovery on the Open LLM Benchmark evaluations.
-Reduces space on disk by ~45%.
+Calibrated with 512 UltraChat samples to achieve better performance recovery.
 Part of the [FP8 LLMs for vLLM collection](https://huggingface.co/collections/neuralmagic/fp8-llms-for-vllm-666742ed2b78b7ac8df13127).
 
 ## Usage and Creation
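
The calibration line touched by this commit refers to per-tensor FP8 quantization with 512 UltraChat calibration samples via AutoFP8. Below is a rough sketch of such a run, following the usage pattern documented in the AutoFP8 repository; the calibration dataset id, chat-template handling, and output directory are assumptions, not the exact recipe used for this checkpoint.

```python
# Rough sketch of per-tensor FP8 quantization with calibration via AutoFP8
# (https://github.com/neuralmagic/AutoFP8). The dataset id, prompt preparation,
# and output directory below are assumptions, not this checkpoint's exact recipe.
from datasets import load_dataset
from transformers import AutoTokenizer
from auto_fp8 import AutoFP8ForCausalLM, BaseQuantizeConfig

model_id = "alpindale/magnum-72b-v1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
tokenizer.pad_token = tokenizer.eos_token

# 512 calibration prompts drawn from an UltraChat dataset (assumed dataset id).
ds = load_dataset("HuggingFaceH4/ultrachat_200k", split="train_sft").select(range(512))
prompts = [tokenizer.apply_chat_template(row["messages"], tokenize=False) for row in ds]
examples = tokenizer(prompts, padding=True, truncation=True, return_tensors="pt").to("cuda")

# Static (per-tensor) activation scales are computed from the calibration batch.
quantize_config = BaseQuantizeConfig(quant_method="fp8", activation_scheme="static")
model = AutoFP8ForCausalLM.from_pretrained(model_id, quantize_config=quantize_config)
model.quantize(examples)
model.save_quantized("Magnum-72B-FP8")  # assumed output directory name
```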
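The card also states the checkpoint is ready for inference with vLLM >= 0.5.0. A minimal offline-inference sketch is shown below; the repository id passed to `LLM` is an assumption and should be replaced with this model's actual id.

```python
# Minimal sketch: offline inference with an FP8-quantized checkpoint in vLLM >= 0.5.0.
# The repository id below is an assumption; substitute the model id from this card.
from vllm import LLM, SamplingParams

llm = LLM(model="neuralmagic/Magnum-72B-FP8")  # assumed repo id
sampling_params = SamplingParams(temperature=0.7, max_tokens=128)

outputs = llm.generate(["Summarize what FP8 quantization does."], sampling_params)
print(outputs[0].outputs[0].text)
```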