Pankaj Mathur committed on
Commit
589c009
1 Parent(s): 15cdc0d

Update README.md

Files changed (1)
  1. README.md +5 -0
README.md CHANGED
@@ -20,6 +20,10 @@ Please note this model has *better code generation capabilities* compared to our
20
 
21
  # Evaluation
22
 
23
  |||||||
24
  |:------:|:-------------:|:---------:|:--------:|:-------:|:--------:|
25
  |**Task**|**num_fewshot**|**Version**|**Metric**|**Value**|**Stderr**|
@@ -30,6 +34,7 @@ Please note this model has *better code generation capabilities* compared to our
30
  |*truthfulqa_mc*|0|1|mc1|0.2938|0.0159|
31
  |*truthfulqa_mc*|0|1|mc2|0.4399|0.0153|
32
 
 
33
  # Dataset
34
 
35
  We used the [remove_refusals.py](https://huggingface.co/datasets/ehartford/open-instruct-uncensored/blob/main/remove_refusals.py) script from https://huggingface.co/ehartford.
 
20
 
21
  # Evaluation
22
 
23
+ I evaluated orca_mini_v2_7b on a wide range of tasks using the [Language Model Evaluation Harness](https://github.com/EleutherAI/lm-evaluation-harness) from EleutherAI.
24
+
25
+ Here are the results; note the num_fewshot value used for each task.
26
+
27
  |||||||
28
  |:------:|:-------------:|:---------:|:--------:|:-------:|:--------:|
29
  |**Task**|**num_fewshot**|**Version**|**Metric**|**Value**|**Stderr**|
 
34
  |*truthfulqa_mc*|0|1|mc1|0.2938|0.0159|
35
  |*truthfulqa_mc*|0|1|mc2|0.4399|0.0153|
36
 
37
+
38
  # Dataset
39
 
40
  We used the [remove_refusals.py](https://huggingface.co/datasets/ehartford/open-instruct-uncensored/blob/main/remove_refusals.py) script from https://huggingface.co/ehartford.
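
The **Value** and **Stderr** columns of the evaluation table can be combined into an approximate confidence interval when comparing models. The following is a minimal sketch, not part of the README or of lm-evaluation-harness: `parse_row` and `confidence_interval` are hypothetical helper names, the interval assumes a normal approximation (value ± 1.96 × stderr), and the only data used are the two `truthfulqa_mc` rows visible in this diff.

```python
# Illustrative only: helper names are hypothetical, not harness APIs.

# Table rows copied verbatim from the README diff.
TABLE_ROWS = [
    "|*truthfulqa_mc*|0|1|mc1|0.2938|0.0159|",
    "|*truthfulqa_mc*|0|1|mc2|0.4399|0.0153|",
]


def parse_row(line):
    """Split one markdown table row into a dict keyed by the table's columns."""
    cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
    task, num_fewshot, version, metric, value, stderr = cells
    return {
        "task": task.strip("*"),  # drop the markdown italics markers
        "num_fewshot": int(num_fewshot),
        "version": int(version),
        "metric": metric,
        "value": float(value),
        "stderr": float(stderr),
    }


def confidence_interval(value, stderr, z=1.96):
    """Normal-approximation interval (assumption): value +/- z * stderr."""
    margin = z * stderr
    return value - margin, value + margin


if __name__ == "__main__":
    for line in TABLE_ROWS:
        row = parse_row(line)
        low, high = confidence_interval(row["value"], row["stderr"])
        print(f"{row['task']} {row['metric']}: "
              f"{row['value']:.4f} (approx. 95% CI {low:.4f} to {high:.4f})")
```

Overlapping intervals between two models on the same task suggest the difference may not be meaningful at this sample size.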