---
language:
- en
license: apache-2.0
tags:
- chat
pipeline_tag: text-generation
model-index:
- name: Qwen2-7B-Instruct-abliterated
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: IFEval (0-Shot)
      type: HuggingFaceH4/ifeval
      args:
        num_few_shot: 0
    metrics:
    - type: inst_level_strict_acc and prompt_level_strict_acc
      value: 58.37
      name: strict accuracy
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=natong19/Qwen2-7B-Instruct-abliterated
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: BBH (3-Shot)
      type: BBH
      args:
        num_few_shot: 3
    metrics:
    - type: acc_norm
      value: 37.75
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=natong19/Qwen2-7B-Instruct-abliterated
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MATH Lvl 5 (4-Shot)
      type: hendrycks/competition_math
      args:
        num_few_shot: 4
    metrics:
    - type: exact_match
      value: 10.27
      name: exact match
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=natong19/Qwen2-7B-Instruct-abliterated
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GPQA (0-shot)
      type: Idavidrein/gpqa
      args:
        num_few_shot: 0
    metrics:
    - type: acc_norm
      value: 6.82
      name: acc_norm
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=natong19/Qwen2-7B-Instruct-abliterated
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MuSR (0-shot)
      type: TAUR-Lab/MuSR
      args:
        num_few_shot: 0
    metrics:
    - type: acc_norm
      value: 8.93
      name: acc_norm
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=natong19/Qwen2-7B-Instruct-abliterated
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU-PRO (5-shot)
      type: TIGER-Lab/MMLU-Pro
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 31.58
      name: accuracy
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=natong19/Qwen2-7B-Instruct-abliterated
      name: Open LLM Leaderboard
---

# Qwen2-7B-Instruct-abliterated

## Introduction

An abliterated version of [Qwen2-7B-Instruct](https://huggingface.co/Qwen/Qwen2-7B-Instruct), produced with [failspy](https://huggingface.co/failspy)'s notebook.

The model's strongest refusal directions have been ablated via weight orthogonalization. The model may nevertheless still refuse a request, misunderstand your intent, or offer unsolicited advice about ethics or safety.

## Quickstart

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "natong19/Qwen2-7B-Instruct-abliterated"

# device_map="auto" places the weights on the available device(s)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_id)

prompt = "Give me a short introduction to large language model."
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": prompt}
]
# Render the conversation with the model's chat template
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
# Move the inputs to the device the model was loaded onto
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

# Passing the full encoding also supplies the attention mask
generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=256
)
# Drop the prompt tokens so that only the newly generated tokens are decoded
generated_ids = [
    output_ids[len(input_ids):]
    for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
```

## Evaluation

Evaluation framework: lm-evaluation-harness 0.4.2

| Datasets | Qwen2-7B-Instruct | Qwen2-7B-Instruct-abliterated |
| :--- | :---: | :---: |
| ARC (25-shot) | 62.5 | 62.5 |
| GSM8K (5-shot) | 73.0 | 72.2 |
| HellaSwag (10-shot) | 81.8 | 81.7 |
| MMLU (5-shot) | 70.7 | 70.5 |
| TruthfulQA (0-shot) | 57.3 | 55.0 |
| Winogrande (5-shot) | 76.2 | 77.4 |

# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)

Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_natong19__Qwen2-7B-Instruct-abliterated).

| Metric |Value|
|-------------------|----:|
|Avg. |25.62|
|IFEval (0-Shot) |58.37|
|BBH (3-Shot) |37.75|
|MATH Lvl 5 (4-Shot)|10.27|
|GPQA (0-shot) | 6.82|
|MuSR (0-shot) | 8.93|
|MMLU-PRO (5-shot) |31.58|
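
The weight orthogonalization mentioned in the Introduction can be illustrated with a small NumPy sketch. This is not the code from failspy's notebook; it only shows the general idea of projecting one direction out of a weight matrix. The function name `orthogonalize`, the toy shapes, and the randomly drawn `refusal_dir` are assumptions made for illustration (in practice the direction would be estimated from model activations, which this card does not describe), and the matrix is assumed to have its rows indexed by the residual-stream dimension.

```python
import numpy as np


def orthogonalize(weight: np.ndarray, direction: np.ndarray) -> np.ndarray:
    """Remove the component of `weight`'s output that lies along `direction`.

    Assumes `weight` has shape (d_model, d_in), i.e. its rows index the
    residual-stream dimension, and `direction` has shape (d_model,).
    """
    unit = direction / np.linalg.norm(direction)
    # W' = W - r r^T W, so that r^T W' = 0 for the unit vector r
    return weight - np.outer(unit, unit @ weight)


# Toy stand-ins: random values, not real model weights or a real refusal direction
rng = np.random.default_rng(0)
d_model, d_in = 8, 4
weight = rng.normal(size=(d_model, d_in))
refusal_dir = rng.normal(size=d_model)

ablated = orthogonalize(weight, refusal_dir)

# After the projection, the matrix can no longer write along the direction
print(np.allclose(refusal_dir @ ablated, 0.0))
```

Because the projection is applied to the weights themselves, no hook or extra computation is needed at inference time, which is why the Quickstart loads the model like any other checkpoint.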