---
base_model: aiplanet/effi-7b
inference: false
license: apache-2.0
language:
- en
library_name: transformers
model_type: llama
pipeline_tag: text-generation
tags:
- awq
- llama-2
- text-generation-inference
---

effi 7b AWQ is a quantized version of effi 7b, which is a 7 billion parameter model built by AI Planet. We used AutoAWQ to quantize the model.

## Model Details

### Model Description

The original model (Effi-7B) was fine-tuned on Chain of Thought datasets, which contain context from mixed sources with corresponding rationales. The fine-tuned Large Language Model (LLM) has shown enhanced capabilities for solving novel tasks by providing its reasoning. The final model was then quantized into AWQ format.

- **Developed by:** AI Planet
- **Model type:** Causal decoder-only
- **Language(s) (NLP):** English
- **Quantization type:** AWQ (4-bit)
- **License:** Apache 2.0
- **Quantized from model:** Effi-7b

### Quantization Configuration

- **zero_point:** true
- **q_group_size:** 128
- **w_bit:** 4
- **version:** "GEMM"
- **modules_to_not_convert:** null

### Example of usage

```py
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

quant_path = "aiplanet/effi-7b-awq"

# Load the AWQ-quantized weights onto the GPU
model = AutoModelForCausalLM.from_pretrained(quant_path, device_map='cuda')
tokenizer = AutoTokenizer.from_pretrained(quant_path, trust_remote_code=True)

tst = """
### INSTRUCTION:
Virgin Australia, the trading name of Virgin Australia Airlines Pty Ltd, is an Australian-based airline. It is the largest airline by fleet size to use the Virgin brand. It commenced services on 31 August 2000 as Virgin Blue, with two aircraft on a single route. It suddenly found itself as a major airline in Australia's domestic market after the collapse of Ansett Australia in September 2001. The airline has since grown to directly serve 32 cities in Australia, from hubs in Brisbane, Melbourne and Sydney. Is Virgin Australia and Virgin Blue the same airlines?
"""

system_message = "Given your chain of thought reasoning, provide a rationale for the context in the source."

template = f"""
Context: {system_message}
Human: {tst}
"""

# Tokenize the input and move it to the GPU
input_ids = tokenizer(template, return_tensors="pt", truncation=True).input_ids.cuda()

# Run the model to infer an output
outputs = model.generate(input_ids=input_ids, max_new_tokens=512, top_p=0.9, temperature=0.1, top_k=1, repetition_penalty=1.1)

# Decode the output and print only the text generated after the prompt
print(tokenizer.batch_decode(outputs.detach().cpu().numpy(), skip_special_tokens=True)[0][len(template):])
```

### Framework versions

- **Transformers** 4.37.2
- **AutoAWQ** 0.1.8

### Citation

```
@misc {bhavyaaiplanet,
    author       = { {Bhavya Bhola} },
    title        = { Quantized version of effi-7b by AI Planet},
    year         = 2024,
    url          = { https://huggingface.co/aiplanet/effi-7b-awq },
    publisher    = { Hugging Face }
}
```