--- language: - en license: other tags: - axolotl - instruct - finetune - chatml - gpt4 - synthetic data - science - physics - chemistry - biology - math - qwen - qwen2 base_model: Weyaxi/Einstein-v7-Qwen2-7B datasets: - allenai/ai2_arc - camel-ai/physics - camel-ai/chemistry - camel-ai/biology - camel-ai/math - metaeval/reclor - openbookqa - mandyyyyii/scibench - derek-thomas/ScienceQA - TIGER-Lab/ScienceEval - jondurbin/airoboros-3.2 - LDJnr/Capybara - Cot-Alpaca-GPT4-From-OpenHermes-2.5 - STEM-AI-mtl/Electrical-engineering - knowrohit07/saraswati-stem - sablo/oasst2_curated - lmsys/lmsys-chat-1m - TIGER-Lab/MathInstruct - bigbio/med_qa - meta-math/MetaMathQA-40K - openbookqa - piqa - metaeval/reclor - derek-thomas/ScienceQA - scibench - sciq - Open-Orca/SlimOrca - migtissera/Synthia-v1.3 - TIGER-Lab/ScienceEval - allenai/WildChat - microsoft/orca-math-word-problems-200k - openchat/openchat_sharegpt4_dataset - teknium/GPTeacher-General-Instruct - m-a-p/CodeFeedback-Filtered-Instruction - totally-not-an-llm/EverythingLM-data-V3 - HuggingFaceH4/no_robots - OpenAssistant/oasst_top1_2023-08-25 - WizardLM/WizardLM_evol_instruct_70k - abacusai/SystemChat-1.1 - H-D-T/Buzz-V1.2 pipeline_tag: text-generation --- # 🔬 Einstein-v7-Qwen2-7B-GGUF This is quantized version of [Weyaxi/Einstein-v7-Qwen2-7B](https://huggingface.co/Weyaxi/Einstein-v7-Qwen2-7B) created using llama.cpp # Model Description ![image/png](https://cdn-uploads.huggingface.co/production/uploads/6468ce47e134d050a58aa89c/KLQP1jK-DIzpwHzYRIH-Q.png) This model is a full fine-tuned version of [Qwen/Qwen2-7B](https://huggingface.co/Qwen/Qwen2-7B) on diverse datasets. This model is finetuned using `8xMI300X` using [axolotl](https://github.com/OpenAccess-AI-Collective/axolotl).
See axolotl config axolotl version: `0.4.0` ```yaml base_model: Qwen/Qwen2-7B model_type: AutoModelForCausalLM tokenizer_type: AutoTokenizer load_in_8bit: false load_in_4bit: false strict: false chat_template: chatml datasets: - path: data/airoboros_3.2_without_contextual_slimorca_orca_sharegpt.json ds_type: json type: sharegpt conversation: chatml - path: data/allenai_wild_chat_gpt4_english_toxic_random_half_4k_sharegpt.json ds_type: json type: sharegpt strict: false conversation: chatml - path: data/buzz_unstacked_chosen_math_removed_filtered.json ds_type: json type: alpaca conversation: chatml - path: data/capybara_sharegpt.json ds_type: json type: sharegpt conversation: chatml - path: data/cot_alpaca_gpt4_extracted_openhermes_2.5_sharegpt.json ds_type: json type: sharegpt conversation: chatml - path: data/everythinglm-data-v3_sharegpt.json ds_type: json type: sharegpt strict: false conversation: chatml - path: data/gpt4_data_lmys_1m_sharegpt.json ds_type: json type: sharegpt conversation: chatml - path: data/gpteacher-instruct-special-alpaca.json ds_type: json type: gpteacher conversation: chatml - path: data/merged_all.json ds_type: json type: alpaca conversation: chatml - path: data/no_robots_sharegpt.json ds_type: json type: sharegpt strict: false conversation: chatml - path: data/oasst_top1_from_fusechatmixture_sharegpt.json ds_type: json type: sharegpt strict: false conversation: chatml - path: data/pippa_bagel_repo_3k_sharegpt.json ds_type: json type: sharegpt conversation: chatml - path: data/rpguild_quarter_alignment_lab_sharegpt.json ds_type: json type: sharegpt conversation: chatml - path: data/sharegpt_gpt4_english.json ds_type: json type: sharegpt conversation: chatml - path: data/slimorca_dedup_filtered_95k_sharegpt.json ds_type: json type: sharegpt conversation: chatml - path: data/soda_diaolog_longest_tenth_buzz_sharegpt.json ds_type: json type: sharegpt conversation: chatml - path: data/synthia-v1.3_sharegpt_12500.json ds_type: json type: sharegpt conversation: chatml - path: data/system_conversations_dolphin_sharegpt.json ds_type: json type: sharegpt conversation: chatml dataset_prepared_path: last_run_prepared val_set_size: 0.002 output_dir: ./Einstein-v7-Qwen2-7B-model sequence_len: 8192 sample_packing: true pad_to_sequence_len: true eval_sample_packing: false wandb_project: Einstein wandb_entity: wandb_watch: wandb_name: wandb_log_model: hub_model_id: Weyaxi/Einstein-v7-Qwen2-7B gradient_accumulation_steps: 4 micro_batch_size: 6 num_epochs: 2 optimizer: paged_adamw_8bit lr_scheduler: cosine learning_rate: 0.00001 # look train_on_inputs: false group_by_length: false bf16: auto fp16: tf32: false gradient_checkpointing: unsloth gradient_checkpointing_kwargs: use_reentrant: true # look early_stopping_patience: resume_from_checkpoint: local_rank: logging_steps: 1 xformers_attention: flash_attention: true warmup_steps: 10 evals_per_epoch: 2 eval_table_size: eval_max_new_tokens: 128 saves_per_epoch: 1 debug: deepspeed: deepspeed_configs/zero3_bf16.json weight_decay: 0.05 fsdp: fsdp_config: special_tokens: eos_token: "<|im_end|>" pad_token: "<|end_of_text|>" tokens: - "<|im_start|>" - "<|im_end|>" ```

# 💬 Prompt Template You can use ChatML prompt template while using the model: ### ChatML ``` <|im_start|>system {system}<|im_end|> <|im_start|>user {user}<|im_end|> <|im_start|>assistant {asistant}<|im_end|> ``` This prompt template is available as a [chat template](https://huggingface.co/docs/transformers/main/chat_templating), which means you can format messages using the `tokenizer.apply_chat_template()` method: ```python messages = [ {"role": "system", "content": "You are helpful AI asistant."}, {"role": "user", "content": "Hello!"} ] gen_input = tokenizer.apply_chat_template(message, return_tensors="pt") model.generate(**gen_input) ``` # 📊 Datasets used in this model The datasets used to train this model are listed in the metadata section of the model card. Please note that certain datasets mentioned in the metadata may have undergone filtering based on various criteria. The results of this filtering process and its outcomes are in a diffrent repository: [Weyaxi/sci-datasets/main](https://huggingface.co/datasets/Weyaxi/sci-datasets/tree/main) # 🎯 [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard) # 🤖 Additional information about training This model is full fine-tuned for 2 epoch. Total number of steps was 500.
Loss graph ![image/png](https://cdn-uploads.huggingface.co/production/uploads/6468ce47e134d050a58aa89c/bkJGgh_JUfKeRlTLo_ZcB.png)