Project InterACT

This model is part of Project InterACT, a multi-model AI system that combines an object detection model with an LLM.

It was built by fine-tuning the Llama-2-7b-chat model on a custom dataset: Jithendra-k/InterACT_LLM.

Points considered when fine-tuning the Llama-2-7b-chat model:
=> Free Google Colab offers a GPU with about 15 GB of VRAM, which is barely enough to hold Llama 2-7b's weights in 16-bit precision (roughly 14 GB).
=> On top of the weights, training also needs memory for optimizer states, gradients, and forward activations.
=> Full fine-tuning is therefore not feasible with these resources, so we turned to parameter-efficient fine-tuning (PEFT) techniques such as LoRA and QLoRA.
=> To drastically reduce VRAM usage, we fine-tuned the model in 4-bit precision, which is why we chose QLoRA.
=> We trained for only 5 epochs, given our compute and time limits and the use of early stopping.
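
The memory reasoning in the points above can be checked with a rough back-of-the-envelope calculation. The sketch below is only an estimate: it assumes a nominal 7 billion parameters, decimal gigabytes, and the commonly quoted figure of about 16 bytes per parameter for full fine-tuning with mixed-precision Adam. It ignores activations and the small LoRA adapters entirely.

```python
# Rough VRAM estimate for a "7B" model. All numbers are approximations.
N_PARAMS = 7e9  # assumption: nominal parameter count, not the exact one
GB = 1e9        # decimal gigabytes


def memory_gb(bytes_per_param: float, n_params: float = N_PARAMS) -> float:
    """Memory needed for n_params values at the given bytes per parameter."""
    return n_params * bytes_per_param / GB


# Weights only, stored in 16-bit precision (2 bytes per parameter).
fp16_weights_gb = memory_gb(2.0)

# Weights only, quantized to 4 bits (0.5 bytes per parameter), as in QLoRA.
four_bit_weights_gb = memory_gb(0.5)

# Full fine-tuning with mixed-precision Adam (assumed breakdown):
# fp16 weights (2) + fp16 gradients (2) + fp32 master weights (4)
# + Adam first moment (4) + Adam second moment (4) = 16 bytes per parameter.
full_finetune_gb = memory_gb(2 + 2 + 4 + 4 + 4)

print(f"16-bit weights:   {fp16_weights_gb:.1f} GB")
print(f"4-bit weights:    {four_bit_weights_gb:.1f} GB")
print(f"Full fine-tuning: {full_finetune_gb:.1f} GB (before activations)")
```

Under these assumptions the 16-bit weights alone nearly fill a 15 GB card, while the 4-bit weights leave room for adapters, gradients, and activations.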

Here are some plots of model performance during training:

Here is an Example Input/Output:

Code to fine-tune the Llama-2-7b-chat model: Google_Colab_file
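
For readers who want a quick picture of what a QLoRA setup generally looks like, here is a minimal sketch using the Hugging Face `transformers`, `peft`, and `bitsandbytes` libraries. It is not the project's notebook code: the LoRA hyperparameters, the quantization options, and the helper name `load_qlora_model` are assumptions chosen for illustration, and the base checkpoint is the gated `meta-llama/Llama-2-7b-chat-hf` repository listed in the credits.

```python
# Illustrative QLoRA setup. The hyperparameters below are assumptions,
# not the values used in the project's notebook.
BASE_MODEL = "meta-llama/Llama-2-7b-chat-hf"  # gated repo; access must be granted

# Assumed 4-bit quantization options (NF4 with double quantization).
QUANT_KWARGS = dict(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
)

# Assumed LoRA adapter hyperparameters (hypothetical values).
LORA_KWARGS = dict(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
)


def load_qlora_model(base_model: str = BASE_MODEL):
    """Load the base model in 4-bit and attach trainable LoRA adapters.

    Hypothetical helper for illustration; requires a CUDA GPU plus the
    torch, transformers, peft, and bitsandbytes packages.
    """
    # Heavy imports stay inside the function so the settings above can be
    # inspected on a machine without the GPU stack installed.
    import torch
    from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
    from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

    # float16 compute is assumed here because many free-tier GPUs lack bfloat16.
    bnb_config = BitsAndBytesConfig(
        bnb_4bit_compute_dtype=torch.float16, **QUANT_KWARGS
    )
    model = AutoModelForCausalLM.from_pretrained(
        base_model, quantization_config=bnb_config, device_map="auto"
    )
    tokenizer = AutoTokenizer.from_pretrained(base_model)

    # Prepare the quantized model for training, then wrap it so that only
    # the LoRA adapter weights receive gradients.
    model = prepare_model_for_kbit_training(model)
    model = get_peft_model(model, LoraConfig(**LORA_KWARGS))
    return model, tokenizer
```

Calling `model, tokenizer = load_qlora_model()` on a GPU runtime returns a model that can be passed to a standard training loop or trainer. The frozen 4-bit base weights keep memory use low, and only the adapters are updated.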

Ethical Considerations and Limitations

Llama 2 is a new technology that carries risks with use. Testing conducted to date has been in English, and has not covered, nor could it cover all scenarios. For these reasons, as with all LLMs, Llama 2’s potential outputs cannot be predicted in advance, and the model may in some instances produce inaccurate, biased or other objectionable responses to user prompts. Therefore, before deploying any applications of Llama 2, developers should perform safety testing and tuning tailored to their specific applications of the model.

Please see the Responsible Use Guide available at https://ai.meta.com/llama/responsible-use-guide/

Reporting Issues

Please report any software “bug,” or other problems with the models through one of the following means:

Reporting issues with the model: github.com/facebookresearch/llama
Reporting problematic content generated by the model: developers.facebook.com/llama_output_feedback
Reporting bugs and security concerns: facebook.com/whitehat/info

Credits and Thanks:

Many thanks to NousResearch/Llama-2-70b-chat-hf and Meta for enabling us to use the Llama-2-70b-chat-hf model.

https://huggingface.co/NousResearch/Llama-2-70b-chat-hf

https://huggingface.co/meta-llama/Llama-2-7b-chat-hf

Hugo Touvron, Thomas Scialom, et al. (2023). Llama 2: Open Foundation and Fine-Tuned Chat Models.

Philipp Schmid, Omar Sanseviero, Pedro Cuenca, & Lewis Tunstall. Llama 2 is here - get it on Hugging Face. https://huggingface.co/blog/llama2

Tim Dettmers, Artidoro Pagnoni, Ari Holtzman, & Luke Zettlemoyer. (2023). QLoRA: Efficient Finetuning of Quantized LLMs.