NeMo
PyTorch
English
nemotron
suhara commited on
Commit
0a62d02
1 Parent(s): d33b3ee

Update README.md

Browse files
Files changed (1) hide show
  1. README.md +2 -2
README.md CHANGED
@@ -9,11 +9,11 @@ library_name: nemo
9
 
10
  ## Model Overview
11
 
12
- Nemotron-Mini-4B-Instruct is a model for generating responses for roleplaying, retrieval augmented generation, and function calling. It is a small language model (SLM) optimized through distillation, pruning and quantization for speed and on-device deployment. It is a fine-tuned version of [nvidia/Minitron-4B-Base](https://huggingface.co/nvidia/Minitron-4B-Base), which was pruned and distilled from [Nemotron-4 15B](https://arxiv.org/abs/2402.16819) using [our compression technique](https://arxiv.org/abs/2407.14679). This instruct model is optimized for roleplay, RAG QA, and function calling in English. It supports a context length of 4,096 tokens. This model is ready for commercial use.
13
 
14
  Try this model on [build.nvidia.com](https://build.nvidia.com/nvidia/nemotron-mini-4b-instruct).
15
 
16
- For more details about how this model is used for [NVIDIA ACE](https://developer.nvidia.com/ace), please refer to [this blog post](https://developer.nvidia.com/blog/deploy-the-first-on-device-small-language-model-for-improved-game-character-roleplay/) and [this demo video](https://www.youtube.com/watch?v=d5z7oIXhVqg) showcases how the SLM works for gaming use cases. You can also download the model checkpoint for NVIDIA AI Inference Manager (AIM) SDK from [NGC](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ucs-ms/resources/nemotron-mini-4b-instruct).
17
 
18
  **Model Developer:** NVIDIA
19
 
 
9
 
10
  ## Model Overview
11
 
12
+ Nemotron-Mini-4B-Instruct is a model for generating responses for roleplaying, retrieval augmented generation, and function calling. It is a small language model (SLM) optimized through distillation, pruning and quantization for speed and on-device deployment. It is a fine-tuned version of [nvidia/Minitron-4B-Base](https://huggingface.co/nvidia/Minitron-4B-Base), which was pruned and distilled from [Nemotron-4 15B](https://arxiv.org/abs/2402.16819) using [our LLM compression technique](https://arxiv.org/abs/2407.14679). This instruct model is optimized for roleplay, RAG QA, and function calling in English. It supports a context length of 4,096 tokens. This model is ready for commercial use.
13
 
14
  Try this model on [build.nvidia.com](https://build.nvidia.com/nvidia/nemotron-mini-4b-instruct).
15
 
16
+ For more details about how this model is used for [NVIDIA ACE](https://developer.nvidia.com/ace), please refer to [this blog post](https://developer.nvidia.com/blog/deploy-the-first-on-device-small-language-model-for-improved-game-character-roleplay/) and [this demo video](https://www.youtube.com/watch?v=d5z7oIXhVqg), which showcases how the model can be integrated into a video game. You can download the model checkpoint for NVIDIA AI Inference Manager (AIM) SDK from [here](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ucs-ms/resources/nemotron-mini-4b-instruct).
17
 
18
  **Model Developer:** NVIDIA
19