---
datasets:
- nvidia/HelpSteer2
pipeline_tag: text-classification
---

- **Paper:** Coming soon
- **Model:** [URM-LLaMa-3-8B](https://huggingface.co/LxzGordon/URM-LLaMa-3-8B)
- Fine-tuned from [FsfairX-LLaMA3-RM-v0.1](https://huggingface.co/sfairXC/FsfairX-LLaMA3-RM-v0.1)

# Brief

[URM-LLaMa-3-8B](https://huggingface.co/LxzGordon/URM-LLaMa-3-8B) is an uncertainty-aware reward model. It consists of a base model and an uncertainty-aware, attribute-specific value head. The base model is [FsfairX-LLaMA3-RM-v0.1](https://huggingface.co/sfairXC/FsfairX-LLaMA3-RM-v0.1).

## Attribute Regression

**Dataset:** [HelpSteer2](https://huggingface.co/datasets/nvidia/HelpSteer2)

During training, instead of predicting multi-attribute scores directly, the uncertainty-aware value head outputs the parameters of a normal distribution, from which the attribute scores are sampled. We then regress these sampled scores against the labels to train the value head. To allow gradients to propagate through the sampling step, the reparameterization trick is used (see the sketch at the end of this card).

We use the five attributes from HelpSteer2: Helpfulness, Correctness, Coherence, Complexity and Verbosity. These attributes are combined by a weighted sum with the prior weights ```[0.3, 0.74, 0.46, 0.47, -0.33]``` recommended by [Nemotron-4](https://huggingface.co/nvidia/Nemotron-4-340B-Reward).

# Usage

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_name = "LxzGordon/URM-LLaMa-3-8B"
model = AutoModelForSequenceClassification.from_pretrained(
    model_name,
    device_map='auto',
    trust_remote_code=True,
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

prompt = "when were the first Olympic Games held?"
response1 = "April 1896"
response2 = "April 1892"

resp1 = [{"role": "user", "content": prompt}, {"role": "assistant", "content": response1}]
resp2 = [{"role": "user", "content": prompt}, {"role": "assistant", "content": response2}]

# Format the conversations with the chat template, then tokenize
resp1 = tokenizer.apply_chat_template(resp1, tokenize=False)
resp2 = tokenizer.apply_chat_template(resp2, tokenize=False)
resp1 = tokenizer(resp1, return_tensors="pt").to(model.device)
resp2 = tokenizer(resp2, return_tensors="pt").to(model.device)

# The scalar reward is the first logit of the classification head
with torch.no_grad():
    score1 = model(resp1['input_ids'], attention_mask=resp1['attention_mask']).logits[0][0].item()
    score2 = model(resp2['input_ids'], attention_mask=resp2['attention_mask']).logits[0][0].item()
    print(score1, score2)
    # Response 1 score: 3.669522523880005, Response 2 score: 2.5036821365356445
```

# Reference

Coming soon
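
# Value Head Sketch

The snippet below is a minimal, hypothetical sketch of the training idea described in the Attribute Regression section: a value head that predicts a normal distribution per attribute, sampled with the reparameterization trick so gradients flow through the sampling step, with the five attribute predictions folded into one scalar reward using the prior weights above. The class name `UncertaintyAwareValueHead`, the mean/log-std parameterization, and the MSE regression loss are illustrative assumptions, not the exact training code for this model.

```python
import torch
import torch.nn as nn

# Prior weights over [Helpfulness, Correctness, Coherence, Complexity, Verbosity]
# (the Nemotron-4-340B-Reward recommendation quoted above).
PRIOR_WEIGHTS = torch.tensor([0.3, 0.74, 0.46, 0.47, -0.33])

class UncertaintyAwareValueHead(nn.Module):
    """Hypothetical attribute-specific value head: for each of the five
    HelpSteer2 attributes it predicts the mean and log-std of a normal
    distribution over that attribute's score."""

    def __init__(self, hidden_size: int, num_attributes: int = 5):
        super().__init__()
        self.mean = nn.Linear(hidden_size, num_attributes)
        self.log_std = nn.Linear(hidden_size, num_attributes)

    def forward(self, last_hidden: torch.Tensor):
        mu = self.mean(last_hidden)
        std = self.log_std(last_hidden).exp()
        # Reparameterization trick: score = mu + std * eps keeps the
        # sampling step differentiable with respect to mu and std.
        eps = torch.randn_like(std)
        sampled_scores = mu + std * eps
        return mu, std, sampled_scores

# Toy training step (shapes only; the real head sits on top of the
# base model's final hidden state).
hidden_size, batch = 4096, 2
head = UncertaintyAwareValueHead(hidden_size)
last_hidden = torch.randn(batch, hidden_size)    # stand-in for base-model features
labels = torch.rand(batch, 5) * 4                # stand-in HelpSteer2 attribute labels

mu, std, sampled = head(last_hidden)
loss = nn.functional.mse_loss(sampled, labels)   # regression on the sampled scores
loss.backward()                                  # gradients flow through mu and std

# At inference, a single scalar reward can be taken as the weighted sum
# of the predicted attribute means.
reward = (mu.detach() * PRIOR_WEIGHTS).sum(dim=-1)
print(reward)
```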