---
license: cc-by-nc-4.0
model-index:
- name: cr-model
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: AI2 Reasoning Challenge (25-Shot)
      type: ai2_arc
      config: ARC-Challenge
      split: test
      args:
        num_few_shot: 25
    metrics:
    - type: acc_norm
      value: 57.85
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=TwT-6/cr-model
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: HellaSwag (10-Shot)
      type: hellaswag
      split: validation
      args:
        num_few_shot: 10
    metrics:
    - type: acc_norm
      value: 81.66
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=TwT-6/cr-model
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU (5-Shot)
      type: cais/mmlu
      config: all
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 68.73
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=TwT-6/cr-model
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: TruthfulQA (0-shot)
      type: truthful_qa
      config: multiple_choice
      split: validation
      args:
        num_few_shot: 0
    metrics:
    - type: mc2
      value: 58.2
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=TwT-6/cr-model
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: Winogrande (5-shot)
      type: winogrande
      config: winogrande_xl
      split: validation
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 76.24
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=TwT-6/cr-model
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GSM8k (5-shot)
      type: gsm8k
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 65.88
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=TwT-6/cr-model
      name: Open LLM Leaderboard
---
cr-model is a deep-learning language model designed to understand and generate human-like text. It handles a wide range of language tasks, including answering questions, making recommendations, and casual conversation, and it uses context to assist with more complex language-based tasks.
# How to use
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the model in bfloat16; flash_attention_2 needs the flash-attn package and a supported GPU.
model = AutoModelForCausalLM.from_pretrained(
    'TwT-6/cr-model',
    attn_implementation="flash_attention_2",
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,
    device_map="auto",
).eval()
tokenizer = AutoTokenizer.from_pretrained('TwT-6/cr-model', trust_remote_code=True)

# Wrap the user message in the model's prompt format.
inputs = '你好'  # "Hello"
inputs = f'<|omni_start|>### User:\n{inputs}\n\n### Assistant:\n'
inputs = tokenizer(inputs, return_tensors="pt").to(model.device)

# Generate, then decode only the newly generated tokens (everything after the prompt).
output_ids = model.generate(**inputs)[0].cpu()
output = tokenizer.decode(output_ids[inputs.input_ids.shape[-1]:])
print(output)
# Example output: 你好!很高兴见到你。有什么我可以帮助你的吗
# ("Hello! Nice to meet you. Is there anything I can help you with?")
```
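The prompt wrapping and the prompt-stripping slice in the usage snippet can be factored into small helpers. The following is a minimal sketch, not part of the model's own API: the names `PROMPT_TEMPLATE`, `build_prompt`, and `new_tokens` are hypothetical, and only the single-turn format shown above is assumed.

```python
# Single-turn prompt format copied from the usage snippet.
# PROMPT_TEMPLATE, build_prompt, and new_tokens are hypothetical helper names.
PROMPT_TEMPLATE = "<|omni_start|>### User:\n{message}\n\n### Assistant:\n"


def build_prompt(message: str) -> str:
    """Wrap one user message in the prompt format used in the usage snippet."""
    return PROMPT_TEMPLATE.format(message=message)


def new_tokens(output_ids, prompt_length: int):
    """Drop the first prompt_length ids so that only generated tokens remain.

    Works on any sliceable sequence, such as a list or a 1-D tensor.
    """
    return output_ids[prompt_length:]


if __name__ == "__main__":
    # repr() makes the newline characters in the prompt visible.
    print(repr(build_prompt("Hello")))
```

With these helpers, the usage snippet would tokenize `build_prompt(...)` and pass `new_tokens(output_ids, inputs.input_ids.shape[-1])` to `tokenizer.decode`.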
# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_TwT-6__cr-model)
| Metric |Value|
|---------------------------------|----:|
|Avg. |68.09|
|AI2 Reasoning Challenge (25-Shot)|57.85|
|HellaSwag (10-Shot) |81.66|
|MMLU (5-Shot) |68.73|
|TruthfulQA (0-shot) |58.20|
|Winogrande (5-shot) |76.24|
|GSM8k (5-shot) |65.88|
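
The `Avg.` row is consistent with an unweighted mean of the six benchmark scores in the table. A short check, assuming equal weighting (the variable names are illustrative):

```python
# Scores copied from the table above.
scores = {
    "AI2 Reasoning Challenge (25-Shot)": 57.85,
    "HellaSwag (10-Shot)": 81.66,
    "MMLU (5-Shot)": 68.73,
    "TruthfulQA (0-shot)": 58.20,
    "Winogrande (5-shot)": 76.24,
    "GSM8k (5-shot)": 65.88,
}

# Unweighted mean: 408.56 / 6
average = sum(scores.values()) / len(scores)
print(f"{average:.2f}")  # prints 68.09
```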