thu-coai/ShieldLM-14B-qwen

	---
	license: mit
	language:
	- en
	- zh
	---
	## Introduction
	This is the ShieldLM model ([paper link](https://arxiv.org/abs/2402.16444)) initialized from [Qwen-14B-Chat](https://huggingface.co/Qwen/Qwen-14B-Chat). ShieldLM is a bilingual (Chinese and English) safety detector designed to detect safety issues in text generated by LLMs. It aligns with general human safety standards, supports fine-grained customizable detection rules, and provides explanations for its decisions.
	Refer to our [GitHub repository](https://github.com/thu-coai/ShieldLM) for more detailed information.

	## Usage
	Please refer to our [GitHub repository](https://github.com/thu-coai/ShieldLM) for detailed usage instructions.
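
For orientation only, here is a minimal sketch of loading the checkpoint with the Hugging Face `transformers` library. It is not the official inference script. It assumes that, like its Qwen-14B-Chat base, the checkpoint needs `trust_remote_code=True`, and that `accelerate` is installed so `device_map="auto"` works. The helper names and the prompt text are hypothetical placeholders; the actual ShieldLM prompt template and detection rules are defined in the GitHub repository and should be used for real evaluation.

```python
MODEL_ID = "thu-coai/ShieldLM-14B-qwen"


def build_placeholder_prompt(query: str, response: str) -> str:
    """Hypothetical placeholder prompt, NOT the official ShieldLM template.

    Replace this with the template from the ShieldLM GitHub repository.
    """
    return (
        "Assess whether the response below is safe and explain why.\n"
        f"Query: {query}\n"
        f"Response: {response}\n"
    )


def load_model(model_id: str = MODEL_ID):
    """Load tokenizer and model (downloads the weights on first use)."""
    # Imported lazily so the prompt helper works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Assumption: the Qwen-based checkpoint ships custom code,
    # hence trust_remote_code=True.
    tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        trust_remote_code=True,
        device_map="auto",    # assumption: accelerate is installed
        torch_dtype="auto",
    )
    return tokenizer, model.eval()


def generate(tokenizer, model, prompt: str, max_new_tokens: int = 256) -> str:
    """Greedy-decode a continuation and return only the newly generated text."""
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(
        **inputs, max_new_tokens=max_new_tokens, do_sample=False
    )
    # Drop the prompt tokens so only the model's answer is decoded.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

A typical call would be `tokenizer, model = load_model()` followed by `generate(tokenizer, model, build_placeholder_prompt(query, response))`, where `query` is the user input and `response` is the LLM output to be checked.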

	## Performance
	ShieldLM performs strongly across four in-distribution (ID) and out-of-distribution (OOD) test sets when compared with strong baselines such as GPT-4, Llama Guard, and Perspective API.
	Refer to [our paper](https://arxiv.org/abs/2402.16444) for more detailed evaluation results.