See our paper at https://huggingface.co/papers/2405.19332.
Shenao Zhang
ZhangShenao
Collections (3)
ZhangShenao/SELM-Llama-3-8B-Instruct-iter-3 • Text Generation • 129 • 4
ZhangShenao/SELM-Llama-3-8B-Instruct-iter-2 • Text Generation • 25
ZhangShenao/SELM-Llama-3-8B-Instruct-iter-1 • Text Generation • 24
Self-Exploring Language Models: Active Preference Elicitation for Online Alignment • Paper 2405.19332 • 15
Papers (1)
Models (11)
ZhangShenao/zephyr-7b-beta-rpo-full
ZhangShenao/SELM-Phi-3-mini-4k-instruct-iter-1 • Text Generation • 13
ZhangShenao/SELM-Phi-3-mini-4k-instruct-iter-2 • Text Generation • 11
ZhangShenao/SELM-Phi-3-mini-4k-instruct-iter-3 • Text Generation • 15 • 1
ZhangShenao/SELM-Llama-3-8B-Instruct-iter-1 • Text Generation • 24
ZhangShenao/SELM-Llama-3-8B-Instruct-iter-2 • Text Generation • 25
ZhangShenao/SELM-Llama-3-8B-Instruct-iter-3 • Text Generation • 129 • 4
ZhangShenao/DPO-Zephyr-7B • Text Generation • 5
ZhangShenao/SELM-Zephyr-7B-iter-1 • Text Generation • 10
ZhangShenao/SELM-Zephyr-7B-iter-2 • Text Generation • 8
Datasets (35)
ZhangShenao/ultrafeedback_binarized_prompts • 61.1k • 25
ZhangShenao/Gemma-relabel-dpo • 122k • 2
ZhangShenao/Gemma-relabel • 122k • 2
ZhangShenao/Qwen-relabel-dpo • 122k • 121
ZhangShenao/gcbinarized_posonly_ultrafeedback • 49.6k • 2
ZhangShenao/gcbinarized_ultrafeedback_nosys • 97.1k • 594
ZhangShenao/gcmode_fine_ultrafeedback • 97.1k • 2
ZhangShenao/gcbinarized_fine_ultrafeedback • 97.1k • 1.22k
ZhangShenao/newbin_ultrafeedback • 124k • 2
ZhangShenao/gc_fine_ultrafeedback_nosys_noinst • 97.1k • 2