Hello, can you elaborate on conditional behavior cloning and weighted behavior cloning?

#1
by teknium - opened
OpenChat org

What are they? I looked the terms up on Google and found nothing.

If it's RLHF, what differentiates the two methods? Thanks

Thanks for your interest. In short, we simply use different prompts like "Assistant GPT3.5" and "Assistant GPT4". We are preparing a paper to elaborate on our technical report.
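To make the idea of source-specific prompts concrete, here is a minimal sketch of how training samples might be tagged by the model that produced the response. Only the two labels "Assistant GPT3.5" and "Assistant GPT4" come from the reply above; the `User:` turn format, the source keys, and the function names are assumptions for illustration, not the actual OpenChat template.

```python
# Hypothetical mapping from data source to assistant prefix.
# Only the two label strings are taken from the thread; the keys are assumed.
SOURCE_PREFIX = {
    "gpt-3.5": "Assistant GPT3.5:",
    "gpt-4": "Assistant GPT4:",
}


def build_training_text(user_message, response, source):
    """Format one sample so the prompt states which model wrote the response."""
    prefix = SOURCE_PREFIX[source]
    # The "User:" turn marker is an assumed format, not the real template.
    return f"User: {user_message}\n{prefix} {response}"


def build_inference_prompt(user_message, source="gpt-4"):
    """Build a generation prompt conditioned on one source label.

    Defaulting to the GPT-4 label at inference is an assumption of this sketch.
    """
    prefix = SOURCE_PREFIX[source]
    return f"User: {user_message}\n{prefix}"


if __name__ == "__main__":
    print(build_training_text("Hi", "Hello!", "gpt-3.5"))
    print(build_inference_prompt("Hi"))
```

In this sketch, the same fine-tuning loss is applied to every sample; the only difference between sources is the prefix the model sees before the response.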

Would there be a significant performance drop without conditional behavior cloning, i.e., if all 80K samples were trained with a uniform "Assistant:" prompt?

OpenChat org

Yes. With a uniform prompt, performance may be about the same as Vicuna.
