Plans for 200K?

#1
by adamo1139 - opened

Hello,

Thank you for releasing the 32K context version of Yi-1.5-34B.

Do you have any plans for training and releasing Yi-1.5-34B-200K?

01-ai org

Not on our list as of now.

lorinma changed discussion status to closed

@lorinma is there any chance you can talk about the reasons behind that? Maybe something you learned through doing the 200k version for Yi 1?

01-ai org

Simply because we are focusing our current computing resources on bigger, better, and probably not just language models. For instance, our proprietary yi-large model, which we trained over the last couple of months, is in the Chatbot Arena right now. Hopefully, we'll see nice results soon.

We are proud of our original Yi-200k models and are truly flattered to see so many amazing community finetunes like Nous-Capybara-34B and Faro-yi-9b. It's also thrilling to see research works like RULER by Nvidia and Fu Yao's recent note (https://yaofu.notion.site/Full-Stack-Transformer-Inference-Optimization-Season-2-Deploying-Long-Context-Models-ee25d3a77ba14f73b8ae19147f77d5e2). Additionally, the work on llama3-8B with a 1M context length by Gradient AI continues to inspire us.

Better and longer-context LLMs remain an actively studied area. We're taking our time to think this through thoroughly before diving into any new training cycles, as we want to avoid making premature promises to the wonderful Yi community. Rest assured, though, we'll definitely keep you updated on any exciting developments on our end.

Big fan of your original Yi models; looking forward to what's next, thanks. Hope to see your company succeed.

01-ai org

Thank you for your support! Love from the Yi team.
