@Xenova on Hugging Face: "Introducing Phi-3 WebGPU, a private and powerful AI chatbot that runs 100%…"


Xenova

posted an update May 9


Introducing Phi-3 WebGPU, a private and powerful AI chatbot that runs 100% locally in your browser, powered by 🤗 Transformers.js and onnxruntime-web!

🔒 On-device inference: no data sent to a server
⚡️ WebGPU-accelerated (> 20 t/s)
📥 Model downloaded once and cached
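Since the demo depends on WebGPU, a page would typically check for it before loading the model. The following is a minimal sketch of that check, not the demo's actual code: `pickBackend`, the `NavigatorLike` types, and the idea of falling back to a `"wasm"` backend are all assumptions made for illustration. Only `navigator.gpu` and `requestAdapter()` come from the WebGPU API itself.

```typescript
// Minimal structural types so the sketch compiles without @webgpu/types.
// These names are hypothetical; in a browser the real object is `navigator`.
interface AdapterLike {}
interface GpuLike {
  requestAdapter(): Promise<AdapterLike | null>;
}
interface NavigatorLike {
  gpu?: GpuLike;
}

type Backend = "webgpu" | "wasm";

// Returns "webgpu" when the browser exposes WebGPU and hands out an adapter.
// Otherwise returns "wasm" (assumed fallback, not confirmed by the post).
async function pickBackend(nav: NavigatorLike): Promise<Backend> {
  if (!nav.gpu) {
    return "wasm"; // WebGPU is not exposed at all
  }
  try {
    const adapter = await nav.gpu.requestAdapter();
    return adapter ? "webgpu" : "wasm"; // null means no suitable GPU was found
  } catch {
    return "wasm"; // treat a failed adapter request like a missing GPU
  }
}
```

In a browser this could be called as `pickBackend(navigator as unknown as NavigatorLike)` before deciding which model files to download.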

Try it out: Xenova/experimental-phi3-webgpu

radames

May 9

Amazing!! Shall we make a VB node for this?

Xenova

May 9

The model might be a bit large, but it could be something to try!

osanseviero

May 9

This is so cool!

ucalyptus

May 13

How do you obtain the WASM file? I didn't find it here: https://cdn.jsdelivr.net/npm/@xenova/transformers@2.17.1/dist/

cc: @Xenova

theguru666

Jun 9

This is really cool! Performance is really good. I am running this on Chrome and Chrome unstable on Arch Linux, on a Dell XPS 17 with an RTX 3050 with 4GB RAM. Unfortunately, inference starts super fast, but after a few sentences I get what looks like a Vulkan memory error:

vkAllocateMemory failed with VK_ERROR_OUT_OF_DEVICE_MEMORY

From that point on, the streaming only returns garbage. Will investigate further and see if I can get this running without crashing.
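One possible starting point for investigating this kind of failure is WebGPU's error scopes, which let calling code ask whether an allocation ran out of memory. The sketch below is illustrative only: `withOomCheck` and the `DeviceLike` types are hypothetical names, and it is an assumption that the Vulkan failure above would surface as a WebGPU `"out-of-memory"` error at all, since the allocations here happen inside onnxruntime-web rather than in page code. `pushErrorScope` and `popErrorScope` are the real `GPUDevice` methods.

```typescript
// Minimal structural types so the sketch compiles without @webgpu/types.
// They mirror the parts of GPUDevice used below; the names are hypothetical.
interface GpuErrorLike {
  readonly message: string;
}
interface DeviceLike {
  pushErrorScope(filter: "out-of-memory" | "validation" | "internal"): void;
  popErrorScope(): Promise<GpuErrorLike | null>;
}

interface OomResult<T> {
  value: T;
  oomMessage: string | null; // null when no out-of-memory error was captured
}

// Runs `allocate` inside an "out-of-memory" error scope and reports whether
// the device captured such an error while it ran.
async function withOomCheck<T>(
  device: DeviceLike,
  allocate: () => T,
): Promise<OomResult<T>> {
  device.pushErrorScope("out-of-memory");
  let value: T | undefined;
  let thrown: unknown;
  let failed = false;
  try {
    value = allocate();
  } catch (err) {
    failed = true;
    thrown = err;
  }
  // Always pop the scope so pushes and pops stay balanced.
  const error = await device.popErrorScope();
  if (failed) {
    throw thrown;
  }
  return { value: value as T, oomMessage: error ? error.message : null };
}
```

With a real device, `allocate` would be something like a `device.createBuffer(...)` call; a non-null `oomMessage` would suggest reducing buffer sizes or context length before retrying.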
