A chat, running entirely in your browser.
No server, no API key, no data leaving this tab. The model downloads once, caches in your browser, then runs on your GPU via WebGPU.
Pick a size above. E2B is faster to download and run; E4B is noticeably smarter. Click Load model in the composer, wait for the download to finish, then chat.