Show HN: Serve 100 Large AI models on a single GPU with low impact to TTFT
Show HN (score: 5)

Description
With this project you can hot-swap entire large models (32B) on demand.
It's great for:
Serverless AI Inference
Robotics
On-prem deployments
Local Agents
And it's open source.
Let me know if you'd like to contribute :)
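The post doesn't describe how the hot-swapping works internally, so here is a minimal conceptual sketch of the general idea: keep a bounded pool of resident models and evict the least recently used one when a new model is requested, so time-to-first-token only pays the load cost on a miss. All names (`ModelPool`, the loader, the model names) are hypothetical, not from the project.

```python
from collections import OrderedDict

class ModelPool:
    """Conceptual sketch (not the project's actual code): keep at most
    `capacity` models resident, evicting the least recently used one
    when a newly requested model doesn't fit."""

    def __init__(self, capacity, loader):
        self.capacity = capacity
        self.loader = loader            # callable: name -> loaded model
        self.resident = OrderedDict()   # name -> model, in LRU order

    def get(self, name):
        if name in self.resident:
            self.resident.move_to_end(name)   # mark as most recently used
            return self.resident[name]
        if len(self.resident) >= self.capacity:
            self.resident.popitem(last=False)  # evict LRU; a real server
                                               # would free GPU memory here
        self.resident[name] = self.loader(name)
        return self.resident[name]

# Toy loader standing in for reading weights into GPU memory.
pool = ModelPool(capacity=1, loader=lambda name: f"<{name} weights>")
pool.get("llama-32b")
pool.get("qwen-32b")   # evicts llama-32b in this one-slot pool
```

A real implementation would also have to handle in-flight requests against an evicted model and the cost of paging tens of gigabytes of weights, which is where the "low impact to TTFT" claim does the heavy lifting.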
More from Show HN
Show HN: Matrirc – run irssi in 2026, talk to people on Matrix

This solves no real problem: Element works, there's already a Matrix-to-IRC bridge running on half the FOSS networks, and probably nobody under 30 has opened irssi voluntarily this decade.

I wrote it anyway because I miss Esc-4 and clunky window-split commands.

Matrirc is a local IRC server that speaks Matrix on the back. Point irssi at localhost:6667, log in with your Matrix credentials, and rooms show up as channels.

brew tap pawelb0/tap
brew install matrirc
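The core trick in a bridge like this is mapping Matrix room identifiers onto IRC channel names, which forbid spaces and commas. The helper below is a hypothetical illustration of that mapping, not matrirc's actual code (which is installed via Homebrew, not Python).

```python
import re

def room_to_channel(alias: str) -> str:
    """Illustrative mapping from a Matrix room alias such as
    '#rust:matrix.org' to an IRC-safe channel name.
    matrirc's real mapping may differ."""
    # Drop the leading '#' and the ':homeserver' suffix.
    name = alias.lstrip("#").split(":", 1)[0]
    # IRC channel names cannot contain spaces or commas; replace
    # anything outside a conservative character set.
    name = re.sub(r"[^A-Za-z0-9_-]", "_", name)
    return "#" + name

room_to_channel("#rust:matrix.org")       # a plain alias maps cleanly
room_to_channel("#rust lang:matrix.org")  # spaces become underscores
```

The reverse direction (channel back to room) needs a lookup table, since the sanitisation above is lossy.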