Show HN: TTSLab – A voice AI agent and TTS lab running in the browser via WebGPU
No API keys, no backend, no data leaves your machine.
When you open the site, you'll hear it immediately: the landing page auto-generates speech from three different sentences right in your browser, no setup required.
You can then try any model yourself: type text, hit generate, hear it instantly. Models download once and get cached locally.
The most experimental feature: a fully in-browser Voice Agent. It chains speech-to-text → LLM → text-to-speech, all running locally on your GPU via WebGPU. You can have a spoken conversation with an AI without a single network request.
Currently supported models:
- TTS: Kokoro 82M, SpeechT5, Piper (VITS)
- STT: Whisper Tiny, Whisper Base
Other features:
- Side-by-side model comparison
- Speed benchmarking on your hardware
- Streaming generation for supported models
Source: https://github.com/MbBrainz/ttslab (MIT)
Feedback I'd especially like:
1. How does performance feel on your hardware?
2. What models should I add next?
3. Did the Voice Agent work for you? That's the most experimental part.
Built on top of ONNX Runtime Web (https://onnxruntime.ai) and Transformers.js. Huge thanks to those communities for making in-browser ML inference possible.