🛠️ All DevTools
Showing 41–60 of 4196 tools
Last Updated: April 17, 2026 at 04:00 AM
jj – the CLI for Jujutsu
Hacker News (score: 337) [CLI Tool]
Show HN: A CLI that writes its own integration code
Show HN (score: 5) [CLI Tool]
We run superglue, an OSS agentic integration platform. Last week I talked to a founder of another YC startup. She found a use case for our CLI that we hadn't officially launched yet.

Her problem: customers wanted to create Opps in Salesforce from inside the chat in her app. We kept seeing this pattern: teams build agents whose users can perfectly describe what they want ("pull these three objects from Salesforce and push to nCino when X condition is true"), but translating that into a generalized, hard-coded tool the agent can call is a lot of work, and it does not scale since the logic is different for every user.

What the superglue CLI does: you point it at any API, and your agent gets the ability to reason over that API at runtime. No pre-built tools. The agent reads the spec, plans the calls, and executes them.

The founder using this in production described it like this: she gave the CLI to her agent with an instruction set and told it not to build tools, just run against the API. It handled multi-step Salesforce object creation correctly, including per-user field logic and record-type templates.

Concretely: instead of writing a createSalesforceOpp tool that handles contact -> account -> Opp creation with all the conditional logic, you write a skill doc and let the agent figure out which endpoints to hit and in what order.

The tradeoff: you're giving the agent more autonomy over what API calls it makes. That requires good instructions and some guardrails. But for long-tail, user-specific connectors, it's a lot more practical than building a tool for every case.

Happy to discuss. Curious if others have run into the "pre-defined tool" ceiling with MCP-based connectors and how you've worked around it.

Docs: https://docs.superglue.cloud/getting-started/cli-skills
Repo: https://github.com/superglue-ai/superglue
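The "read the spec, plan the calls" idea above can be sketched in miniature. This is a hypothetical illustration, not superglue's actual code or API: the spec shape and function names are invented, and the real product plans with an agent rather than keyword matching.

```python
# Hypothetical sketch: pick endpoints out of an OpenAPI-style spec to
# cover an ordered goal, instead of shipping a hard-coded tool per case.
# None of these names come from superglue; they only illustrate the idea.

def plan_calls(spec: dict, goal_objects: list[str]) -> list[str]:
    """Return POST endpoints covering each goal object, preserving the
    dependency order the caller asked for (contact -> account -> opp)."""
    paths = spec["paths"]
    plan = []
    for obj in goal_objects:
        for path, ops in paths.items():
            if obj in path and "post" in ops:
                plan.append(path)
                break
    return plan

spec = {
    "paths": {
        "/services/data/sobjects/contact": {"post": {}},
        "/services/data/sobjects/account": {"post": {}},
        "/services/data/sobjects/opportunity": {"post": {}},
    }
}

print(plan_calls(spec, ["contact", "account", "opportunity"]))
```

In the real system an LLM does the planning step; the point is that the plan is derived from the spec at runtime, so per-user conditional logic lives in instructions rather than in code.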
Show HN: A stateful UI runtime for reactive web apps in Go
Show HN (score: 5) [Other]
Doors: server-driven UI framework + runtime for building stateful, reactive web applications in Go.

Some highlights:

* Front-end framework capabilities in server-side Go: reactive state primitives, dynamic routing, composable components.
* No public API layer. No endpoint design needed; a private temporal transport is handled under the hood.
* Unified control flow. No context switch between back end and front end.
* Integrated web stack. Bundle assets, build scripts, serve private files, automate CSP, and ship in one binary.

How it works: the Go server is the UI runtime. The web application runs on a stateful server, while the browser acts as a remote renderer and input layer.

Security model: every user can interact only with what you render to them. That means you check permissions when you render the button, and that's enough to be sure the related action won't be triggered by anyone else.

Mental model: link the DOM to the data it depends on.

Limitations:

* Does not make sense for static non-interactive sites or client-first apps with simple routing, and is not suitable for offline PWAs.
* Load balancing and roll-outs without user interruption require different strategies with a stateful server (mechanics to make this simpler are included).

Where it fits best: apps with heavy user flows and complex business logic. A single execution context and no API/endpoint permission-management burden make it easier.

Peculiarities:

* Purpose-built Go language extension (https://github.com/doors-dev/gox) with its own LSP, parser, and editor plugins. Adds HTML as Go expressions and `elem` primitives.
* Custom concurrency engine that enables non-blocking event processing, parallel rendering, and tree-aware state propagation.
* HTTP/3-ready synchronization protocol (rolling request + streaming, events via regular POST, no WebSockets/SSE).

From the author (me): it took me 1 year and 9 months to get to this stage. I rewrote the framework 6 or 7 times until every part was coherent and every decision felt right or was a reasonable compromise. I am very critical of my own work and I see flaws, but overall it turned out solid; I like the developer experience as a user. The mental model requires a bit of thinking upfront, but pays off with explicit code and a predictable outcome.

Code example (in the gox dialect):

```
type Search struct {
    input doors.Source[string] // reactive state
}

elem (s Search) Main() {
    <input (doors.AInput{
        On: func(ctx context.Context, r doors.RequestInput) bool {
            s.input.Update(ctx, r.Event().Value) // update reactive state
            return false
        },
    }) type="text" placeholder="search">
    ~// subscribe results to state changes
    ~(doors.Sub(s.input, s.results))
}

elem (s Search) results(input string) {
    ~(for _, user := range Users.Search(input) {
        <card> ~(user.Name) </card>
    })
}
```
Show HN: CodeBurn – Analyze Claude Code token usage by task
Hacker News (score: 46) [Other]
Built this after realizing I was spending ~$1400/week on Claude Code with almost no visibility into what was actually consuming tokens.

Tools like ccusage give a cost breakdown per model and per day, but I wanted to understand usage at the task level.

CodeBurn reads the JSONL session transcripts that Claude Code stores locally (~/.claude/projects/) and classifies each turn into 13 categories based on tool-usage patterns (no LLM calls involved).

One surprising result: about 56% of my spend was on conversation turns with no tool usage. Actual coding (edits/writes) was only ~21%.

The interface is an interactive terminal UI built with Ink (React for terminals), with gradient bar charts, responsive panels, and keyboard navigation. There's also a SwiftBar menu bar integration for macOS.

Happy to hear feedback or ideas.
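The classify-by-tool-usage approach can be sketched roughly like this. To be clear, the category names and JSONL field layout below are invented for illustration; CodeBurn's actual 13-category schema and transcript format will differ.

```python
# Hypothetical sketch of bucketing transcript turns by tool usage.
# Field names ("tool_uses", "name") and categories are assumptions,
# not CodeBurn's real schema.
import json
from collections import Counter

def classify_turn(turn: dict) -> str:
    tools = {t["name"] for t in turn.get("tool_uses", [])}
    if not tools:
        return "conversation"       # no tool calls at all
    if tools & {"Edit", "Write"}:
        return "coding"             # actual file edits
    if tools & {"Read", "Grep", "Glob"}:
        return "exploration"        # reading around the codebase
    return "other"

def tally(jsonl_lines: list[str]) -> Counter:
    return Counter(classify_turn(json.loads(line)) for line in jsonl_lines)

session = [
    '{"tool_uses": []}',
    '{"tool_uses": [{"name": "Read"}]}',
    '{"tool_uses": [{"name": "Edit"}]}',
    '{"tool_uses": []}',
]
print(dict(tally(session)))
```

A rule-based pass like this is cheap enough to run over every session, which is why no LLM calls are needed for the breakdown.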
Show HN: OQP – A verification protocol for AI agents
Show HN (score: 5) [API/SDK]
As AI agents autonomously write and deploy code, there's no standard for verifying that what they shipped actually satisfies business requirements. OQP is an attempt to define that standard.

It's MCP-compatible and defines four core endpoints:

- GET /capabilities — what can this agent verify?
- GET /context/workflows — what are the business rules for this workflow?
- POST /verification/execute — run a verification workflow
- POST /verification/assess-risk — what is the risk of this change?

The analogy we keep coming back to: what OpenAPI did for REST APIs, OQP does for agentic software verification.

Early contributors include Philip Lew (XBOSoft) and Benjamin Young (W3C JSON-LD Working Group). Looking for feedback from engineers building on top of MCP, agent orchestration frameworks, or anyone who has felt the pain of "the agent shipped something wrong and we had no way to catch it."

Repo: github.com/OranproAi/open-qa-protocol
N-Day-Bench – Can LLMs find real vulnerabilities in real codebases?
Hacker News (score: 24) [Testing]
N-Day-Bench tests whether frontier LLMs can find known security vulnerabilities in real repository code. Each month it pulls fresh cases from GitHub security advisories, checks out the repo at the last commit before the patch, and gives models a sandboxed bash shell to explore the codebase.

Static vulnerability-discovery benchmarks become outdated quickly. Cases leak into training data, and scores start measuring memorization. The monthly refresh keeps the test set ahead of contamination — or at least makes the contamination window honest.

Each case runs three agents: a Curator reads the advisory and builds an answer key, a Finder (the model under test) gets 24 shell steps to explore the code and write a structured report, and a Judge scores the blinded submission. The Finder never sees the patch. It starts from sink hints and must trace the bug through actual code.

Only repos with 10k+ stars qualify. A diversity pass prevents any single repo from dominating the set. Ambiguous advisories (merge commits, multi-repo references, unresolvable refs) are dropped.

Currently evaluating GPT-5.4, Claude Opus 4.6, Gemini 3.1 Pro, GLM-5.1, and Kimi K2.5. All traces are public.

Methodology: https://ndaybench.winfunc.com/methodology
Live leaderboard: https://ndaybench.winfunc.com/leaderboard
Live traces: https://ndaybench.winfunc.com/traces
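The case-selection rules (10k+ stars, drop ambiguous advisories, diversity cap) compose into a simple filter. This is a hedged sketch of those rules as stated in the post; the field names ("stars", "ambiguous", "repo") and the cap of 2 per repo are assumptions, not the benchmark's actual schema.

```python
# Hypothetical sketch of N-Day-Bench's stated selection rules.
from collections import Counter

def select_cases(advisories: list[dict], max_per_repo: int = 2) -> list[dict]:
    per_repo = Counter()
    selected = []
    for adv in advisories:
        if adv["stars"] < 10_000:
            continue  # only popular repos qualify
        if adv.get("ambiguous"):
            continue  # merge commits, multi-repo refs, unresolvable refs
        if per_repo[adv["repo"]] >= max_per_repo:
            continue  # diversity pass: no single repo dominates
        per_repo[adv["repo"]] += 1
        selected.append(adv)
    return selected

cases = [
    {"repo": "big/app", "stars": 50_000},
    {"repo": "big/app", "stars": 50_000},
    {"repo": "big/app", "stars": 50_000},              # third hit: capped
    {"repo": "tiny/lib", "stars": 900},                # below star floor
    {"repo": "other/svc", "stars": 12_000, "ambiguous": True},
]
print(len(select_cases(cases)))  # 2
```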
GitHub Stacked PRs
Hacker News (score: 349) [Other]
Show HN: Lythonic – Compose Python functions into data-flow pipelines
Show HN (score: 5) [Other]
I'd been thinking about something like this for years, with a few tries before this one. I started this repo last year, and I think I now have something usable.

It's an async framework: mix sync and async Python functions, compose them into DAGs, run them, schedule them, and persist data between steps or let it flow just in memory.

GitHub: https://github.com/walnutgeek/lythonic
Docs: https://walnutgeek.github.io/lythonic/
PyPI: pip install lythonic

It is dataflow, so theoretically you can compose it with pure functions only. Lythonic requires annotations on params and returns to wire outputs up with inputs. All data is saved in SQLite as JSON for now, which works fine for a moderate amount of data.

You can use it as a task flow by keeping params and returns empty and maintaining all data outside of the flow. But practically you may do well with some middle ground: just flow metadata through, enough to make your function calls reproducible and keep a system of record you can query reliably.

Anyway, I will stop rambling... soon.

Python 3.11+, MIT License. Minimal dependencies: Pydantic, PyYAML, Croniter.

Prepping for v0.1; v0.0.14 is out. Looking for feedback. Claude generated reasonable docs (sorry, I would not be able to do better). I am working on a Web UI and a practical end-to-end example app as well.

Thank you. -Sergey
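The "wire outputs to inputs via annotations" idea can be shown in a toy form. This is not Lythonic's API, just the mechanism in miniature: each function's parameter annotations are matched against the types of previously produced values.

```python
# Hypothetical sketch of annotation-driven wiring (not Lythonic's API).
import inspect

def run_pipeline(funcs, seed):
    """Run functions in order, binding each parameter to the most recent
    value whose type matches the parameter's annotation."""
    values = {type(seed): seed}
    result = seed
    for fn in funcs:
        sig = inspect.signature(fn)
        args = [values[p.annotation] for p in sig.parameters.values()]
        result = fn(*args)
        values[sig.return_annotation] = result
    return result

def tokenize(text: str) -> list:
    return text.split()

def count(tokens: list) -> int:
    return len(tokens)

print(run_pipeline([tokenize, count], "a b c"))  # 3
```

A real dataflow engine resolves a DAG rather than a linear list, but the annotation-matching step is the same idea.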
GAIA – Open-source framework for building AI agents that run on local hardware
Hacker News (score: 126) [Other]
Show HN: Ithihāsas – a character explorer for Hindu epics, built in a few hours
Hacker News (score: 126) [Other]
Hi HN!

I've always found it hard to explore the Mahābhārata and Rāmāyaṇa online. Most content is either long-form or scattered, and understanding a character like Karna or Bhishma usually means opening multiple tabs.

I built https://www.ithihasas.in/ to solve that. It is a simple character explorer that lets you navigate the epics through people and their relationships instead of reading everything linearly.

This was also an experiment with Claude CLI. I was able to put together the first version in a couple of hours. It helped a lot with generating structured content and speeding up development, but UX and data consistency still needed manual work.

Would love feedback on the UX and whether this way of exploring mythology works for you.
How to make Firefox builds 17% faster
Hacker News (score: 29) [Other]
Show HN: Mcptube – Karpathy's LLM Wiki idea applied to YouTube videos
Show HN (score: 10) [Other]
I watch a lot of Stanford/Berkeley lectures and YouTube content on AI agents, MCP, and security. I got tired of scrubbing through hour-long videos to find one explanation, so I built v1 of mcptube a few months ago. It performs transcript search and implements Q&A as an MCP server. It got traction (34 stars, my first open-source PR, and some notable stargazers, like the CEO of Trail of Bits).

But v1 re-searched raw chunks from scratch on every query, so I rebuilt it.

v2 (mcptube-vision) follows Karpathy's LLM Wiki pattern. At ingest time, it extracts transcripts, detects scene changes with ffmpeg, describes key frames via a vision model, and writes structured wiki pages. Knowledge compounds across videos rather than being re-discovered. Retrieval is FTS5 plus a two-stage agent (narrow, then reason).

MCPTube works both as a CLI (BYOK) and as an MCP server. I tested it with Claude Code, Claude Desktop, VS Code Copilot, Cursor, and others. Zero API keys needed server-side.

Coming soon: I am also building a SaaS platform that supports playlist ingestion, team wikis, etc. Early-access signup: https://0xchamin.github.io/mcptube/

Happy to discuss architecture tradeoffs: FTS5 vs. vectors, file-based wiki vs. DB, scene-change vs. fixed-interval sampling. Give it a try via `pip install mcptube`, and please star the repo if you enjoy it (https://github.com/0xchamin/mcptube).
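The FTS5 "narrow" stage of that two-stage retrieval can be sketched with SQLite alone. The schema and sample pages below are illustrative, not mcptube's actual layout, and this assumes an SQLite build with the FTS5 extension (standard in modern CPython).

```python
# Hypothetical sketch of the FTS5 narrowing stage over wiki pages.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE wiki USING fts5(title, body)")
conn.executemany(
    "INSERT INTO wiki VALUES (?, ?)",
    [
        ("MCP security", "prompt injection risks in MCP servers"),
        ("Agents 101", "how tool-calling agents plan multi-step work"),
    ],
)

def narrow(query: str, k: int = 5) -> list[str]:
    """Stage 1: full-text narrowing; stage 2 (an LLM reasoning over the
    hits) is omitted here."""
    rows = conn.execute(
        "SELECT title FROM wiki WHERE wiki MATCH ? ORDER BY rank LIMIT ?",
        (query, k),
    )
    return [r[0] for r in rows]

print(narrow("prompt injection"))
```

Compared with a vector index, FTS5 keeps the whole store in one file with zero extra dependencies, at the cost of missing purely semantic matches; the second-stage agent partly compensates for that.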
Show HN: Dbg – One CLI debugger for every language (AI-agent ready)
Show HN (score: 5) [CLI Tool]
AI agents are great at writing code but blind at runtime. They guess, print, and waste tokens.

I built dbg to give them a real debugger experience. It is backend-based, and with the few backends I've implemented so far (still at a basic level) it can support 15+ languages through one simple CLI (some work is still needed, but it is functional as is): LLDB, Delve, PDB, JDB, node inspect, rdbg, phpdbg, GHCi, etc. Profilers too (perf, pprof, cProfile, Valgrind, ...).

I also added GPU profiling via `gdbg` (CUDA, PyTorch, Triton kernels). It auto-dispatches and shares the same unified interface. (I'm planning to bring those advanced concepts back into the main dbg.)

Works with Claude and Codex (it probably works with others, but I didn't try them).

Quick start:

```
curl -sSf https://raw.githubusercontent.com/redknightlois/dbg/main/install.sh | sh
dbg --init claude   # for Claude
```

Then just say: "use dbg to debug the crash in src/foo.rs"

Docs: https://redknightlois.github.io/dbg/
GitHub (MIT licensed): https://github.com/redknightlois/dbg

Would love feedback from anyone building agents. What languages or features are you missing most? Ping me at @federicolois on X or open issues.
Building a CLI for All of Cloudflare
Hacker News (score: 182) [CLI Tool]
Initial mainline video capture and camera support for Rockchip RK3588
Hacker News (score: 19) [Other]
Michigan 'digital age' bills pulled after privacy concerns raised
Hacker News (score: 180) [Other]
Show HN: I built a social media management tool in 3 weeks with Claude and Codex
Hacker News (score: 148) [Other]
Show HN: Equirect – a Rust VR video player
Show HN (score: 7) [Other]
This is almost entirely created by Claude, not me. I know some people aren't into that; I was one of them 3 months ago. Since the beginning of the year I finally started getting more serious about trying out AI. The company I work for also had an AI week with lots of training. All I can say is I'm pretty blown away. My entire life feels like it changed over the last month, from someone who mostly writes code to someone who mostly prompts AI to write code. And just for a tiny bit of context, I'm 60 years old and have been coding since 1980.

I get all the concerns, and I review all AI code at work and most AI code for personal projects. This one in particular, though, not so much. I get that's frowned on, but this is a small, limited-scope, personal project. Not that I didn't pay attention: Claude did do some things in strange ways, and I asked it to fix them quite often. But, conversely, I have zero Rust experience, zero OpenXR experience, zero wgpu experience, and next to zero relevant Windows experience.

I'm guessing I spent about ~30 hours in total prompting Claude through each step. I started with "make a Windows app that opens a window". Then I had it add wgpu and draw hello triangles. Then I had it add OpenXR and draw those triangles in VR. That actually took it some time, as it tried to figure out how to connect a wgpu texture to the surface being drawn in OpenXR. It figured it out, though, far far faster than I would have. I'd have tried to find a working example or given up.

I then sat on that for about a month and finally got back to it this weekend and zoomed through getting Claude to make it work. The only parts I did were some programmer-art icons.

I can post the prompts in the repo if anyone is interested, assuming I can find them.

Also, in the last 2 weeks I've resurrected an old project that had bit-rotted. Claude got it all up to date, fixed a bunch of bugs, and checked off a bunch of features I'd always wanted to add. I also had Claude write 2 libraries, a zip library and a rar decompression library, as well as refactor an existing zip decompression library to use some modern features. It's been really fun! For those I read the code much more than I did for this one. Still, "what a time to be alive"!
Show HN: Rekal – Long-term memory for LLMs in a single SQLite file
Show HN (score: 7) [Database]
I got tired of repeating myself to my LLM every session. rekal is an MCP server that stores memories in SQLite and retrieves them with hybrid search (BM25 + vectors + recency decay). One file, local embeddings, no API keys.
I ran Gemma 4 as a local model in Codex CLI
Hacker News (score: 18) [CLI Tool]