🛠️ All DevTools

Showing 1–20 of 4419 tools

Last Updated
May 01, 2026 at 08:00 AM

[CLI Tool] Show HN: Pu.sh – a full coding-agent harness in 400 lines of shell I originally was just messing with pi-autoresearch. Gave it a sample task to build the most portable coding agent.<p>First cut was 6 KB of shell. Great for one-shots, unusable interactively. I was shocked it actually worked.<p>Started building up -- adding features — but with a self-imposed rule: no new dependencies, and sub 500 LOC. This thing had to be truly portable. Just sh, curl, awk. System primitives only.<p>Which means I did some genuinely disgusting things in awk, including JSON parsing and the OpenAI Responses tool loop with reasoning items carried across turns.<p>It&#x27;s now ~400 lines. In the box: Anthropic + OpenAI, 7 tools (bash, read, write, edit, grep, find, ls), REPL, auto-compaction, checkpoint&#x2F;resume, pipe mode, 90 no-API tests. Not in the box: TUI, streaming, images, OAuth, Windows, dignity.<p>Two honest things:<p>1. I stole&#x2F;modified the system prompt and the architecture. Pi&#x2F;Claude&#x2F;Codex wrote the awk. I cannot read most of this code. This wasn&#x27;t possible for me a year ago.<p>2. Heavily inspired by Pi (pi.dev) — same 7-tool surface, same exact-text edit model. Credit where it&#x27;s due. Pi is awesome -- you should probably use them.<p>The agent loop itself is tiny. Almost everything else in a &quot;real&quot; agent CLI is DX and hardening. You can probably build your own harness exactly how you like it. Mario Zechner&#x27;s AI Engineer talk on taking back control of your tools nudged me here.<p>The name is because it&#x27;s a .sh file. The other thing it sounds like is, regrettably, also accurate.

Found: April 30, 2026 ID: 4417

[Other] Kubereboot/Kured: Kubernetes Reboot Daemon

Found: April 30, 2026 ID: 4416

[Other] Durable queues, streams, pub/sub, and a cron scheduler – inside your SQLite file

Found: April 30, 2026 ID: 4414

[Other] Claude Code refuses requests or charges extra if your commits mention "OpenClaw" <a href="https:&#x2F;&#x2F;xcancel.com&#x2F;theo&#x2F;status&#x2F;2049645973350363168" rel="nofollow">https:&#x2F;&#x2F;xcancel.com&#x2F;theo&#x2F;status&#x2F;2049645973350363168</a>

Found: April 30, 2026 ID: 4418

[Other] PostgreSQL 19 features I'm excited about

Found: April 30, 2026 ID: 4413

browserbase/skills

GitHub Trending

[API/SDK] Claude Agent SDK with a web browsing tool

Found: April 30, 2026 ID: 4409

ghostty-org/ghostty

GitHub Trending

[CLI Tool] 👻 Ghostty is a fast, feature-rich, and cross-platform terminal emulator that uses platform-native UI and GPU acceleration.

Found: April 30, 2026 ID: 4408

1jehuang/jcode

GitHub Trending

[Other] Coding Agent Harness

Found: April 30, 2026 ID: 4407

[Other] Mozilla's Opposition to Chrome's Prompt API

Found: April 30, 2026 ID: 4410

[Other] Mozilla's opposition to Chrome's Prompt API

Found: April 30, 2026 ID: 4419

[Other] Warm Burnout: editor and terminal color scheme

Found: April 30, 2026 ID: 4411

[CLI Tool] Show HN: Agent that refuses to run commands without human approval In light of recent news about an agent deleting a production database, I thought now would be a good time to share this.<p>As the use of AI tools in production is becoming more common, sadly so will the high profile incidents like the one mentioned.<p>Fewshell is a terminal agent specifically designed to avoid this.<p>There is no setting to enable command auto-approval. This is by-design, so that the user never has to second-guess or worry about accidentally having it enabled.<p>Originally my intention was to build an AI mobile terminal to make typing shell commands easy. But with so many mobile-enabled &#x27;claw&#x27; agents being available, I decided to make Fewshell the opposite of an autonomous agent.<p>Please star if you like, let me know what you think. Happy to answer questions.<p>About me: I&#x27;m an ex Amazon Sr. SDE for Alexa AI, and currently am working in AI safety research for agentic RLVR. I use this tool to run and check on my lab experiments.

Found: April 30, 2026 ID: 4405

[Other] Claude.ai and API unavailable [fixed] <a href="https:&#x2F;&#x2F;status.claude.com&#x2F;" rel="nofollow">https:&#x2F;&#x2F;status.claude.com&#x2F;</a>

Found: April 30, 2026 ID: 4404

[Other] I benchmarked Claude Code's caveman plugin against "be brief."

Found: April 29, 2026 ID: 4403

[Other] Show HN: A Multi User Multi Task Board MCP Server I built a simple multi user, multi board, Task&#x2F;Kanban MCP server. I have been looking for something like this to manage development agents, but I wasn&#x27;t seeing anything that felt like what I wanted. So I set down and decided to vibe code an alternative.<p>While it was an experiment at first I have been using it daily for my personal development projects and I really think there are others who might be looking for exactly this. It&#x27;s 100% a WIP, but it is also very usable.<p>I have a demo instance running at <a href="https:&#x2F;&#x2F;mootasks.dev" rel="nofollow">https:&#x2F;&#x2F;mootasks.dev</a>. If you find this interesting I&#x27;d appreciate a star. This is really the first thing I built that I felt would be of interest to others.<p>The readme explains it, but if you have docker you can get this running in a couple minutes. It&#x27;s helped my workflow a lot and I plan on continuing to add features &#x2F; improve it.

Found: April 29, 2026 ID: 4412

[Other] Show HN: AgentPort – Open-source Security Gateway For Agents Hey HN!<p>I&#x27;ve been wanting to use something like OpenClaw for a while but couldn&#x27;t get myself to give it access to anything important due to all the risks involved. Prompt injection is still a problem (even though some people seem to ignore it) and so are hallucinations and mishaps that cause agents to do things like delete production data [1].<p>Even harnesses like Claude Code and Codex are subject to this, particularly since we&#x27;re getting progressively looser about how we run them e.g. Conductor is really popular and runs agents without any sandboxing.<p>That means we&#x27;re in a bit of an all-or-nothing situation. There are people who just ignore the risks and connect everything to their agents and reap benefits from it while being subject to more risk, and there are others that just don&#x27;t connect anything because they are mindful of the potential issues.<p>I&#x27;ve been quite cautious but have wanted to run more autonomous agents and so I built the component I needed to enable me to do so: AgentPort.<p>AgentPort is a gateway that connects to any service (e.g. Gmail, GitHub, Stripe, PostHog, Linear) and let&#x27;s you set granular permissions for what the agent can do automatically, what it needs your approval for, and what it can never do.<p>For example, you can set `list_customers` and `get_customer` on the Stripe integration to &quot;Auto-approve&quot; but `create_refund` to &quot;Ask for approval&quot;. The agent will thus be able to do a lot in the background independently but when it comes to a potentially destructive operation it will be blocked and receive an approval link to send to you. You can then approve or deny the call with those exact parameters e.g. `create_refund(customer_id: 1234, amount: 12)`.<p>Agents connect via MCP or CLI and have access to all the integrations you connected without ever getting API keys. Kind of like Composio but with granular permissions and open source.<p>The goal with AgentPort is to specifically address two vulnerabilities that agents are subject to:<p>1. Destructive operations on downstream services: It can&#x27;t delete a database unless you explicitly approve 2. Credential exfiltration: Your agent never sees API keys<p>AgentPort also helps with sensitive data exfiltration, but that is more nuanced and complicated to defend against if the agent has an internet connection [2].<p>Ultimately, AgentPort was the missing piece for me to start running more autonomous agents that have access to third-party services, and hopefully it can unlock use cases for you too. There&#x27;s a ton more work needed around securing agents (Claws in particular) and I&#x27;ve both been writing about it [3] and intend to do more in this space, so if you&#x27;re thinking about similar things let&#x27;s have a chat.<p>The repo is <a href="https:&#x2F;&#x2F;github.com&#x2F;yakkomajuri&#x2F;agentport" rel="nofollow">https:&#x2F;&#x2F;github.com&#x2F;yakkomajuri&#x2F;agentport</a> and you can run it locally with docker compose in a minute or use the one-liner install to deploy a prod instance (domain, TLS, etc.) in just a few mins as well.<p>[1] &quot;An AI agent deleted our production database. The agent&#x27;s confession is below&quot; (<a href="https:&#x2F;&#x2F;news.ycombinator.com&#x2F;item?id=47911524">https:&#x2F;&#x2F;news.ycombinator.com&#x2F;item?id=47911524</a>)<p>[2] See my post &quot;On agents dropping production databases&quot;: <a href="https:&#x2F;&#x2F;yakko.dev&#x2F;blog&#x2F;on-agents-dropping-production-dbs" rel="nofollow">https:&#x2F;&#x2F;yakko.dev&#x2F;blog&#x2F;on-agents-dropping-production-dbs</a><p>[3] <a href="https:&#x2F;&#x2F;yakko.dev&#x2F;blog" rel="nofollow">https:&#x2F;&#x2F;yakko.dev&#x2F;blog</a>

Found: April 29, 2026 ID: 4401

[Testing] Show HN: A new benchmark for testing LLMs for deterministic outputs When building workflows that rely on LLMs, we commonly use structured output for programmatic use cases like converting an invoice into rows or meeting transcripts into tickets or even complex PDFs into database entries.<p>The model may return the schema you want, but with hallucinated values like `invoice_date` being off by 2 months or the transcript array ordered wrongly. The JSON is valid, but the values are not.<p>Structured output today is a big part of using LLMs, especially when building deterministic workflows.<p>Current structured output benchmarks (e.g., JSONSchemaBench) only validate the pass rate for JSON schema and types, and not the actual values within the produced JSON.<p>So we designed the Structured Output Benchmark (SOB) that fixes this by measuring both the JSON schema pass rate, types, and the value accuracy across all three modalities, text, image, and audio.<p>For our test set, every record is paired with a JSON Schema and a ground-truth answer that was verified against the source context manually by a human and an LLM cross-check, so a missing or hallucinated value will be considered to be wrong.<p>Open source is doing pretty well with GLM 4.7 coming in number 2 right after GPT 5.4.<p>We noticed the rankings shift across modalities: GLM-4.7 leads text, Gemma-4-31B leads images, Gemini-2.5-Flash leads audio.<p>For example, GPT-5.4 ranks 3rd on text but 9th on images.<p>Model size is not a predictor, either: Qwen3.5-35B and GLM-4.7 beat GPT-5 and Claude-Sonnet-4.6 on Value Accuracy. Phi-4 (14B) beats GPT-5 and GPT-5-mini on text.<p>Structured hallucinations are the hardest bug. Such values are type-correct, schema-valid, and plausible, so they slip through most guardrails. For example, in one audio record, the ground truth is &quot;target_market_age&quot;: &quot;15 to 35 years&quot;, and a model returns &quot;25 to 35&quot;. This is invisible without field-level checks.<p>Our goal is to be the best general model for deterministic tasks, and a key aspect of determinism is a controllable and consistent output structure. The first step to making structured output better is to measure it and hold ourselves against the best.

Found: April 29, 2026 ID: 4399

[Other] I built ten custom subagents to tame a 500K-line Clojure codebase

Found: April 29, 2026 ID: 4396

[Other] Letting AI play my game – building an agentic test harness to help play-testing

Found: April 29, 2026 ID: 4400

[Other] Show HN: Adblock-rust Manager – Firefox extension to enable the Brave ad blocker Firefox 149 ships adblock-rust (Brave&#x27;s Rust engine, MPL-2.0) completely disabled with no UI. It&#x27;s controlled by two about:config prefs with no WebExtension API, so you can&#x27;t touch them programmatically from a standard extension.<p>This extension gives it a UI: ETP toggle (via browser.privacy API, instant), filter list manager with clipboard helpers for the manual about:config steps, and 8 preset lists. You can also add your own if you so desire.

Found: April 29, 2026 ID: 4395
Previous Page 1 of 221 Next