🛠️ All DevTools

Showing 1–20 of 3722 tools

Last Updated
March 11, 2026 at 04:05 PM

[DevOps] Show HN: Klaus – OpenClaw on a VM, batteries included

We're Bailey and Robbie, and we're working on Klaus (https://klausai.com/): hosted OpenClaw that is secure and powerful out of the box.

Running OpenClaw requires setting up a cloud VM or local container (a pain) or giving OpenClaw root access to your machine (insecure). Many basic integrations (e.g., Slack, Google Workspace) require you to create your own OAuth app.

We make running OpenClaw simple by giving each user their own EC2 instance, preconfigured with keys for OpenRouter, AgentMail, and Orthogonal. And we have OAuth apps to make it easy to integrate with Slack and Google Workspace.

We're both HN readers (Bailey has been on here for ~10 years), and we know OpenClaw has serious security concerns. We do a lot to make our users' instances more secure: we run on a private subnet, we automatically update the OpenClaw version our users run, and because you're on our VM, by default the only keys you leak if you get hacked belong to us. Connecting your email is still a risk. The best defense I know of is Opus 4.6's resilience to prompt injection. If you have a better solution, we'd love to hear it!

We learned a lot about infrastructure management in the past month. Kimi K2.5 and MiniMax M2.5 are extremely good at hallucinating new ways to break openclaw.json and otherwise wreaking havoc on an EC2 instance. The week after our launch we spent 20+ hours fixing broken machines by hand.

We wrote a ton of best practices for using OpenClaw on AWS Linux into our users' AGENTS.md, got really good at un-bricking EC2 machines over SSM, added a command-and-control server to every instance to facilitate hotfixes and migrations, and set up a Klaus instance to answer FAQs on Discord.

On top of all this, we built ClawBert, our AI SRE for hotfixing OpenClaw instances automatically: https://www.youtube.com/watch?v=v65F6VBXqKY. ClawBert is a Claude Code instance that runs whenever a health check fails or the user triggers it in the UI. It can read that user's entries in our database and execute commands on the user's instance. We expose a log of ClawBert's runs to the user.

We know that setting up OpenClaw is easy for most HN readers, but I promise it is not for most people. Klaus has a long way to go, but it's still very rewarding to see people who've never used Claude Code get their first taste of AI agents.

We charge $19/month for a t4g.small, $49/month for a t4g.medium, and $200/month for a t4g.xlarge plus priority support. You get $15 in tokens and $20 in Orthogonal credits one-time.

We want to know what you are building on OpenClaw so we can make sure we support it. We are already working with companies like Orthogonal and OpenRouter that are building things to make agents more useful, and we're sure there are more tools out there we don't know about. If you've built something agents want, please let us know. Comments welcome!
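
The ClawBert mechanics described above (a failed health check triggers an agent run, and each run is logged for the user) can be sketched roughly as follows; every name here, and the remediation step itself, is an invented stand-in, not Klaus's actual code:

```python
from dataclasses import dataclass, field

@dataclass
class Instance:
    name: str
    healthy: bool
    log: list = field(default_factory=list)

def run_clawbert(inst: Instance) -> None:
    # Stand-in for the real run, which would drive a Claude Code
    # session against the instance and record each step it takes.
    inst.log.append(f"clawbert: diagnosing {inst.name}")
    inst.healthy = True  # assume the remediation succeeded
    inst.log.append(f"clawbert: {inst.name} healthy again")

def health_sweep(instances):
    """Run ClawBert on every instance whose health check fails,
    returning the names of instances that needed fixing."""
    fixed = []
    for inst in instances:
        if not inst.healthy:
            run_clawbert(inst)
            fixed.append(inst.name)
    return fixed
```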

Found: March 11, 2026 ID: 3721

[Other] Show HN: Open-source browser for AI agents

Hi HN, I forked Chromium and built agent-browser-protocol (ABP) after noticing that most browser-agent failures aren't really about the model misunderstanding the page. Instead, the problem is that the model is reasoning from a stale state.

ABP is designed to keep the acting agent synchronized with the browser at every step. After each action (click, type, etc.), it freezes JavaScript execution and rendering, then captures the resulting state. It also compiles the notable events that occurred during that action loop, such as navigation, file pickers, permission prompts, alerts, and downloads, and sends that along with a screenshot of the frozen page state back to the agent.

The result is that browser interaction starts to feel more like a multimodal chat loop. The agent takes an action, gets back a fresh visual state and a structured summary of what happened, then decides what to do next from there. That fits much better with how LLMs already work.

A few common browser-use failures ABP helps eliminate:

- A modal appears after the last Playwright screenshot and blocks the input the agent was about to use
- Dynamic filters cause the page to reflow between steps
- An autocomplete dropdown opens and covers the element the agent intended to click
- alert() / confirm() interrupts the flow
- Downloads are triggered, but the agent has no reliable way to know when they've completed

As proof, ABP with Opus 4.6 as the driver scores 90.5% on the Online Mind2Web benchmark. I think modern LLMs already understand websites; they just need a better tool to interact with them.

Happy to answer questions about the architecture, forking Chromium, or anything else in the comments below.

Try it out: `claude mcp add browser -- npx -y agent-browser-protocol --mcp` (Codex/OpenCode instructions in the docs)

Demo video: https://www.loom.com/share/387f6349196f417d8b4b16a5452c3369
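
The act-freeze-observe loop described above can be sketched like this; the class and payload shape are illustrative stand-ins, not ABP's actual API:

```python
from dataclasses import dataclass

@dataclass
class Observation:
    """Hypothetical per-step payload: a screenshot of the frozen page
    plus a structured summary of events from that action loop."""
    screenshot: bytes
    events: list
    done: bool = False

class FrozenBrowser:
    """Stand-in for ABP: executes one action, freezes the page,
    then reports the resulting state back to the agent."""
    def __init__(self):
        self.step = 0
    def act(self, action: str) -> Observation:
        self.step += 1
        return Observation(screenshot=b"...",
                           events=[f"performed:{action}"],
                           done=self.step >= 3)

def agent_loop(browser, plan):
    # Each step reasons from the *fresh* frozen state, never a stale one.
    history = []
    for action in plan:
        obs = browser.act(action)
        history.append((action, obs.events))
        if obs.done:
            break
    return history
```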

Found: March 11, 2026 ID: 3722

[Other] Show HN: Modulus – Cross-repository knowledge orchestration for coding agents

Hello HN, we're Jeet and Husain from Modulus (https://modulus.so), a desktop app that lets you run multiple coding agents with shared project memory. We built it to solve two problems we kept running into:

- Cross-repo context is broken. When working across multiple repositories, agents don't understand the dependencies between them. Even if we open two repos in separate Cursor windows, we still have to manually explain the backend API schema while making changes in the frontend repo.

- Agents lose context. Switching between coding agents often means losing context and repeating the same instructions again.

Modulus shares memory across agents and repositories so they can understand your entire system.

It's an alternative to tools like Conductor for orchestrating AI coding agents to build products, but we focused specifically on multi-repo workflows (e.g., backend repo + client repo + shared library repo + AI agents repo). We built our own Memory and Context Engine from the ground up specifically for coding agents.

Why build another agent orchestration tool? It came from our own problem. While working on our last startup, Husain and I were working across two different repositories. That meant manually pasting API schemas between Cursor windows, telling the frontend agent what the backend API looked like again and again. So we built a small context engine to share knowledge across repos and hooked it up to Cursor via MCP. This later became Modulus.

Soon, Modulus will allow teams to share knowledge with each other to improve their workflows with AI coding agents, enabling team collaboration in the era of AI coding. Our API will allow developers to switch between coding agents or IDEs without losing any context.

If you want to see a quick demo before trying it out, here is our launch post: https://x.com/subhajitsh/status/2024202076293841208

We'd greatly appreciate any feedback you have and hope you get the chance to try out Modulus.
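
The core idea (knowledge recorded in one repo becomes visible to an agent working in another) can be sketched in a few lines; this store is an illustrative toy, not Modulus's actual Memory and Context Engine:

```python
from collections import defaultdict

class SharedContextStore:
    """Toy cross-repo memory: facts recorded per repo, with any agent
    able to pull a view that includes what other repos exposed."""
    def __init__(self):
        self._facts = defaultdict(list)  # repo name -> recorded facts

    def record(self, repo: str, fact: str) -> None:
        self._facts[repo].append(fact)

    def context_for(self, repo: str) -> dict:
        # An agent editing `repo` also sees other repos' knowledge,
        # e.g. the backend's API schema while working on the frontend.
        own = list(self._facts[repo])
        linked = {r: list(f) for r, f in self._facts.items() if r != repo}
        return {"repo": repo, "own": own, "linked": linked}
```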

Found: March 10, 2026 ID: 3712

[Other] Launch HN: RunAnywhere (YC W26) – Faster AI Inference on Apple Silicon

Hi HN, we're Sanchit and Shubham (YC W26). We built MetalRT, a fast inference engine for Apple Silicon. LLMs, speech-to-text, text-to-speech: MetalRT beats llama.cpp, Apple's MLX, Ollama, and sherpa-onnx on every modality we tested. Custom Metal shaders, no framework overhead.

We've also open-sourced RCLI, the fastest end-to-end voice AI pipeline on Apple Silicon. Mic to spoken response, entirely on-device. No cloud, no API keys.

To get started:

    brew tap RunanywhereAI/rcli https://github.com/RunanywhereAI/RCLI.git
    brew install rcli
    rcli setup   # downloads ~1 GB of models
    rcli         # interactive mode with push-to-talk

Or:

    curl -fsSL https://raw.githubusercontent.com/RunanywhereAI/RCLI/main/install.sh | bash

The numbers (M4 Max, 64 GB, reproducible via `rcli bench`):

LLM decode – 1.67x faster than llama.cpp, 1.19x faster than Apple MLX (same model files):
- Qwen3-0.6B: 658 tok/s (vs mlx-lm 552, llama.cpp 295)
- Qwen3-4B: 186 tok/s (vs mlx-lm 170, llama.cpp 87)
- LFM2.5-1.2B: 570 tok/s (vs mlx-lm 509, llama.cpp 372)
- Time-to-first-token: 6.6 ms

STT – 70 seconds of audio transcribed in *101 ms*. That's 714x real-time, and 4.6x faster than mlx-whisper.

TTS – 178 ms synthesis. 2.8x faster than mlx-audio and sherpa-onnx.

We built this because demoing on-device AI is easy but shipping it is brutal. Voice is the hardest test: you're chaining STT, LLM, and TTS sequentially, and if any stage is slow, the user feels it. Most teams fall back to cloud APIs not because local models are bad, but because local inference infrastructure is.

The hard part is latency compounding. In a voice pipeline, you're stacking three models in sequence. If each adds 200 ms, you're at 600 ms before the user hears a word, and that feels broken. You can't optimize one stage and call it done. Every stage needs to be fast, on one device, with no network round-trip to hide behind.

We went straight to Metal. Custom GPU compute shaders, all memory pre-allocated at init (zero allocations during inference), and one unified engine for all three modalities instead of stitching separate runtimes together.

MetalRT is the first engine to handle all three modalities natively on Apple Silicon. Full methodology:

LLM benchmarks: https://www.runanywhere.ai/blog/metalrt-fastest-llm-decode-engine-apple-silicon
Speech benchmarks: https://www.runanywhere.ai/blog/metalrt-speech-fastest-stt-tts-apple-silicon

How: most inference engines add layers between you and the GPU — graph schedulers, runtime dispatchers, memory managers. MetalRT skips all of it. Custom Metal compute shaders for quantized matmul, attention, and activation, compiled ahead of time and dispatched directly.

Voice pipeline optimization details: https://www.runanywhere.ai/blog/fastvoice-on-device-voice-ai-pipeline-apple-silicon
RAG optimizations: https://www.runanywhere.ai/blog/fastvoice-rag-on-device-retrieval-augmented-voice-ai

RCLI is the open-source voice pipeline (MIT) built on MetalRT: three concurrent threads with lock-free ring buffers, double-buffered TTS, 38 macOS actions by voice, local RAG (~4 ms over 5K+ chunks), 20 hot-swappable models, and a full-screen TUI with per-op latency readouts. It falls back to llama.cpp when MetalRT isn't installed.

Source: https://github.com/RunanywhereAI/RCLI (MIT)
Demo: https://www.youtube.com/watch?v=eTYwkgNoaKg

What would you build if on-device AI were genuinely as fast as cloud?
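
The latency-compounding point is simple arithmetic, but worth making concrete. The fast-case figures below are taken from the post (STT on 70 s of audio, LLM time-to-first-token, TTS synthesis); the stage labels are mine:

```python
def pipeline_latency_ms(stages: dict) -> float:
    """Sequential stages compound: the user hears nothing until
    every stage in the chain has produced its first output."""
    return sum(stages.values())

# The post's worst case: three mediocre stages at 200 ms each.
slow = pipeline_latency_ms({"stt": 200.0, "llm": 200.0, "tts": 200.0})

# Figures reported for MetalRT on an M4 Max.
fast = pipeline_latency_ms({"stt": 101.0, "llm_ttft": 6.6, "tts": 178.0})
```

So even with each stage individually "fast", the sum is what the user feels, which is why no single stage can be left slow.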

Found: March 10, 2026 ID: 3707

[Other] Show HN: Agentic Data Analysis with Claude Code

Hey HN, as a former data analyst, I've been tooling around trying to get agents to do my old job. The result is this system that gets you maybe 80% of the way there. I think this is a good data point for what the current frontier models are capable of and where they are still lacking (in this case, hypothesis generation and general data intuition).

Some initial learnings:

- Generating web app-based reports goes much better if there are explicit templates/pre-defined components for the model to use.
- Claude can "heal" broken charts if you give it access to chart images and run a separate QA loop.

Would love either feedback from the community or to hear from others who have tried similar things!
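
The chart-healing QA loop mentioned above fits a generic shape: generate a chart, render it to an image, have a reviewer inspect the image, and regenerate with the feedback until it passes. A minimal sketch, with all function names hypothetical (not the author's code):

```python
def qa_heal_loop(generate, render, inspect, max_rounds=3):
    """Generate-render-inspect loop: `inspect` looks at the rendered
    chart image and returns (ok, feedback); failures feed back into
    the next `generate` call, up to `max_rounds` attempts."""
    feedback = None
    spec = None
    for _ in range(max_rounds):
        spec = generate(feedback)
        image = render(spec)
        ok, feedback = inspect(image)
        if ok:
            return spec
    return spec  # best effort after max_rounds
```

In practice `generate` and `inspect` would be model calls (the latter a vision-capable one); the mocks in testing just simulate one failed round followed by a pass.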

Found: March 10, 2026 ID: 3719

[Other] I built a programming language using Claude Code

Found: March 10, 2026 ID: 3708

[Other] Show HN: A modern React onboarding tour library react-tourlight is a modern React onboarding tour library: zero dependencies, WCAG 2.1 AA accessible, under 5 kB gzipped, and compatible with React 19.

Found: March 10, 2026 ID: 3713

sepinf-inc/IPED

GitHub Trending

[Other] IPED Digital Forensic Tool. It is open-source software that can be used to process and analyze digital evidence, often seized at crime scenes by law enforcement or in corporate investigations by private examiners.

Found: March 10, 2026 ID: 3702

[Other] Show HN: Ash, an Agent Sandbox for Mac

Ash is a macOS sandbox that restricts AI coding agents. It limits access to files, networks, processes, IO devices, and environment variables. You can use Ash with any CLI coding agent by wrapping it in a single command: `ash run -- <agent>`. I typically use it with Claude to stay safe while avoiding repetitive prompts: `ash run -- claude --dangerously-skip-permissions`.

Ash restricts resources via the Endpoint Security and Network Extension frameworks. These frameworks are significantly more powerful than the sandbox-exec tool.

Each session is driven by a policy file. Any out-of-policy action is denied by default. You can audit denials in the GUI app, which lets you view out-of-policy actions and retroactively add them to your policy file.

Ash also comes with tools for building policies. You can use an "observation session" to watch the typical behavior of a coding agent and capture that behavior in a policy file for future sandbox sessions. Linting, formatting, and rule merging are all built into the Ash CLI to keep your policy files concise and maintainable.

Download Ash at https://ashell.dev
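
The deny-by-default model described above is easy to picture as code. This sketch invents its own rule format purely for illustration; it is not Ash's actual policy syntax:

```python
from fnmatch import fnmatch

def is_allowed(policy, action, target):
    """Deny-by-default check: an (action, target) pair is permitted
    only if some rule explicitly matches it; everything else is denied
    (and, in a tool like Ash, surfaced for audit)."""
    for rule in policy:
        if rule["action"] == action and fnmatch(target, rule["target"]):
            return True
    return False

# Hypothetical policy: read project files, talk to one API host.
policy = [
    {"action": "file:read",   "target": "/Users/me/project/*"},
    {"action": "net:connect", "target": "api.anthropic.com"},
]
```

An "observation session" would then amount to recording every denied (action, target) pair and offering to append matching rules to the policy.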

Found: March 10, 2026 ID: 3717

Rebasing in Magit

Hacker News (score: 91)

[Other] Rebasing in Magit

Found: March 10, 2026 ID: 3703

[Other] Show HN: How I topped the HuggingFace open LLM leaderboard on two gaming GPUs

I found that duplicating a specific block of 7 middle layers in Qwen2-72B, without modifying any weights, improved performance across all Open LLM Leaderboard benchmarks and took #1. As of 2026, the top 4 models on that leaderboard are still descendants.

The weird finding: single-layer duplication does nothing. Too few layers, nothing. Too many, and it gets worse. Only circuit-sized blocks of ~7 layers work. This suggests pretraining carves out discrete functional circuits in the layer stack that only work when preserved whole.

The whole thing was developed on 2x RTX 4090s in my basement. I'm now running current models (GLM-4.7, Qwen3.5, MiniMax M2.5) on a dual GH200 rig (see my other post). Code and new models coming soon.

Happy to answer questions.
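
The operation itself is just index arithmetic over the layer stack. A sketch with plain lists standing in for transformer layers; the start index is illustrative, since the post doesn't say which 7 layers were duplicated:

```python
def duplicate_block(layers, start, length):
    """Repeat a contiguous block of layers in place, touching no
    weights: the input runs through layers [start, start+length) twice.
    E.g. start=30, length=7 on an 80-layer stack yields 87 layers."""
    block = layers[start:start + length]
    return layers[:start + length] + block + layers[start + length:]

stack = list(range(80))           # stand-ins for the original layers
merged = duplicate_block(stack, 30, 7)
```

In a real model the same slicing would be applied to the module list (and the layer count in the config updated), which is why the trick fits on consumer GPUs: no training, just a larger forward pass.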

Found: March 10, 2026 ID: 3715

[CLI Tool] Show HN: Smux – Terminal Multiplexer built for AI agents

Found: March 10, 2026 ID: 3710

[Other] Show HN: DD Photos – open-source photo album site generator (Go and SvelteKit)

I was frustrated with photo sharing sites. Apple's iCloud shared albums take 20+ seconds to load, and everything else comes with ads, cumbersome UIs, or social media distractions. I just want to share photos with friends and family: fast, mobile-friendly, distraction-free.

So I built DD Photos. You export photos from whatever you already use (Lightroom, Apple Photos, etc.) into folders, run `photogen` (a Go CLI) to resize them to WebP and generate JSON indexes, then deploy the SvelteKit static site anywhere that serves files: Apache, S3, whatever. No server-side code, no database.

Built over several weeks with heavy use of Claude Code, which I found genuinely useful for this kind of full-stack project spanning Go, SvelteKit/TypeScript, Apache config, Docker, and Playwright tests. Happy to discuss that experience too.

Live example: https://photos.donohoe.info
Repo: https://github.com/dougdonohoe/ddphotos

Found: March 10, 2026 ID: 3709

[Other] Show HN: Local-first firmware analyzer using WebAssembly

Hi HN,

I just wanted to share what I have been working on for the past few months: a firmware analyzer for embedded Linux systems that helps uncover security issues, running entirely in the browser.

This is a very early alpha. It is going to be rough around the edges, but I think it already provides quite a lot of value.

So please go ahead, drop a firmware image (only .tar rootfs archives for now), and try to break it :)

Found: March 10, 2026 ID: 3706

promptfoo/promptfoo

GitHub Trending

[Testing] Test your prompts, agents, and RAGs. AI Red teaming, pentesting, and vulnerability scanning for LLMs. Compare performance of GPT, Claude, Gemini, Llama, and more. Simple declarative configs with command line and CI/CD integration.

Found: March 10, 2026 ID: 3700

[Other] Removing recursion via explicit callstack simulation

Found: March 10, 2026 ID: 3716

[Other] Claude Code, Claude Cowork and Codex #5

Found: March 10, 2026 ID: 3698

[Other] Show HN: I Was Here – Draw on street view, others can find your drawings

Hey HN, I made a site where you can draw on street-level panoramas. Your drawings persist, and other people can see them in real time.

Strokes get projected onto the 3D panorama so they wrap around buildings and follow the geometry, not just a flat overlay. It uses WebGL2 for rendering and Mapillary for the street imagery.

The idea is for it to become a global canvas: anyone can leave a mark anywhere, and others stumble onto it.
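
Projecting a stroke "onto" the panorama rather than storing it as a flat overlay means converting each stroke point's view ray into panorama texture coordinates. One plausible version of that step, assuming an equirectangular panorama (the site's actual math isn't published):

```python
import math

def direction_to_equirect_uv(x, y, z):
    """Map a view-space direction (camera looks down -z, y up) to
    (u, v) texture coordinates on an equirectangular panorama, so a
    stroke point can be stored in panorama space, not screen space."""
    r = math.sqrt(x * x + y * y + z * z)
    lon = math.atan2(x, -z)       # yaw around the viewer
    lat = math.asin(y / r)        # pitch above the horizon
    u = 0.5 + lon / (2 * math.pi)
    v = 0.5 - lat / math.pi
    return u, v
```

Stored this way, a stroke stays attached to the same spot in the scene no matter how later visitors pan the view.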

Found: March 10, 2026 ID: 3701

[Other] The Cost of 'Lightweight' Frameworks: From Tauri to Native Rust

Found: March 09, 2026 ID: 3696

[Other] Oracle is building yesterday's data centers with tomorrow's debt

Found: March 09, 2026 ID: 3695
Page 1 of 187