🛠️ Hacker News Tools
Showing 441–460 of 2466 tools from Hacker News
Last Updated
April 21, 2026 at 12:00 PM
The wild six weeks for NanoClaw's creator that led to a deal with Docker
Hacker News (score: 27)[Other] The wild six weeks for NanoClaw's creator that led to a deal with Docker
Mouser: An open source alternative to Logi-Plus mouse software
Hacker News (score: 281)[Other] Mouser: An open source alternative to Logi-Plus mouse software I discovered this project because the Logi Options Plus software updater suddenly started taking 40-60% of the CPU on my Intel MacBook Pro until I killed the process (of course, it restarts). In my searches I ended up at a Reddit discussion where I found other people with the same issue.

I'm a minor contributor to this project, which aims to reduce or eliminate the need to use Logitech's proprietary software and its telemetry. We could use help if other people are interested.

Please check out the GitHub link for more detailed motivations (such as eliminating telemetry) behind this project. Here is the link: https://github.com/TomBadash/MouseControl
Show HN: AgentLog – a lightweight event bus for AI agents using JSONL logs
Show HN (score: 6)[Other] Show HN: AgentLog – a lightweight event bus for AI agents using JSONL logs I’ve been experimenting with infrastructure for multi-agent systems.

I built a small project called AgentLog.

The core idea is very simple: topics are just append-only JSONL files.

Agents publish events over HTTP and subscribe to streams using SSE.

The system is intentionally single-node and minimal for now.

Future ideas I’m exploring:
- replayable agent workflows
- tracing reasoning across agents
- visualizing event timelines
- distributed/federated agent logs

Curious if others building agent systems have run into similar needs.
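To make the "topics are append-only JSONL files" idea concrete, here is a minimal sketch of the storage layer only. It is not AgentLog's code: the function names `publish` and `replay`, the `<topic>.jsonl` file layout, and the `start` offset are all assumptions for illustration, and the HTTP and SSE layers from the post are left out.

```python
import json
from pathlib import Path


def _topic_path(root: Path, topic: str) -> Path:
    # Hypothetical layout: one file per topic, named <topic>.jsonl.
    return root / f"{topic}.jsonl"


def publish(root: Path, topic: str, event: dict) -> None:
    """Append one event to the topic file as a single JSON line."""
    with _topic_path(root, topic).open("a", encoding="utf-8") as f:
        f.write(json.dumps(event) + "\n")


def replay(root: Path, topic: str, start: int = 0) -> list[dict]:
    """Return events at line offset `start` or later, in publish order."""
    path = _topic_path(root, topic)
    if not path.exists():
        return []
    with path.open(encoding="utf-8") as f:
        return [json.loads(line) for i, line in enumerate(f) if i >= start]
```

Because the file is only ever appended to, a subscriber that remembers its last line offset can resume with `replay(root, topic, start=offset)`, which is one plausible basis for the replayable workflows listed as a future idea.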
Pandas Exercises for Data Analysis (Interactive)
Hacker News (score: 98)[Other] Pandas Exercises for Data Analysis (Interactive)
Show HN: Context Gateway – Compress agent context before it hits the LLM
Hacker News (score: 32)[Other] Show HN: Context Gateway – Compress agent context before it hits the LLM We built an open-source proxy that sits between coding agents (Claude Code, OpenClaw, etc.) and the LLM, compressing tool outputs before they enter the context window.

Demo: https://www.youtube.com/watch?v=-vFZ6MPrwjw#t=9s

Motivation: Agents are terrible at managing context. A single file read or grep can dump thousands of tokens into the window, most of it noise. This isn't just expensive; it actively degrades quality. Long-context benchmarks consistently show steep accuracy drops as context grows (OpenAI's GPT-5.4 eval goes from 97.2% at 32k to 36.6% at 1M: https://openai.com/index/introducing-gpt-5-4/).

Our solution uses small language models (SLMs): we look at model internals and train classifiers to detect which parts of the context carry the most signal. When a tool returns output, we compress it conditioned on the intent of the tool call, so if the agent called grep looking for error handling patterns, the SLM keeps the relevant matches and strips the rest.

If the model later needs something we removed, it calls expand() to fetch the original output. We also do background compaction at 85% window capacity and lazy-load tool descriptions so the model only sees tools relevant to the current step.

The proxy also gives you spending caps, a dashboard for tracking running and past sessions, and Slack pings when an agent is sitting there waiting on you.

Repo is here: https://github.com/Compresr-ai/Context-Gateway

You can try it with:

    curl -fsSL https://compresr.ai/api/install | sh

Happy to go deep on any of it: the compression model, how the lazy tool loading works, or anything else about the gateway. Try it out and let us know how you like it!
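The compress-then-expand flow described in this post can be sketched in a few lines. This is only an illustration under loud assumptions: a plain keyword match stands in for the trained SLM classifier, and `compress`, `should_compact`, the `call_id` argument, and the in-memory store are hypothetical names. Only `expand()` and the 85% threshold come from the post.

```python
# Hypothetical in-memory store of uncompressed tool outputs, keyed by call id.
_originals: dict[str, str] = {}


def compress(call_id: str, output: str, intent: str) -> str:
    """Keep lines that mention a word from the intent; stash the full output.

    The keyword match is a stand-in for the SLM classifier, not the real method.
    """
    _originals[call_id] = output
    keywords = [word.lower() for word in intent.split()]
    lines = output.splitlines()
    kept = [ln for ln in lines if any(k in ln.lower() for k in keywords)]
    note = f"[{len(lines) - len(kept)} lines omitted; expand({call_id!r}) returns the original]"
    return "\n".join(kept + [note])


def expand(call_id: str) -> str:
    """Return the original, uncompressed output for an earlier tool call."""
    return _originals[call_id]


def should_compact(used_tokens: int, window_tokens: int, threshold: float = 0.85) -> bool:
    """True once the context window is at or past the compaction threshold."""
    return used_tokens >= threshold * window_tokens
```

The trailing note line matters in this design: it tells the model that something was removed and how to get it back, so compression stays lossless from the agent's point of view.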
Pyodide: a Python distribution based on WebAssembly
Hacker News (score: 12)[Other] Pyodide: a Python distribution based on WebAssembly
Launch HN: Captain (YC W26) – Automated RAG for Files
Hacker News (score: 38)[Other] Launch HN: Captain (YC W26) – Automated RAG for Files Hi HN, we’re Lewis and Edgar, building Captain to simplify unstructured data search (https://runcaptain.com). Captain automates the building and maintenance of file-based RAG pipelines. It indexes cloud storage like S3 and GCS, plus SaaS sources like Google Drive. There’s a quick walkthrough at https://youtu.be/EIQkwAsIPmc.

We also put up this demo site called “Ask PG’s Essays” which lets you ask/search the corpus of pg’s essays, to get a feel for how it works: https://pg.runcaptain.com. The RAG part of this took Captain about 3 minutes to set up.

Here are some sample prompts to get a feel for the experience:

“When do we do things that don't scale? When should we be more cautious?” https://pg.runcaptain.com/?q=When%20do%20we%20do%20things%20that%20don't%20scale%3F%20When%20should%20we%20be%20more%20cautious%3F

“Give me some advice, I'm fundraising” https://pg.runcaptain.com/?q=Give%20me%20some%20advice%2C%20I'm%20fundraising

“What are the biggest advantages of Lisp” https://pg.runcaptain.com/?q=what%20are%20the%20biggest%20advantages%20of%20Lisp

A good production RAG pipeline takes substantial effort to build, especially for file workloads. You have to handle ETL or text extraction, chunking, embedding, storage, search, re-ranking, inference, and often compliance and observability, all while optimizing for latency and reliability. It’s a lot to manage. grep works well in some cases, but for agents, semantic search provides significantly higher performance.
Cursor uses both and reports 6.5%–23.5% accuracy gains from vector search over grep (https://cursor.com/blog/semsearch).

We’ve spent the past four years scaling RAG pipelines for companies, and Edgar’s work at Purdue’s NLP lab directly informed our chunking techniques. In conversations with dozens of engineers, we repeatedly saw DIY pipelines produce inconsistent results, even after weeks of tuning. Many teams lacked clarity on which retrieval strategies best fit their data.

We realized that a system to provision storage and embeddings, handle indexing, and continuously update pipelines to reflect the latest search techniques could remove the need for every team to rebuild RAG themselves. That idea became Captain.

In practice, one API call indexes URLs, cloud storage buckets, directories, or individual files. Under the hood, we’re converting everything to Markdown. For this, we’ve had good results with Gemini 3 Pro for images, Reducto for complex documents, and Extend for basic OCR. For embedding models, ‘gemini-embedding-001’ performed reasonably well at first, but we later switched to the Contextualized Embeddings from ‘voyage-context-3’. It produced more relevant results than even the newer Voyage 4 models because its chunk embeddings are encoded with awareness of the surrounding document context. We then applied Voyage’s ‘rerank-2.5’ as second-stage re-ranking, reducing 50 initial chunks to a final top 15 (configurable in Captain’s API). Dense embeddings are just half the picture; full-text search with RRF completes our hybrid retrieval. In the Captain API, these techniques are exposed through a single /query endpoint. Access controls can be configured via metadata filters, and page number citations are returned automatically.

The stack is constantly changing, but the Captain API creates a standard interface for this.
You can try Captain free for one month and build your own pipelines at https://runcaptain.com. We’re looking for candid feedback, especially anything that can make it more useful, and look forward to your comments!
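For readers unfamiliar with RRF (reciprocal rank fusion), the hybrid step mentioned in this post can be sketched as follows. This is the textbook form of the technique, not Captain's implementation: the function name `rrf`, the chunk ids, and the smoothing constant `k = 60` (a commonly used default) are assumptions. The `top_n = 15` default only echoes the final chunk count quoted in the post.

```python
from collections import defaultdict


def rrf(rankings: list[list[str]], k: int = 60, top_n: int = 15) -> list[str]:
    """Fuse several ranked lists of chunk ids with reciprocal rank fusion.

    Each list contributes 1 / (k + rank) per chunk, with rank starting at 1.
    Ties are broken alphabetically so the output is deterministic.
    """
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda c: (-scores[c], c))[:top_n]


# Hypothetical results from a dense (embedding) search and a full-text search.
dense = ["a", "b", "c"]
lexical = ["b", "d", "a"]
fused = rrf([dense, lexical])
```

RRF only needs ranks, not raw scores, which is why it is a convenient way to combine embedding similarity with full-text relevance scores that live on different scales.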
Show HN: Mesa – A collaborative canvas IDE built for agent-first development
Show HN (score: 14)[IDE/Editor] Show HN: Mesa – A collaborative canvas IDE built for agent-first development Hi HN - I'm Ryan, a product designer who codes, and I built Mesa. Current IDEs feel wrong for the type of development being done now - the focus is still on files.

Mesa puts the focus on the full workflow: your agent, terminal, browser, and files all live as equal nodes on a canvas with full multiplayer support (think Figma, but for code).

I was tired of the overhead of switching windows, tabs, and terminals across multiple projects. Inspired by TouchDesigner and Factorio, I wanted something more fluid and visual. I've been using it as a total replacement for Cursor at work every day now. Being able to see multiple repos at once and control agents on each without navigating windows has freed up my headspace and increased productivity.

It's free to try, and I would love to know what you think!
Run NanoClaw in Docker Sandboxes
Hacker News (score: 134)[DevOps] Run NanoClaw in Docker Sandboxes
Show HN: Oxyde – Pydantic-native async ORM with a Rust core
Hacker News (score: 40)[Database] Show HN: Oxyde – Pydantic-native async ORM with a Rust core Hi HN! I built Oxyde because I was tired of duplicating my models.

If you use FastAPI, you know the drill. You define Pydantic models for your API, then define separate ORM models for your database, then write converters between them. SQLModel tries to fix this but it's still SQLAlchemy underneath. Tortoise gives you a nice Django-style API but its own model system. Django ORM is great but welded to the framework.

I wanted something simple: your Pydantic model IS your database model. One class, full validation on input and output, native type hints, zero duplication. The query API is Django-style (.objects.filter(), .exclude(), Q/F expressions) because I think it's one of the best designs out there.

Explicit over implicit. I tried to remove all the magic. Queries don't touch the database until you call a terminal method like .all(), .get(), or .first(). If you don't explicitly call .join() or .prefetch(), related data won't be loaded. No lazy loading, no surprise N+1 queries behind your back. You see exactly what hits the database by reading the code.

Type safety was a big motivation. Python's weak spot is runtime surprises, so Oxyde tackles this on three levels: (1) when you run makemigrations, it also generates .pyi stub files with fully typed queries, so your IDE knows that filter(age__gte=...) takes an int, that create() accepts exactly the fields your model has, and that .all() returns list[User] not list[Any]; (2) Pydantic validates data going into the database; (3) Pydantic validates data coming back out via model_validate(). You get autocompletion, red squiggles on typos, and runtime guarantees, all from the same model definition.

Why Rust? Not for speed as a goal. I don't do "language X is better" debates. Each one is good at what it was made for. Python is hard to beat for expressing business logic.
But infrastructure stuff like SQL generation, connection pooling, and row serialization is where a systems language makes sense. So I split it: Python handles your models and business logic, Rust handles the database plumbing. Queries are built as an IR in Python, serialized via MessagePack, sent to Rust which generates dialect-specific SQL, executes it, and streams results back. Speed is a side effect of this split, not the goal. But since you're not paying a performance tax for the convenience, here are the benchmarks if curious: https://oxyde.fatalyst.dev/latest/advanced/benchmarks/

What's there today: Django-style migrations (makemigrations / migrate), transactions with savepoints, joins and prefetch, PostgreSQL + SQLite + MySQL, FastAPI integration, and an auto-generated admin panel that works with FastAPI, Litestar, Sanic, Quart, and Falcon (https://github.com/mr-fatalyst/oxyde-admin).

It's v0.5, beta, active development, API might still change. This is my attempt to build the ORM I personally wanted to use. Would love feedback, criticism, ideas.

Docs: https://oxyde.fatalyst.dev/

Step-by-step FastAPI tutorial (blog API from scratch): https://github.com/mr-fatalyst/fastapi-oxyde-example
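The "queries don't touch the database until you call a terminal method" behaviour can be illustrated without any ORM at all. The sketch below is not Oxyde's API: `FakeTable`, `LazyQuery`, and `_matches` are hypothetical names, the table is an in-memory list, and only the `age__gte` style lookup is borrowed from the post. Chained `.filter()` calls merely accumulate conditions (a tiny stand-in for the query IR), and `.all()` is the only place data is read.

```python
class FakeTable:
    """In-memory stand-in for a database table that counts how often it is read."""

    def __init__(self, rows: list[dict]):
        self.rows = rows
        self.reads = 0

    def scan(self) -> list[dict]:
        self.reads += 1
        return list(self.rows)


def _matches(row: dict, lookup: str, value) -> bool:
    # Supports "field" (equality) and "field__gte" lookups only.
    name, _, op = lookup.partition("__")
    if op == "gte":
        return row[name] >= value
    return row[name] == value


class LazyQuery:
    """Hypothetical query object: builds up filters, executes only on .all()."""

    def __init__(self, table: FakeTable, filters: tuple = ()):
        self._table = table
        self._filters = tuple(filters)

    def filter(self, **lookups) -> "LazyQuery":
        # Chaining returns a new query; nothing is read yet.
        return LazyQuery(self._table, self._filters + tuple(lookups.items()))

    def all(self) -> list[dict]:
        # Terminal method: the only place the table is touched.
        return [
            row for row in self._table.scan()
            if all(_matches(row, k, v) for k, v in self._filters)
        ]
```

Reading the `reads` counter before and after `.all()` shows the property the post describes: building the query costs nothing, and every database hit is visible in the calling code.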
Launch HN: Spine Swarm (YC S23) – AI agents that collaborate on a visual canvas
Hacker News (score: 72)[Other] Launch HN: Spine Swarm (YC S23) – AI agents that collaborate on a visual canvas Hey HN! We're Ashwin and Akshay from Spine AI (https://www.getspine.ai). Spine Swarm is a multi-agent system that works on an infinite visual canvas to complete complex non-coding projects: competitive analysis, financial modeling, SEO audits, pitch decks, interactive prototypes, and more. Here's a video of it in action: https://www.youtube.com/watch?v=R_2-ggpZz0Q.

We've been friends for over 13 years. We took our first ML course together at NTU, in a part of campus called North Spine, which is where the name comes from. We went through YC in S23 and have spent about 3 years building Spine across many product iterations.

The core idea: chat is the wrong interface for complex AI work. It's a linear thread, and real projects aren't linear. Sure, you can ask a chatbot to reference the financial model from earlier in the thread, or run research and market sizing together, but you're trusting the model to juggle that context implicitly. There's no way to see how it's connecting the pieces, no way to correct one step without rerunning everything, and no way to branch off and explore two strategies side by side. ChatGPT was a demo that blew up, and chat stuck around as the default interface because of that momentum, not because it's the right abstraction. We thought humans and agents needed a real workspace where the structure of the work is explicit and user-controllable, not hidden inside a context window.

So we built an infinite visual canvas where you think in blocks instead of threads. Each block is our abstraction on top of AI models. There are dedicated block types for LLM calls, image generation, web browsing, apps, slides, spreadsheets, and more.
Think of them as Lego bricks for AI workflows: each one does something specific, but they can be snapped together and composed in many different ways. You can connect any block to any other block, and that connection guarantees the passing of context regardless of block type. The whole system is model-agnostic, so in a single workflow you can go from an OpenAI LLM call, to an image generation model like Nano Banana Pro, to Claude generating an interactive app, each block using whatever model fits best. Multiple blocks can fan out from the same input, analyzing it in different ways with different models, then feed their outputs into a downstream block that synthesizes the results.

The first version of the canvas was fully manual. Users entered prompts, chose models, ran blocks, and made connections themselves. It clicked with founders and product managers because they could branch in different directions from the same starting point: take a product idea and generate a prototype in one branch, a PRD in another, a competitive critique in a third, and a pitch deck in a fourth, all sharing the same upstream context. But new users didn't want to learn the interface. They kept asking us to build a chat layer that would generate and connect blocks on their behalf, to replicate the way we were using the tool. So we built that, and in doing so discovered something we didn't expect: the agents were capable of running autonomously for hours, producing complete deliverables. It turned out agents could run longer and keep their context windows clean by delegating work to blocks and storing intermediary context on the canvas, rather than holding everything in a single context window.

Here's how it works now. When you submit a task, a central orchestrator decomposes it into subtasks and delegates each to specialized persona agents. These agents operate on the canvas blocks and can override default settings, primarily the model and prompt, to fit each subtask.
Agents pick the best model for each block and sometimes run the same block with multiple models to compare and synthesize outputs. Multiple agents work in parallel when their subtasks don't have dependencies, and downstream agents automatically receive context from upstream work. The user doesn't configure any of this. You can also dispatch multiple tasks at once and the system will queue dependent ones or start independent ones immediately.

Agents aren't fully autonomous by default. Any agent can pause execution and ask the user for clarification or feedback before continuing, which keeps the human in the loop where it matters. And once agents have produced output, you can select a subset of blocks on the canvas and iterate on them through the chat without rerunning the entire workflow.

The canvas gives agents something that filesystems and message-passing don't: a persistent, structured representation of the entire project that any agent can read and contribute to at any point. In typical multi-agent systems, context degrades as it passes between agents. The canvas addresses this because agents store intermediary results in blocks rather than trying to hold everything in memory, and they leave explicit structured handoffs designed to be consumed efficiently by the next agent in the chain. Every step is also fully auditable, so you can trace exactly how each agent arrived at its conclusions.

We ran benchmarks to validate what we were seeing. On Google DeepMind's DeepSearchQA, which is 900 questions spanning 17 fields, each structured as a causal chain where each step depends on completing the previous one, Spine Swarm scored 87.6% on the full dataset with zero human intervention. For the benchmark we used a subset of block types relevant to the questions (LLM calls, web browsing, table) and removed irrelevant ones like document, spreadsheet, and slide generation. We also disabled human clarification so agents ran fully independently.
The agents were not just auditable but also state of the art. The auditability also exposed actual errors in an older benchmark (GAIA Level 3), cases where the expected answer was wrong or ambiguous, which you'd never catch with a black-box pipeline. We detail the methodology, architecture, and benchmark errors in the full writeup: https://blog.getspine.ai/spine-swarm-hits-1-on-gaia-level-3-and-google-deepmind-deepsearchqa

Benchmarks measure accuracy on closed-ended questions. It turns out the same architecture also leads to better open-ended outputs like decks, reports, and prototypes with minimal supervision. We've seen early users split into two camps: some watch the agents work and jump in to redirect mid-flow, others queue a task and come back to a finished deliverable. Both work because the canvas preserves the full chain of work, so you can audit or intervene whenever you want.

A good first task to try: give it your website URL and ask for a full SEO analysis, competitive landscape, and a prioritized growth roadmap with a slide deck. You'll see multiple agents spin up on the canvas simultaneously. People have also used it for fundraising pitch decks with financial models, prototyping features from screenshots and PRDs, competitive analysis reports, and deep-dive learning plans that research a topic from multiple angles and produce structured material you can explore further.

Pricing is usage-based credits tied to block usage and the underlying models used. Agents tend to use more credits than manual workflows because they're tuned to get you the best possible outcome, which means they pick the best blocks and do more work. Details here: https://www.getspine.ai/pricing. There's a free tier, and one honest caveat: we sized it to let you try a real task, but tasks vary in complexity.
If you run out before you've had a proper chance to explore, email us at founders@getspine.ai and we'll work with you.

We'd love your feedback on the experience: what worked, what didn't, and where it fell short. We're also curious how others here approach complex, multi-step AI work beyond coding. What tools are you using, and what breaks first? We'll be in the comments all day.
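The scheduling rule described in this post (subtasks without unmet dependencies run in parallel, downstream ones wait for upstream work) can be sketched with Python's standard `graphlib`. This is an illustration only, not Spine's orchestrator: the function name `plan_waves` and the subtask names are hypothetical, and real agents would run each wave concurrently rather than just listing it.

```python
from graphlib import TopologicalSorter


def plan_waves(dependencies: dict[str, set[str]]) -> list[list[str]]:
    """Group subtasks into waves; every task in a wave can run in parallel.

    `dependencies` maps each subtask to the set of subtasks it must wait for.
    A dependency cycle raises graphlib.CycleError.
    """
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only to make output stable
        waves.append(ready)
        sorter.done(*ready)  # marking a wave done unlocks its dependents
    return waves


# Hypothetical decomposition of the "SEO analysis plus deck" task from the post.
subtasks = {
    "crawl_site": set(),
    "seo_audit": {"crawl_site"},
    "competitor_scan": {"crawl_site"},
    "slide_deck": {"seo_audit", "competitor_scan"},
}
waves = plan_waves(subtasks)
```

Here the audit and the competitor scan land in the same wave because neither depends on the other, while the deck waits for both.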
TUI Studio – visual terminal UI design tool
Hacker News (score: 456)[Other] TUI Studio – visual terminal UI design tool
Show HN: fftool – A Terminal UI for FFmpeg – Shows Command Before It Runs
Hacker News (score: 13)[CLI Tool] Show HN: fftool – A Terminal UI for FFmpeg – Shows Command Before It Runs
Chrome extension adjusts video speed based on how fast the speaker is talking
Hacker News (score: 72)[Other] Chrome extension adjusts video speed based on how fast the speaker is talking
Show HN: Droeftoeter, a Terminal Coding Toy
Hacker News (score: 21)[Other] Show HN: Droeftoeter, a Terminal Coding Toy This is a small coding toy I made for fun. I think there are a few interesting ideas buried in it — curious what others think.
Hyperlinks in Terminal Emulators
Hacker News (score: 36)[Other] Hyperlinks in Terminal Emulators
Show HN: OpenClaw-class agents on ESP32 (and the IDE that makes it possible)
Hacker News (score: 11)[Other] Show HN: OpenClaw-class agents on ESP32 (and the IDE that makes it possible)
Show HN: Every Developer in the World, Ranked
Show HN (score: 9)[Other] Show HN: Every Developer in the World, Ranked We've indexed 5M+ GitHub users and built a ranking system that goes beyond follower counts. The idea started from frustration: GitHub is terrible for discovery. You can't answer "who are the best Python developers in Berlin?" or "who identified transformer-based models before they blew up?" without scraping everything yourself. So we did.

What we built:
- CodeRank score: a composite reputation signal across contributions, repository impact, and community influence
- Tastemaker score: did you star repos at 50 stars that now have 50,000? We track that.
- Comparison Builder: lets users build comparison graphics to compare devs, repos, orgs, etc.
- Sharable Profile Graphics: share your scores and flex on your coworkers or the community at large

Some things we found interesting: Most-followed ≠ most influential. The correlation between follower count and tastemaker score is surprisingly weak. There's a whole tier of developers who consistently find projects weeks and months before they trend, with almost no public following.

Location data on GitHub is a disaster. We spent an embarrassing amount of time on normalization and it's still not anywhere near perfect.

Try it: https://coderank.me/

If your profile doesn't have a score, signing in will trigger scoring for your account.

Curious what the HN crowd thinks about the ranking methodology; happy to get into the weeds on any of it.
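The post does not give the Tastemaker formula, so the following is purely a guess at how a score of that kind could work, offered to make the "starred at 50, now at 50,000" idea concrete. The function name, the input shape, and the choice of a base-10 logarithm of the growth ratio are all assumptions and should not be read as CodeRank's methodology.

```python
import math


def tastemaker_score(stars: list[tuple[int, int]]) -> float:
    """Sum of log10(stars_now / stars_when_starred) over a user's starred repos.

    Each pair is (stars_when_user_starred, stars_now). Repos that have not
    grown since the user starred them contribute nothing. Hypothetical formula.
    """
    return sum(
        math.log10(now / then)
        for then, now in stars
        if then > 0 and now > then
    )


early_bird = tastemaker_score([(50, 50_000)])      # starred long before it trended
late_comer = tastemaker_score([(40_000, 50_000)])  # starred after it was popular
```

A log ratio rewards being early on any repo regardless of its absolute size, which matches the observation in the post that the score correlates only weakly with follower count.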
Launch HN: IonRouter (YC W26) – High-throughput, low-cost inference
Hacker News (score: 17)[API/SDK] Launch HN: IonRouter (YC W26) – High-throughput, low-cost inference Hey HN, I’m Veer and my cofounder is Suryaa. We're building Cumulus Labs (YC W26), and we're releasing our latest product IonRouter (https://ionrouter.io/), an inference API for open-source and fine-tuned models. You swap in our base URL, keep your existing OpenAI client code, and get access to any model (open source or fine-tuned for you) running on our own inference engine.

The problem we kept running into: every inference provider is either fast-but-expensive (Together, Fireworks: you pay for always-on GPUs) or cheap-but-DIY (Modal, RunPod: you configure vLLM yourself and deal with slow cold starts). Neither felt right for teams that just want to ship.

Suryaa spent years building GPU orchestration infrastructure at TensorDock and production systems at Palantir. I led ML infrastructure and Linux kernel development for Space Force and NASA contracts where the stack had to actually work under pressure. When we started building AI products ourselves, we kept hitting the same wall: GPU infrastructure was either too expensive or too much work.

So we built IonAttention, a C++ inference runtime designed specifically around the GH200's memory architecture. Most inference stacks treat GH200 as a compatibility target (make sure vLLM runs, use CPU memory as overflow).
We took a different approach and built around what makes the hardware actually interesting: a 900 GB/s coherent CPU-GPU link, 452GB of LPDDR5X sitting right next to the accelerator, and 72 ARM cores you can actually use.

Three things came out of that which we think are novel: (1) using hardware cache coherence to make CUDA graphs behave as if they have dynamic parameters at zero per-step cost, something that only works on GH200-class hardware; (2) eager KV block writeback driven by immutability rather than memory pressure, which drops eviction stalls from 10ms+ to under 0.25ms; (3) phantom-tile attention scheduling at small batch sizes that cuts attention time by over 60% in the worst-affected regimes. We wrote up the details at cumulus.blog/ionattention.

On multimodal pipelines we get better throughput than big players (588 tok/s vs. Together AI's 298 on the same VLM workload). We're honest that p50 latency is currently worse (~1.46s vs. 0.74s); that's the tradeoff we're actively working on.

Pricing is per token, no idle costs: GPT-OSS-120B is $0.02 in / $0.095 out, Qwen3.5-122B is $0.20 in / $1.60 out. Full model list and pricing at https://ionrouter.io.

You can try the playground at https://ionrouter.io/playground right now, no signup required, or drop your API key in and swap the base URL: it's one line. We built this so teams can see the power of our engine and eventually come to us for their fine-tuned model needs using the same solution.

We're curious what you think, especially if you're running fine-tuned or custom models; that's the use case we've invested the most in. What's broken, and what would make this actually useful for you?
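To see what "per token, no idle costs" means for a single request, here is a small cost sketch using the two price pairs quoted in this post. One loud assumption: the post does not state the pricing unit, so the `per_tokens = 1_000_000` default (dollars per million tokens, a common convention) is a guess, and the function and dictionary names are hypothetical.

```python
def request_cost(tokens_in: int, tokens_out: int,
                 price_in: float, price_out: float,
                 per_tokens: int = 1_000_000) -> float:
    """Dollar cost of one request under per-token pricing with no idle charge.

    `per_tokens` is the number of tokens each quoted price covers; the
    one-million default is an assumption, not something the post states.
    """
    return (tokens_in * price_in + tokens_out * price_out) / per_tokens


# (input price, output price) pairs as quoted in the post.
PRICES = {
    "GPT-OSS-120B": (0.02, 0.095),
    "Qwen3.5-122B": (0.20, 1.60),
}

# A hypothetical request: 2,000 prompt tokens and 500 completion tokens.
cost = request_cost(2_000, 500, *PRICES["Qwen3.5-122B"])
```

Because there is no always-on GPU fee in this model, total spend is just this per-request figure summed over traffic, which is the contrast the post draws with always-on providers.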
Show HN: An application stack Claude coded directly in LLVM IR
Show HN (score: 8)[Other] Show HN: An application stack Claude coded directly in LLVM IR This repo is the result of a debate about what kind of programming language might be appropriate if humans are no longer the primary authors. Initially the thought was "LLMs can just generate binaries directly" (this was before a more famous person had the same idea). But that on reflection seems like a bad approach because languages exist to capture program semantics that are elided by translation to machine code. The next step was to wonder if an existing "machine readable" program representation can be the target for LLM code generation. It turns out yes. This project is the result of asking Claude to create an application stack entirely coded in LLVM's intermediate representation language.