Show HN: Keep large tool output out of LLM context: 3x accuracy 95% fewer tokens

Show HN (score: 6)
Found: March 05, 2026
ID: 3624

Description

LLM agents often place raw JSON tool outputs directly in the prompt. After a few tool calls, earlier results get compacted or truncated, and answers become incorrect or inconsistent.

I built Sift, a drop-in MCP gateway that stores tool outputs as local artifacts (filesystem blobs indexed in SQLite) and returns an `artifact_id` plus compact schema hints when responses are large or paginated.
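As a rough illustration of the storage idea described above (not Sift's actual code), a gateway could write each large payload to a file, index it in SQLite, and hand back only an id plus a shape summary. Everything named here is an assumption made for the sketch: the `MAX_INLINE_BYTES` threshold, the table layout, the hash-based id, and the `open_index`, `schema_hints`, and `handle_tool_output` helpers.

```python
import hashlib
import json
import sqlite3
import tempfile
from pathlib import Path

# Hypothetical cutoff; the post does not say when a response counts as "large".
MAX_INLINE_BYTES = 2_000


def open_index(root: Path) -> sqlite3.Connection:
    """Create (or open) a SQLite index mapping artifact ids to blob paths."""
    root.mkdir(parents=True, exist_ok=True)
    conn = sqlite3.connect(str(root / "index.db"))
    conn.execute(
        "CREATE TABLE IF NOT EXISTS artifacts ("
        "artifact_id TEXT PRIMARY KEY, path TEXT NOT NULL, size INTEGER NOT NULL)"
    )
    return conn


def schema_hints(payload):
    """Summarize the payload's shape: field names and value types, no values."""
    if isinstance(payload, list) and payload and isinstance(payload[0], dict):
        fields = {k: type(v).__name__ for k, v in payload[0].items()}
        return {"type": "list", "length": len(payload), "fields": fields}
    if isinstance(payload, dict):
        return {"type": "dict", "fields": {k: type(v).__name__ for k, v in payload.items()}}
    return {"type": type(payload).__name__}


def handle_tool_output(conn, root, payload):
    """Return small payloads inline; store large ones and return a reference."""
    raw = json.dumps(payload, sort_keys=True).encode("utf-8")
    if len(raw) <= MAX_INLINE_BYTES:
        return {"inline": payload}
    artifact_id = hashlib.sha256(raw).hexdigest()[:16]  # assumed id scheme
    blob_path = root / f"{artifact_id}.json"
    blob_path.write_bytes(raw)
    with conn:  # commits the insert on success
        conn.execute(
            "INSERT OR REPLACE INTO artifacts VALUES (?, ?, ?)",
            (artifact_id, str(blob_path), len(raw)),
        )
    return {"artifact_id": artifact_id, "schema": schema_hints(payload)}
```

With this shape, the model sees only field names and types (for example `place: str`, `magnitude: float`) and never the rows themselves.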

Instead of reasoning over full JSON in the prompt, the model runs a small Python query:

    def run(data, schema, params):
        return max(data, key=lambda x: x["magnitude"])["place"]

Query code runs in a constrained subprocess (AST/import guards + timeout/memory caps). Only the computed result is returned to the model.
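To make the "AST/import guards + timeout" idea concrete, here is a minimal sketch under my own assumptions, not Sift's implementation. The deny-list, the `check_query` and `run_query` helpers, and the sample data are all hypothetical, and memory caps are left out for brevity. A guard this simple is not a real security boundary; it only shows the general shape.

```python
import ast
import json
import subprocess
import sys

# Hypothetical deny-list; a production guard would need to be far stricter.
FORBIDDEN_NAMES = {"open", "exec", "eval", "compile", "__import__"}


def check_query(source: str) -> None:
    """Raise ValueError if the query uses imports or obviously unsafe names."""
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            raise ValueError("imports are not allowed in query code")
        if isinstance(node, ast.Name) and node.id in FORBIDDEN_NAMES:
            raise ValueError(f"use of {node.id!r} is not allowed")
        if isinstance(node, ast.Attribute) and node.attr.startswith("__"):
            raise ValueError("dunder attribute access is not allowed")


# Script for the child process: read the job from stdin, call run(), print JSON.
RUNNER = """
import json, sys
job = json.load(sys.stdin)
scope = {}
exec(job["source"], scope)
print(json.dumps(scope["run"](job["data"], job["schema"], job["params"])))
"""


def run_query(source, data, schema=None, params=None, timeout_s=5.0):
    """Validate the query, run it in a separate interpreter, return its result."""
    check_query(source)
    job = json.dumps({"source": source, "data": data, "schema": schema, "params": params})
    proc = subprocess.run(
        [sys.executable, "-I", "-c", RUNNER],  # -I: isolated mode
        input=job, capture_output=True, text=True,
        timeout=timeout_s,  # raises subprocess.TimeoutExpired if exceeded
    )
    if proc.returncode != 0:
        raise RuntimeError(proc.stderr.strip())
    return json.loads(proc.stdout)


QUERY = '''
def run(data, schema, params):
    return max(data, key=lambda x: x["magnitude"])["place"]
'''

# Made-up sample rows using the field names from the query above.
quakes = [{"place": "A", "magnitude": 4.1}, {"place": "B", "magnitude": 6.3}]
print(run_query(QUERY, quakes))  # prints: B
```

Only the single string `"B"` crosses back to the caller; the full `quakes` list never has to enter the prompt.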

Benchmark (Claude Sonnet 4.6, 103 questions across 12 datasets):

- Baseline (raw JSON in prompt): 34/103 (33%), 10.7M input tokens

- Sift (artifact + code query): 102/103 (99%), 489K input tokens
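For readers who want to check the headline "3x accuracy, 95% fewer tokens" against these two rows, the arithmetic is short. The token counts below are the rounded figures quoted above, so the reduction is approximate.

```python
baseline_correct, sift_correct, total = 34, 102, 103
baseline_tokens, sift_tokens = 10_700_000, 489_000  # rounded, as reported

baseline_acc = baseline_correct / total              # about 0.330
sift_acc = sift_correct / total                      # about 0.990
accuracy_ratio = sift_acc / baseline_acc             # 102 / 34 = 3
token_reduction = 1 - sift_tokens / baseline_tokens  # about 0.954

print(f"accuracy: {baseline_acc:.1%} -> {sift_acc:.1%} ({accuracy_ratio:.1f}x)")
print(f"input tokens saved: {token_reduction:.1%}")
```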

Open benchmark + MIT code: https://github.com/lourencomaciel/sift-gateway

Install:

    pipx install sift-gateway
    sift-gateway init --from claude

Works with Claude Code, Cursor, Windsurf, Zed, and VS Code. Existing MCP servers and tools require no changes.

More from Show

Show HN: Tracemap – run and visualize traceroutes from probes around the world

Hi HN,

I thought it would be fun to plot a traceroute on a map to visually see the path packets take. I know this idea has been done before, but I still wanted to scratch that itch.

The first version just let you paste in a traceroute and it would plot the hops on a map. Later I discovered Globalping (https://globalping.io), which allows you to run traceroutes and MTRs from probes around the world, so I integrated that into the tool.

From playing around with it, I noticed a few interesting things:

- It's very easy to spot incorrect IP geolocation. If a hop shows 1–2 ms latency but appears to jump across continents, the geolocation is probably wrong.

- Suboptimal routing is sometimes much easier to notice visually than by just looking at latency numbers.

- Even with really good databases like IPinfo, IP geolocation is still not perfect, so parts of the path may occasionally be misleading.

Huge credit to the teams behind Globalping and IPinfo: Globalping for the measurement infrastructure and IPinfo for the geolocation data.

Feedback welcome.
