Show HN: HoundDog.ai – Ultra-Fast Code Scanner for Data Privacy
Show HN (score: 13)
Description
I'm one of the creators of HoundDog.ai (https://github.com/hounddogai/hounddog). We currently handle privacy scanning for Replit's 45M+ creators.
We built HoundDog because privacy compliance is usually a choice between manual spreadsheets and reactive runtime scanning. Runtime tools are useful for monitoring, but they only catch leaks after the code is live and the data has already moved. They can also miss code paths that aren't actively triggered in production.
HoundDog traces sensitive data in code during development and helps catch risky flows (e.g., PII leaking into logs or unapproved third-party SDKs) before the code is shipped.
The core scanner is a standalone Rust binary. It doesn't use LLMs, so it runs locally and is deterministic, cheap, and fast. It can scan 1M+ lines of code in seconds on a standard laptop, and supports 80+ sensitive data types (PII, PHI, CHD) and hundreds of data sinks (logs, SDKs, APIs, ORMs, etc.) out of the box.
We use AI internally to expand and scale our rules, identifying new data sources and sinks, but the execution is pure static analysis.
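To make the idea of catching a risky flow with static analysis concrete, here is a toy sketch in Python. It is not HoundDog's implementation (the real scanner is a Rust binary with its own rules); the names `SENSITIVE_NAMES`, `LOG_METHODS`, and `find_sensitive_log_calls` are hypothetical, and the rule is deliberately naive: flag any logging-style call whose arguments mention a sensitive-looking identifier.

```python
import ast

# Hypothetical rule set for illustration only; not HoundDog's actual rules.
SENSITIVE_NAMES = {"email", "ssn", "password", "phone"}
LOG_METHODS = {"debug", "info", "warning", "error", "critical"}


def find_sensitive_log_calls(source):
    """Return sorted (line, identifier) pairs where a sensitive-looking
    name appears in the arguments of a logging-style call or print()."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        # Treat `<anything>.info(...)` etc. and bare `print(...)` as sinks.
        is_sink = (
            isinstance(func, ast.Attribute) and func.attr in LOG_METHODS
        ) or (isinstance(func, ast.Name) and func.id == "print")
        if not is_sink:
            continue
        for arg in list(node.args) + [kw.value for kw in node.keywords]:
            # Walk each argument so names inside f-strings are seen too.
            for sub in ast.walk(arg):
                name = None
                if isinstance(sub, ast.Name):
                    name = sub.id
                elif isinstance(sub, ast.Attribute):
                    name = sub.attr
                if name is not None and name.lower() in SENSITIVE_NAMES:
                    findings.append((node.lineno, name))
    return sorted(findings)


SAMPLE = '''
import logging
logger = logging.getLogger(__name__)

def signup(user):
    logger.info("new user %s", user.email)
    logger.info("signup complete")
    print(f"debug: {user.ssn}")
'''

for line, name in find_sensitive_log_calls(SAMPLE):
    print(f"line {line}: '{name}' reaches a log sink")
```

Because the check only reads the syntax tree, it runs without executing the code and gives the same result every time, which is the property the post attributes to static analysis. A real scanner would also need to follow data through assignments and function calls, which this sketch does not attempt.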
The scanner is free to use (no signups) so please try it out and send us feedback. I'll be around to answer any questions!
More from Show
Show HN: Commit-based code review instead of PR-based
Hi HN,

I'm experimenting with commit-based code review as an alternative to PR-based review.

Instead of analyzing large PR diffs, this reviews each commit incrementally, while context is still fresh. It's fully configurable and intentionally low-noise, high signal - focused on catching issues that tend to slip through and compound over time.

The goal isn't to replace CI or PR review, but to move some feedback earlier:

- risky changes hidden in small diffs
- architectural or consistency drift
- performance or security footguns

Happy to answer questions
Show HN: Xkcd #2347 lived in my head, so I built the dependency tower for real
I finally got tired of XKCD #2347 living rent-free in my head, so I built Stacktower: a tool that takes any real package's dependency graph and turns it into an actual tower of bricks. Along the way I had to wrestle some surprisingly spicy problems.

Full blog post here: https://stacktower.io

The result is half visualization tool, half love letter to the chaos of modern dependency trees. Open-source, works with PyPI, Cargo, npm, and more.

Code: https://github.com/matzehuels/stacktower
Show HN: Flowctl – Open-source self-service workflow automation platform
Flowctl is a self-service platform that gives users secure access to complex workflows, all in a single binary. These workflows could be anything: granting SSH access to an instance, provisioning infra, or custom business process automation. The executor paradigm in flowctl makes it domain-agnostic.

This initial release includes:

- SSO with OIDC and RBAC
- Execution on remote nodes via SSH (fully agentless)
- Approvals
- Cron-based scheduling
- Flow editor UI
- Encrypted credentials and secrets store
- Docker and Script executors
- Namespaces

I built this because I needed a simple tool to manage my homelab while traveling, something that acts as a UI for scripts. At work, I was also looking for tools to turn repetitive ops/infra tasks into self-service offerings. I tried tools like Backstage and Rundeck, but they were either too complex, or the OSS versions lacked important features.

Flowctl can simply be described as a pipeline (like CI/CD systems) that people can trigger on-demand with custom inputs.

Would love to hear how you might use something like this!

Demo - https://demo.flowctl.net

Homepage - https://flowctl.net

GitHub - https://github.com/cvhariharan/flowctl
Show HN: Pyversity – Fast Result Diversification for Retrieval and RAG
Hey HN! I've recently open-sourced Pyversity, a lightweight library for diversifying retrieval results. Most retrieval systems optimize only for relevance, which can lead to top-k results that look almost identical. Pyversity efficiently re-ranks results to balance relevance and diversity, surfacing items that remain relevant but are less redundant. This helps with improving retrieval, recommendation, and RAG pipelines without adding latency or complexity.

Main features:

- Unified API: one function (diversify) supporting several well-known strategies: MMR, MSD, DPP, and COVER (with more to come)
- Lightweight: the only dependency is NumPy, keeping the package small and easy to install
- Fast: efficient implementations for all supported strategies; diversify results in milliseconds

Re-ranking with cross-encoders is very popular right now, but also very expensive. From my experience, you can usually improve retrieval results with simpler and faster methods, such as the ones implemented in this package. This helps retrieval, recommendation, and RAG systems present richer, more informative results by ensuring each new item adds new information.

Code and docs: github.com/pringled/pyversity

Let me know if you have any feedback, or suggestions for other diversification strategies to support!
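For readers unfamiliar with MMR (maximal marginal relevance), one of the strategies named above, here is a minimal pure-Python sketch of the greedy algorithm. It is not Pyversity's code or API; the function names `cosine` and `mmr`, the `lam` trade-off parameter, and the sample data are assumptions made for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mmr(embeddings, relevance, k, lam=0.5):
    """Greedy MMR: repeatedly pick the candidate that maximizes
    lam * relevance - (1 - lam) * (max similarity to items already picked).
    Returns the indices of the selected items in selection order."""
    selected = []
    candidates = list(range(len(embeddings)))
    while candidates and len(selected) < k:

        def score(i):
            # Redundancy is 0 for the first pick, when nothing is selected yet.
            redundancy = max(
                (cosine(embeddings[i], embeddings[j]) for j in selected),
                default=0.0,
            )
            return lam * relevance[i] - (1 - lam) * redundancy

        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


# Items 0 and 1 are near-duplicates; item 2 is different but less relevant.
embeddings = [[1.0, 0.0], [0.99, 0.1], [0.0, 1.0]]
relevance = [0.9, 0.85, 0.5]

print(mmr(embeddings, relevance, k=2, lam=0.5))  # [0, 2]
print(mmr(embeddings, relevance, k=2, lam=1.0))  # [0, 1]
```

With `lam=1.0` the ranking is by relevance alone and the two near-duplicates come out on top; lowering `lam` penalizes similarity to what is already selected, so the distinct item replaces the duplicate.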