Show HN: Arc – high-throughput time-series warehouse with DuckDB analytics

Hacker News (score: 10)
Found: October 07, 2025
ID: 1757

Description

Database
Hi HN, I’m Ignacio, founder at Basekick Labs.

Over the past few months I’ve been building Arc, a time-series data platform designed to combine very fast ingestion with strong analytical queries.

What Arc does:

- Ingests via a binary MessagePack API (the fast path)
- Accepts Line Protocol, so existing InfluxDB-style tools keep working (I'm an ex-Influxer)
- Stores data as Parquet with hourly partitions
- Queries with SQL through the DuckDB engine
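As a rough illustration of the ingest-to-storage path listed above, here is a minimal Python sketch that encodes one point in InfluxDB Line Protocol and derives an hourly partition key from its timestamp. The function names (`to_line_protocol`, `hourly_partition`) and the `measurement/YYYY/MM/DD/HH` layout are assumptions made for illustration only; Arc's actual API and on-disk layout may differ, so check the repo.

```python
from datetime import datetime, timezone


def _escape(value: str, chars: str) -> str:
    """Backslash-escape each character listed in `chars`."""
    for ch in chars:
        value = value.replace(ch, "\\" + ch)
    return value


def to_line_protocol(measurement: str, tags: dict, fields: dict, ts_ns: int) -> str:
    """Encode one point as: measurement,tag=value field=value timestamp_ns"""
    head = _escape(measurement, ", ")
    for key in sorted(tags):
        head += "," + _escape(key, ",= ") + "=" + _escape(tags[key], ",= ")

    parts = []
    for key, val in fields.items():
        if isinstance(val, bool):  # check bool before int: bool is an int subclass
            text = "true" if val else "false"
        elif isinstance(val, int):
            text = f"{val}i"  # integers carry an 'i' suffix
        elif isinstance(val, float):
            text = repr(val)
        else:  # strings are double-quoted
            text = '"' + str(val).replace("\\", "\\\\").replace('"', '\\"') + '"'
        parts.append(_escape(key, ",= ") + "=" + text)

    return f"{head} {','.join(parts)} {ts_ns}"


def hourly_partition(measurement: str, ts_ns: int) -> str:
    """Hypothetical partition key: one directory per measurement per UTC hour."""
    dt = datetime.fromtimestamp(ts_ns // 1_000_000_000, tz=timezone.utc)
    return f"{measurement}/{dt.strftime('%Y/%m/%d/%H')}"


if __name__ == "__main__":
    ts = int(datetime(2025, 10, 7, 14, 30, tzinfo=timezone.utc).timestamp()) * 10**9
    print(to_line_protocol("cpu", {"host": "web 1"}, {"usage": 0.5, "cores": 14}, ts))
    print(hourly_partition("cpu", ts))
```

Bucketing by UTC hour means a query with a time filter only has to open the Parquet files for the hours it touches, which is the usual motivation for time-based partitions.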

Why I built it:

Many systems force you to trade retention, throughput, or complexity. I wanted something where ingestion performance doesn’t kill your analytics.

Performance and benchmarks so far:

- Write throughput: ~1.88M records/sec (MessagePack, untuned) on my M3 Pro Max (14 cores, 36 GB RAM)
- ClickBench on AWS c6a.4xlarge: 35.18 s cold, ~0.81 s hot (43/43 queries succeeded)
- Caching was disabled in those runs to match the benchmark rules; enabling the cache in production makes repeated queries ~20% faster
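To put the figures above in perspective, here is a small back-of-the-envelope calculation in Python. It only rearranges the numbers quoted in the post; the per-core figure assumes all 14 cores shared the ingest load evenly, and the daily figure assumes the rate could be sustained for 24 hours, neither of which the post states.

```python
RECORDS_PER_SEC = 1_880_000   # reported write throughput (MessagePack, untuned)
CORES = 14                    # cores on the test machine
COLD_S = 35.18                # reported ClickBench cold time, seconds
HOT_S = 0.81                  # reported ClickBench hot time, seconds

# Average share per core, assuming an even spread across all cores.
per_core = RECORDS_PER_SEC / CORES

# Records per day if the reported rate were held constant.
per_day = RECORDS_PER_SEC * 86_400

# How much faster the hot run is than the cold run.
cold_hot_ratio = COLD_S / HOT_S

print(f"{per_core:,.0f} records/sec per core")
print(f"{per_day:,} records/day")
print(f"cold/hot ratio: {cold_hot_ratio:.1f}x")
```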

I’ve open-sourced the Arc repo so you can dive into implementation, benchmarks, and code. Would love your thoughts, critiques, and use-case ideas.

Thanks!

More from Hacker

Show HN: Minikv – Distributed key-value and object store in Rust (Raft, S3 API)

Hi HN,

I’m releasing minikv, a distributed key-value and object store in Rust.

What is minikv? minikv is an open-source, distributed storage engine built for learning, experimentation, and self-hosted setups. It combines a strongly-consistent key-value database (Raft), S3-compatible object storage, and basic multi-tenancy. I started minikv as a learning project about distributed systems, and it grew into something production-ready and fun to extend.

Features/highlights:

- Raft consensus with automatic failover and sharding
- S3-compatible HTTP API (plus REST/gRPC APIs)
- Pluggable storage backends: in-memory, RocksDB, Sled
- Multi-tenant: per-tenant namespaces, role-based access, quotas, and audit
- Metrics (Prometheus), TLS, JWT-based API keys
- Easy to deploy (single binary, works with Docker/Kubernetes)

Quick demo (single node):

    git clone https://github.com/whispem/minikv.git
    cd minikv
    cargo run --release -- --config config.example.toml
    curl localhost:8080/health/ready
    # S3 upload + read
    curl -X PUT localhost:8080/s3/mybucket/hello -d "hi HN"
    curl localhost:8080/s3/mybucket/hello

Docs, cluster setup, and architecture details are in the repo. I’d love to hear feedback, questions, ideas, or your stories running distributed infra in Rust!

Repo: https://github.com/whispem/minikv
Crate: https://crates.io/crates/minikv

CLI's completion should know what options you've typed


Show HN: Python SDK – forecasting with foundation time-series and tabular models

We’ve built a Python SDK for running inference on foundation models designed for time-series and tabular data. They are new SOTA models for time-series and tabular tasks and work out of the box. They do not require model training or feature engineering. The GitHub repository is: https://github.com/S-FM/faim-python-client

Show HN: Deterministic PCIe Diagnostics for GPUs on Linux

I built a small Linux tool to deterministically verify GPU PCIe link health and bandwidth.

It reports:

- Negotiated PCIe generation and width
- Peak Host→Device and Device→Host memcpy bandwidth
- Sustained PCIe TX/RX utilization via NVML
- A rule-based verdict derived from observable hardware data only

This exists because PCIe issues (Gen downgrades, reduced lane width, risers, bifurcation) are often invisible at the application layer and can’t be fixed by kernel tuning or async overlap.

Linux-only: it relies on sysfs and PCIe AER exposure that Windows does not provide.
