Show HN: C discrete event SIM w stackful coroutines runs 45x faster than SimPy

Hacker News (score: 25)
Found: February 03, 2026
ID: 3219

Description

Hi all,

I have built Cimba, a multithreaded discrete event simulation library in C.

Cimba uses POSIX pthread multithreading for parallel execution of multiple simulation trials, while coroutines provide concurrency inside each simulated trial universe. The simulated processes are based on asymmetric stackful coroutines with the context switching hand-coded in assembly.
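The trial-level parallelism described above can be sketched with plain pthreads. This is a minimal illustration, not Cimba's API: `trial_t`, `run_trial`, `run_all_trials`, and `NUM_TRIALS` are hypothetical names, and the trial body is a placeholder that only averages random draws. Each thread owns its trial record and its own random number state, so no locking is needed.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

#define NUM_TRIALS 8  /* hypothetical upper bound for this sketch */

/* Hypothetical per-trial record; not a Cimba type. */
typedef struct {
    uint64_t seed;    /* per-trial RNG seed, so trials are independent */
    uint64_t events;  /* number of placeholder events to process */
    double   result;  /* written only by the owning thread */
} trial_t;

/* splitmix64: a small, well-known 64-bit generator. */
static uint64_t splitmix64(uint64_t *state)
{
    uint64_t z = (*state += UINT64_C(0x9E3779B97F4A7C15));
    z = (z ^ (z >> 30)) * UINT64_C(0xBF58476D1CE4E5B9);
    z = (z ^ (z >> 27)) * UINT64_C(0x94D049BB133111EB);
    return z ^ (z >> 31);
}

/* Placeholder for one simulated trial: averages uniform draws in [0, 1). */
void *run_trial(void *arg)
{
    trial_t *t = arg;
    uint64_t state = t->seed;
    double sum = 0.0;
    for (uint64_t i = 0; i < t->events; i++)
        sum += (double)(splitmix64(&state) >> 11) * (1.0 / 9007199254740992.0);
    t->result = t->events ? sum / (double)t->events : 0.0;
    return NULL;
}

/* Launch one thread per trial, then wait for all of them.
   Returns 0 on success, -1 if a thread could not be created. */
int run_all_trials(trial_t *trials, int n)
{
    pthread_t threads[NUM_TRIALS];
    int started = 0, rc = 0;
    assert(n >= 0 && n <= NUM_TRIALS);
    for (; started < n; started++) {
        if (pthread_create(&threads[started], NULL, run_trial,
                           &trials[started]) != 0) {
            rc = -1;
            break;
        }
    }
    for (int i = 0; i < started; i++)
        pthread_join(threads[i], NULL);
    return rc;
}
```

Calling `run_all_trials(trials, n)` on an array of seeded `trial_t` records fills in each `result`. A real runner would more likely keep one worker thread per core and hand out trials from a queue; one thread per trial keeps the sketch short.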

The stackful coroutines make it natural to express agentic behavior by conceptually placing oneself "inside" that process and describing what it does. A process can run in an infinite loop or just act as a one-shot customer passing through the system, yielding and resuming execution from any level of its call stack, acting both as an active agent and a passive object as needed. This is inspired by my own experience programming in Simula67, many moons ago, where I found the coroutines more important than the deservedly famous object-orientation.
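To make "yielding from any level of its call stack" concrete, here is a minimal sketch of an asymmetric stackful coroutine. It uses the POSIX `ucontext` functions (`getcontext`, `makecontext`, `swapcontext`) as a stand-in for Cimba's hand-coded assembly switch, and every name in it (`coro_t`, `coro_start`, `coro_resume`, `coro_yield`, `customer_process`) is hypothetical rather than taken from the library.

```c
#include <assert.h>
#include <stdlib.h>
#include <ucontext.h>

#define CORO_STACK_SIZE (64 * 1024)

/* Hypothetical coroutine record; not a Cimba type. */
typedef struct {
    ucontext_t caller;        /* context of whoever resumed the coroutine */
    ucontext_t self;          /* the coroutine's own context */
    void      *stack;         /* separately allocated stack */
    int        finished;      /* set when the body is about to return */
    int        yielded_value; /* value passed to the last coro_yield */
} coro_t;

static coro_t *current;       /* running coroutine (single-threaded sketch) */

/* Suspend the running coroutine and hand a value back to its resumer. */
void coro_yield(int value)
{
    coro_t *co = current;
    co->yielded_value = value;
    swapcontext(&co->self, &co->caller);
}

/* Resume a coroutine; returns its yielded value, or -1 once it has finished. */
int coro_resume(coro_t *co)
{
    current = co;
    swapcontext(&co->caller, &co->self);
    return co->finished ? -1 : co->yielded_value;
}

/* A helper one call level down: the yield happens inside it. */
void serve_one(int customer)
{
    coro_yield(customer);
}

/* Example process body: passes three customers through, then ends. */
void customer_process(void)
{
    for (int i = 1; i <= 3; i++)
        serve_one(i);
    current->finished = 1;
    /* Returning here switches to uc_link, i.e. back to the resumer. */
}

/* Give the coroutine its own stack and entry point. Returns 0 on success. */
int coro_start(coro_t *co, void (*body)(void))
{
    co->finished = 0;
    co->yielded_value = 0;
    co->stack = malloc(CORO_STACK_SIZE);
    if (co->stack == NULL)
        return -1;
    if (getcontext(&co->self) != 0)
        return -1;
    co->self.uc_stack.ss_sp = co->stack;
    co->self.uc_stack.ss_size = CORO_STACK_SIZE;
    co->self.uc_link = &co->caller;
    makecontext(&co->self, body, 0);
    return 0;
}
```

After `coro_start(&co, customer_process)`, successive calls to `coro_resume(&co)` return 1, 2, 3 and then -1. The yield sits inside `serve_one`, one frame below the process body, which is what a stackless generator cannot do. `swapcontext` also saves and restores the signal mask, which is one reason a hand-written switch can be considerably faster.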

Cimba turned out to run really fast. In a simple benchmark, 100 trials of an M/M/1 queue run for one million time units each, it ran 45 times faster than an equivalent model built in SimPy + Python multiprocessing. The running time was reduced by 97.8 % vs the SimPy model. Cimba even processed more simulated events per second on a single CPU core than SimPy could do on all 64 cores.
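The two figures quoted above are the same measurement stated two ways. As a quick check, assuming the 45x ratio is a ratio of wall-clock running times:

```latex
\[
\text{reduction}
  = 1 - \frac{T_{\text{Cimba}}}{T_{\text{SimPy}}}
  = 1 - \frac{1}{45}
  \approx 0.978
  = 97.8\,\%
\]
```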

The speed is not only due to the efficient coroutines. Other parts are also designed for speed, such as a hash-heap event queue (binary heap plus Fibonacci hash map), fast random number generators and distributions, memory pools for frequently used object types, and so on.
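The two ingredients named for the event queue can be sketched separately. The block below is an illustration under assumptions, not Cimba's implementation: `event_t`, `event_heap_t`, `heap_push`, `heap_pop`, and `fib_hash` are hypothetical names, the heap has a fixed capacity, and the sketch does not wire the hash map to the heap. In a full hash-heap, the map would take an event handle to its current heap position so that a pending event can be found, cancelled, or rescheduled without scanning the heap.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HEAP_CAP 1024  /* fixed capacity, for the sketch only */

/* Hypothetical event record; not a Cimba type. */
typedef struct {
    double   time;    /* simulated activation time */
    uint64_t handle;  /* unique id, usable as a hash-map key */
} event_t;

typedef struct {
    event_t items[HEAP_CAP];
    size_t  count;
} event_heap_t;

static void swap_events(event_t *a, event_t *b)
{
    event_t tmp = *a;
    *a = *b;
    *b = tmp;
}

/* Insert an event, sifting it up until its parent is not later. */
void heap_push(event_heap_t *h, event_t ev)
{
    assert(h->count < HEAP_CAP);
    size_t i = h->count++;
    h->items[i] = ev;
    while (i > 0) {
        size_t parent = (i - 1) / 2;
        if (h->items[parent].time <= h->items[i].time)
            break;
        swap_events(&h->items[parent], &h->items[i]);
        i = parent;
    }
}

/* Remove and return the earliest event, sifting the new root down. */
event_t heap_pop(event_heap_t *h)
{
    assert(h->count > 0);
    event_t top = h->items[0];
    h->items[0] = h->items[--h->count];
    size_t i = 0;
    for (;;) {
        size_t left = 2 * i + 1, right = left + 1, smallest = i;
        if (left < h->count && h->items[left].time < h->items[smallest].time)
            smallest = left;
        if (right < h->count && h->items[right].time < h->items[smallest].time)
            smallest = right;
        if (smallest == i)
            break;
        swap_events(&h->items[i], &h->items[smallest]);
        i = smallest;
    }
    return top;
}

/* Fibonacci hashing: multiply by 2^64 / golden ratio (rounded to an odd
   integer) and keep the top `bits` bits as the slot in a 2^bits table. */
uint64_t fib_hash(uint64_t key, unsigned bits)
{
    assert(bits >= 1 && bits <= 64);
    return (key * UINT64_C(11400714819323198485)) >> (64 - bits);
}
```

Pushing events with times 3.0, 1.0, and 2.0 and then popping three times returns them in time order. Fibonacci hashing needs only one multiply and one shift, and it spreads sequential keys such as incrementing event handles across the table, which is presumably why it suits this use.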

The initial implementation supports the AMD64/x86-64 architecture for Linux and Windows. I plan to target Apple Silicon next, then probably ARM.

I believe this may interest the HN community. I would appreciate your views on both the API and the code. Any thoughts on future target architectures to consider?

Docs: https://cimba.readthedocs.io/en/latest/

Repo: https://github.com/ambonvik/cimba

More from Hacker

Show HN: GibRAM an in-memory ephemeral GraphRAG runtime for retrieval

Hi HN,

I have been working with regulation-heavy documents lately, and one thing kept bothering me. Flat RAG pipelines often fail to retrieve related articles together, even when they are clearly connected through references, definitions, or clauses.

After trying several RAG setups, I subjectively felt that GraphRAG was a better mental model for this kind of data. The Microsoft GraphRAG paper and reference implementation were helpful starting points. However, in practice, I found one recurring friction point: graph storage and vector indexing are usually handled by separate systems, which felt unnecessarily heavy for short-lived analysis tasks.

To explore this tradeoff, I built GibRAM (Graph in-buffer Retrieval and Associative Memory). It is an experimental, in-memory GraphRAG runtime where entities, relationships, text units, and embeddings live side by side in a single process.

GibRAM is intentionally ephemeral. It is designed for exploratory tasks like summarization or conversational querying over a bounded document set. Data lives in memory, scoped by session, and is automatically cleaned up via TTL. There are no durability guarantees, and recomputation is considered cheaper than persistence for the intended use cases.

This is not a database and not a production-ready system. It is a casual project, largely vibe-coded, meant to explore what GraphRAG looks like when memory is the primary constraint instead of storage. Technical debt exists, and many tradeoffs are explicit.

The project is open source, and I would really appreciate feedback, especially from people working on RAG, search infrastructure, or graph-based retrieval.

GitHub: https://github.com/gibram-io/gibram

Happy to answer questions or hear why this approach might be flawed.

Upgrading DrizzleORM logging with AsyncLocalStorage


Show HN: Porting xv6 to HiFive Unmatched board

Hi HN,

I ported the teaching OS xv6-riscv to HiFive Unmatched and got it running on real hardware, including passing usertests.

I've been self-studying OS internals using the MIT 6.1810 materials. After finishing most of the labs, I was eager to see what it's like to run the OS on bare metal, rather than QEMU.

The Unmatched may not have the latest RISC-V features, but it's well-documented, and the Rev B release has made it more affordable, which makes it a good learning platform.

The porting process involved several interesting challenges:

- Hardware Quirks: Handling things like enabling A/D bits in PTEs (the hardware doesn't set them automatically, causing page faults), proper handling of interrupts, and instruction cache synchronization.

- Boot Flow: xv6 expects M-mode on startup, but standard RISC-V boot flows (typically via OpenSBI) jump to S-mode. To bridge this gap, I created a minimal U-Boot FIT image that contains only the xv6 kernel. This way, U-Boot SPL handles the complex CPU/DDR initialization, then hands control to xv6 in M-mode (skipping OpenSBI).

- Drivers: Ported an SPI SD card driver, replacing the virtio disk driver.

I wrote up implementation notes here: https://github.com/eyengin/xv6-riscv-unmatched/blob/unmatched/doc/NOTES.md

Hopefully, this is useful for others who are learning OS internals and want to try running their code on real RISC-V hardware.

Show HN: DeepDream for Video with Temporal Consistency

I forked a PyTorch DeepDream implementation and added video support with temporal consistency. It produces smooth DeepDream videos with minimal flickering. It is highly configurable, with many parameters, and supports multiple pretrained image classifiers, including GoogLeNet. Check out the repo for sample videos!

Features:

- Optical flow warps previous hallucinations into the current frame

- Occlusion masking prevents ghosting and hallucination transfer when objects move

- Advanced parameters (layers, octaves, iterations) still work

- Works on GPU, CPU, and Apple Silicon
