Show HN: Improving RAG with chess Elo scores
Hacker News (score: 31)
I'm Ghita, co-founder of ZeroEntropy (YC W25). We build high-accuracy search infrastructure for RAG and AI agents.
We just released two new state-of-the-art rerankers: zerank-1 and zerank-1-small. One of them is fully open-source under Apache 2.0.
We trained these models using a novel Elo-inspired pipeline, which we describe in detail in the attached blog post. In a nutshell, the training steps are:
* Collect soft preferences between pairs of documents using an ensemble of LLMs.
* Fit an Elo-style rating system (Bradley-Terry) to turn pairwise comparisons into absolute per-document scores.
* Normalize relevance scores across queries using a bias correction step, modeled using cross-query comparisons and solved with maximum likelihood estimation (MLE).
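To make the second step concrete, here is a minimal sketch of fitting Bradley-Terry scores from soft pairwise preferences for a single query. It is an illustration under stated assumptions, not the actual zerank training code: the function names, the `(i, j, p)` preference format, and the use of plain gradient ascent are all hypothetical choices, and the cross-query bias correction step is left out.

```python
import math


def sigmoid(x):
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def fit_bradley_terry(n_docs, prefs, iters=2000, lr=0.5):
    """Fit one score per document from soft pairwise preferences.

    prefs is a list of (i, j, p) tuples, where p is the (assumed) soft
    probability that document i is more relevant than document j.
    The model is P(i beats j) = sigmoid(score_i - score_j), fitted by
    gradient ascent on the log-likelihood (hypothetical optimizer choice).
    """
    scores = [0.0] * n_docs
    for _ in range(iters):
        grad = [0.0] * n_docs
        for i, j, p in prefs:
            # Gradient of p*log(s) + (1-p)*log(1-s), s = sigmoid(si - sj).
            residual = p - sigmoid(scores[i] - scores[j])
            grad[i] += residual
            grad[j] -= residual
        scores = [s + lr * g for s, g in zip(scores, grad)]
    # Scores are only identified up to a constant shift, so center them.
    mean = sum(scores) / n_docs
    return [s - mean for s in scores]


# Toy example: three documents with made-up soft preferences.
prefs = [(0, 1, 0.9), (1, 2, 0.8), (0, 2, 0.95)]
scores = fit_bradley_terry(3, prefs)
ranking = sorted(range(3), key=lambda d: scores[d], reverse=True)
print(ranking, [round(s, 3) for s in scores])
```

The fitted scores give each document an absolute position on a shared scale for that query, which is what lets pairwise judgments be used as pointwise training targets. Scores from different queries are not directly comparable at this stage, which is presumably why the pipeline adds the cross-query normalization step.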
You can try the models either through our API (https://docs.zeroentropy.dev/models), or via HuggingFace (https://huggingface.co/zeroentropy/zerank-1-small).
We would love this community's feedback on the models and the training approach. A full technical report will also be released soon.
Thank you!