Latency Performance Assessment of zerank-2

Dec 9, 2025
TL;DR

zerank-2 delivers consistent, low-latency performance under realistic production conditions. In our testing, 97.3% of requests completed under 500ms with zero failures. This document presents our latency measurements and explains how to properly benchmark reranker performance.

Why Proper Latency Testing Matters

When evaluating reranker latency, it’s critical that your testing reflects actual production usage patterns. Real user traffic doesn’t arrive at uniform intervals. It comes in bursts and clusters. Testing with sequential requests or artificial patterns will give you misleading results that don’t predict real-world performance.

Our tests use Poisson arrival patterns because they model the random, bursty nature of production traffic. This approach reveals how systems behave under realistic load conditions, including queueing effects and concurrent request handling.
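To make the idea concrete, here is a minimal sketch of how Poisson arrivals can be generated: the gaps between consecutive requests are drawn from an exponential distribution, so some requests land almost on top of each other while others are far apart. The function name `poisson_arrival_times` and its parameters are illustrative assumptions, not part of any published test harness.

```python
import random

def poisson_arrival_times(rate_per_s, duration_s, seed=None):
    """Return sorted send times (seconds from start) for a Poisson process.

    Hypothetical helper for illustration; gaps between arrivals are
    exponentially distributed with mean 1 / rate_per_s.
    """
    rng = random.Random(seed)
    times = []
    t = rng.expovariate(rate_per_s)  # gap before the first request
    while t < duration_s:
        times.append(t)
        t += rng.expovariate(rate_per_s)  # exponential gap to the next request
    return times

arrivals = poisson_arrival_times(rate_per_s=5, duration_s=60, seed=42)
gaps = [b - a for a, b in zip(arrivals, arrivals[1:])]
print(f"{len(arrivals)} requests scheduled")
print(f"shortest gap {min(gaps) * 1000:.1f} ms, longest gap {max(gaps) * 1000:.1f} ms")
```

The spread between the shortest and longest gap is the burstiness that a fixed-interval or strictly sequential test never produces.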

Testing Methodology

All tests were conducted using:

  • Poisson arrival patterns at 1-10 requests/second
  • 60-second test duration
  • 50 documents per request
  • Payload size ≤2KB per document
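A test with these parameters can be sketched as an open-loop load generator: each request is fired at its scheduled arrival time whether or not earlier requests have returned, so queueing and concurrency show up in the measured latency. This is a sketch under stated assumptions, not the harness behind the numbers in this post. `fake_rerank` is a hypothetical stand-in for a real reranker client, and `run_open_loop` and `timed_call` are illustrative names.

```python
import asyncio
import random
import time

async def fake_rerank(query, documents):
    """Hypothetical stand-in for a reranker call; swap in a real client."""
    await asyncio.sleep(random.uniform(0.005, 0.02))
    return list(range(len(documents)))

async def timed_call(send, query, documents, results):
    """Record latency in ms for one request, or None if it raised."""
    start = time.perf_counter()
    try:
        await send(query, documents)
        results.append((time.perf_counter() - start) * 1000.0)
    except Exception:
        results.append(None)  # None marks a failed request

async def run_open_loop(send, rate_per_s, duration_s, docs_per_request=50, seed=None):
    """Send requests at Poisson arrival times without waiting for responses."""
    rng = random.Random(seed)
    documents = ["x" * 2048] * docs_per_request  # placeholder 2 KB payloads
    results, tasks = [], []
    loop_start = time.perf_counter()
    next_send = rng.expovariate(rate_per_s)
    while next_send < duration_s:
        delay = next_send - (time.perf_counter() - loop_start)
        if delay > 0:
            await asyncio.sleep(delay)
        # Fire without awaiting, so a slow response never delays later arrivals.
        tasks.append(asyncio.create_task(timed_call(send, "query", documents, results)))
        next_send += rng.expovariate(rate_per_s)
    await asyncio.gather(*tasks)
    return results

if __name__ == "__main__":
    latencies = asyncio.run(run_open_loop(fake_rerank, rate_per_s=5, duration_s=3, seed=1))
    print(f"{len(latencies)} requests, {latencies.count(None)} failed")
```

A closed-loop test, which waits for each response before sending the next request, caps concurrency at one and hides exactly the queueing effects this design is meant to expose.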

Performance Results

zerank-2 Latency Distribution

Latency Threshold | Requests Exceeding Threshold
>75ms             | 100.0%
>100ms            | 100.0%
>150ms            | 50.5%
>200ms            | 21.2%
>250ms            | 11.3%
>500ms            | 2.7%
>750ms            | 1.4%
>1s               | 0.9%
>3s               | 0.0%
>5s               | 0.0%
>10s              | 0.0%
>30s              | 0.0%
Failed            | 0.0%
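A table of this shape can be computed from raw per-request latencies by counting, for each threshold, the share of requests that took longer. The sketch below is illustrative only: `exceedance_table` is a hypothetical helper, the sample values are made up for the example, and counting a failed request as exceeding every threshold is an assumption, since this post does not state how failures were counted.

```python
def exceedance_table(latencies_ms, thresholds_ms):
    """Percent of all requests slower than each threshold, plus percent failed.

    latencies_ms holds one entry per request: latency in ms, or None on failure.
    Assumption: a failed request counts as exceeding every threshold.
    """
    total = len(latencies_ms)
    if total == 0:
        raise ValueError("no requests recorded")
    ok = [x for x in latencies_ms if x is not None]
    failed = total - len(ok)
    rows = {}
    for limit in thresholds_ms:
        slow = sum(1 for x in ok if x > limit)
        rows[f">{limit}ms"] = round(100.0 * (slow + failed) / total, 1)
    rows["Failed"] = round(100.0 * failed / total, 1)
    return rows

# Made-up sample: ten requests, one of which failed.
samples = [120, 160, 180, 240, 520, None, 90, 130, 300, 110]
table = exceedance_table(samples, [150, 500])
print(table)  # {'>150ms': 60.0, '>500ms': 20.0, 'Failed': 10.0}
```

Reporting exceedance at fixed thresholds, rather than a single average, keeps the slow tail visible.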

Comparative Performance

Threshold | zerank-2 | Cohere rerank-3.5 | Jina reranker m0 | Voyage rerank-2.5
>150ms    | 50.5%    | 34.3%             | 100.0%           | 80.5%
>500ms    | 2.7%     | 14.3%             | 70.8%            | 10.9%
>1s       | 0.9%     | 11.6%             | 57.4%            | 9.7%
>10s      | 0.0%     | 6.4%              | 55.7%            | 9.2%
Failed    | 0.0%     | 0.0%              | 55.7%            | 9.2%

Key Metrics

  • Zero failures across all test conditions
  • 97.3% of requests completed under 500ms
  • 99.1% of requests completed under 1 second
  • 100% of requests completed under 3 seconds
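These figures are the complements of the exceedance percentages in the zerank-2 latency distribution table: if 2.7% of requests took longer than 500ms, the remaining 97.3% did not. A short check of that arithmetic, with the dictionary names chosen here purely for illustration:

```python
# Percent of requests exceeding each threshold, from the distribution table.
exceeding = {"500ms": 2.7, "1s": 0.9, "3s": 0.0}

# The share that finished within each threshold is the complement.
completed_within = {limit: round(100.0 - pct, 1) for limit, pct in exceeding.items()}

for limit, pct in completed_within.items():
    print(f"within {limit}: {pct}%")
# within 500ms: 97.3%
# within 1s: 99.1%
# within 3s: 100.0%
```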

zerank-2 has a short latency tail: only 0.9% of requests took longer than 1 second, and none exceeded 3 seconds.
