The Models For Human-Level Search

State-of-the-art rerankers and embeddings for your retrieval stack.

Fully integrated into our agentic search engine for RAG and AI agents.

# Create an API Key at https://dashboard.zeroentropy.dev

from zeroentropy import ZeroEntropy

zclient = ZeroEntropy()

response = zclient.models.rerank(
    model="zerank-1",
    query="Which reranker is the fastest?",
    documents=[
        "Jina rerank-m0 • 300 ms",
        "Cohere rerank-3.5 • 100 ms",
        "ZeroEntropy zerank-1 • 60 ms",
    ],
)
print(response.model_dump_json(indent=4))

1K Developers
1M Queries
1M Documents
1B Tokens

Diagram showing unstructured data being transformed into embeddings, then stored in a vector database, and finally processed by a reranker to produce relevant results.
Reranker

zerank-1 is our state-of-the-art open-weight reranker. It outperforms models such as Cohere's rerank 3.5 and Jina's rerank m0, and it significantly boosts search accuracy with a single line of code.

Try it Now

Embedding

zembed-1 is our new embedding model that reduces vector database storage cost by up to 10x. Reach out to us for an early preview.

Get in Touch

E2E Search

With our end-to-end search engine, you can ship AI search that actually works in just a few lines of code, and focus on building your product.

Ship Now

Ship Search That Actually Works for Your Chatbot or Agent

# Create an API Key at https://dashboard.zeroentropy.dev

from zeroentropy import ZeroEntropy

zclient = ZeroEntropy()

response = zclient.models.rerank(
    model="zerank-1",
    query="Which reranker is the fastest?",
    documents=[
        "Jina rerank-m0 • 300 ms",
        "Cohere rerank-3.5 • 100 ms",
        "ZeroEntropy zerank-1 • 60 ms",
    ],
)
print(response.model_dump_json(indent=4))
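
Once scores come back, the usual next step is to reorder the candidate documents by score. The sketch below does not call the API. It assumes, as a guess about the response shape, that each result carries an `index` into the submitted `documents` list and a `relevance_score`. The helper `order_by_relevance` and the sample scores are hypothetical, so check the API reference for the actual field names.

```python
from typing import TypedDict


class RerankResult(TypedDict):
    # Assumed shape of one rerank result (hypothetical field names).
    index: int              # position of the document in the submitted list
    relevance_score: float  # higher means more relevant


def order_by_relevance(
    documents: list[str], results: list[RerankResult]
) -> list[tuple[str, float]]:
    """Return (document, score) pairs sorted from most to least relevant."""
    ranked = sorted(results, key=lambda r: r["relevance_score"], reverse=True)
    return [(documents[r["index"]], r["relevance_score"]) for r in ranked]


documents = [
    "Jina rerank-m0 • 300 ms",
    "Cohere rerank-3.5 • 100 ms",
    "ZeroEntropy zerank-1 • 60 ms",
]

# Made-up scores for illustration only; real scores come from the API.
sample_results: list[RerankResult] = [
    {"index": 0, "relevance_score": 0.21},
    {"index": 1, "relevance_score": 0.48},
    {"index": 2, "relevance_score": 0.93},
]

for doc, score in order_by_relevance(documents, sample_results):
    print(f"{score:.2f}  {doc}")
```

Keeping the index-to-document lookup in one small function means only that function changes if the real response uses different field names.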

Accuracy

Our models are open-weight and state-of-the-art. You can read the full benchmark on our blog.

Latency


[Chart: p50, p95, and p99 latency for the Reranker, the Retrieval API, and the Retrieval API + Reranker, at 12 kB and 150 kB payloads, cold and warm]

Cost

Our pricing is simple and transparent. You can learn more on our pricing page.

Seriously Secure

ZeroEntropy is built with enterprise-grade security at its core. From SOC 2 Type II compliance to HIPAA readiness, we protect your data with the highest standards — so you can focus on building, not worrying.

ZeroEntropy SOC2
ZeroEntropy HIPAA

Use Cases

Customer Support

Customer support companies such as MyAskAI have seen significant latency and accuracy improvements after switching to ZeroEntropy's reranker.

Legal

Companies in the legal industry rely on ZeroEntropy's search API and reranker to provide highly accurate responses in high-stakes use cases such as contract understanding and legal research.

Healthcare

Vera Health uses ZeroEntropy both for simple retrieval across millions of medical research papers and for Deep Research use cases through our MCP server.

Voice AI

Leaping AI and other Voice AI companies trust ZeroEntropy in production to connect their agents to large knowledge bases for hundreds of thousands of calls every day.

Put Your Retrieval on Autopilot Now

Work directly with the founders to shape the future of agentic retrieval.

Common Questions

1. What makes ZeroEntropy different from traditional search engines?

Traditional search uses static keyword or semantic matching. ZeroEntropy is optimized for retrieval quality out of the box — combining dense, sparse, and reranked relevance in a single API.

We treat every query as a learning opportunity:

  • You get state-of-the-art relevance, not a bag-of-words match.

  • You don’t need to tune BM25 weights, vector thresholds, or rerank configs — we handle that.

  • You don’t maintain an infra Frankenstein of vector DBs, LLMs, pipelines — we unify it.
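
This page does not say how the dense, sparse, and reranked signals are combined. As a generic illustration only, and not a description of ZeroEntropy's method, reciprocal rank fusion (RRF) is one common way to merge a keyword ranking with a vector ranking before a reranker sees the candidates. The document IDs and the constant `k = 60` are illustrative assumptions.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked lists of document IDs into one list.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. A higher fused score ranks first.
    """
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by score descending; break ties by ID for deterministic output.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical rankings from two retrievers over the same corpus.
sparse_ranking = ["doc_a", "doc_b", "doc_c"]  # e.g. a BM25-style keyword index
dense_ranking = ["doc_b", "doc_d", "doc_a"]   # e.g. a vector index

fused = reciprocal_rank_fusion([sparse_ranking, dense_ranking])
print(fused)
```

Documents that appear high in both lists rise to the top, and no score normalization between the two retrievers is needed because only ranks are used.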

2. Does ZeroEntropy handle PDF parsing and chunking?

This is the answer.

3. How does ZeroEntropy process the data I send? Can it be deployed on-premises?

We take security very seriously. ZeroEntropy is SOC 2 Type II and HIPAA compliant.
We also offer a fully managed EU-based instance to comply with regional data requirements.
For additional control, ZeroEntropy can be deployed on-premises.

4. Is there a free trial?

Yes. You can try our Starter plan free for two weeks, including 1,000 queries and 1M tokens of ingestion.

5. What is the query latency?

Here is a table summarizing latencies for both the search engine and reranker:


        Reranker           Retrieval API              Retrieval API + Reranker
        (75 kB payload)    (205 MB of UTF-8 bytes)
p50     129.7 ms           156.1 ms                   220.5 ms
p90     146.1 ms           181.4 ms                   253.1 ms
p99     193.9 ms           276.2 ms                   320.2 ms
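
To compare these figures against your own measurements, the sketch below computes p50, p90, and p99 from a list of latency samples using the nearest-rank method. The page does not state how its percentiles were calculated, so both the method and the sample values here are assumptions.

```python
import math


def percentile(samples: list[float], p: float) -> float:
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


# Hypothetical latency samples in milliseconds (not measured data).
latencies_ms = [float(v) for v in range(101, 201)]  # 101.0, 102.0, ..., 200.0

for p in (50, 90, 99):
    print(f"p{p}: {percentile(latencies_ms, p):.1f} ms")
```

Tail percentiles such as p99 need many samples to be stable, so collect at least a few hundred requests before comparing against the table.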

6. What kind of support is offered?

We offer standard support for Starter and Teams plans, and advanced white-glove onboarding and integration support for Enterprise clients.
You can also join our Slack community to get support.

7. Is it easy to integrate with my product?

Yes. Our developer-first documentation, API reference, and Slack community make integration seamless.

Get started with ZeroEntropy

Our retrieval engine runs autonomously with the accuracy of a human-curated system.

Contact us for a custom enterprise solution with custom pricing
