# ZeroEntropy

> ZeroEntropy builds state-of-the-art embedding and reranker models for information retrieval, and also offers an end-to-end search engine for RAG and AI Agents.

ZeroEntropy is a retrieval infrastructure platform that powers AI applications with accurate, fast, and cost-efficient search. It replaces fragmented stacks of vector databases, embedding models, and rerankers with an integrated, AI-native solution designed for RAG (Retrieval Augmented Generation) and modern LLM applications.
For detailed information about the ZeroEntropy API, see [docs.zeroentropy.dev/llms.txt](https://docs.zeroentropy.dev/llms.txt).

## Core Products

### zembed-1 — Text Embedding Model
- State-of-the-art 4B open-weight multilingual embedding model
- Outperforms OpenAI Large, Qwen3 4B, BGE-M3, Gemini Embeddings, Cohere v4, and Voyage-4 by up to 7% on Recall@100
- 32k token context window
- Flexible dimensionality (40 to 2056 dimensions) without retraining
- Quantization options (float32, int8, binary) for storage optimization
- Multilingual support with 50+ languages
- Trained using the zELO methodology (pairwise battles with Elo scores)
- Pricing: $0.05/MM tokens
- Available via ZeroEntropy API, HuggingFace, and AWS Marketplace
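
The dimensionality and quantization options listed above can be illustrated with a small local sketch. It assumes, without confirmation from this page, that shorter embeddings are obtained by keeping the leading components and re-normalizing (as is common for Matryoshka-style models), and that binary quantization keeps only the sign of each component. All function names are hypothetical and nothing here calls the ZeroEntropy API.

```python
import math


def truncate_embedding(vec, dims):
    """Keep the first `dims` components and L2-normalize the result.

    Assumption: leading components carry the most information, so plain
    truncation is a valid way to shrink the vector.
    """
    if not 0 < dims <= len(vec):
        raise ValueError("dims must be between 1 and len(vec)")
    head = vec[:dims]
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head] if norm > 0 else list(head)


def quantize_int8(vec):
    """Map components to integers in [-127, 127]; returns (ints, scale)."""
    peak = max(abs(x) for x in vec)
    scale = peak / 127 if peak > 0 else 1.0
    return [round(x / scale) for x in vec], scale


def quantize_binary(vec):
    """Keep only the sign of each component (1 if positive, else 0)."""
    return [1 if x > 0 else 0 for x in vec]


def storage_bytes(dims, fmt):
    """Bytes needed to store one vector of `dims` components in `fmt`."""
    bits = {"float32": 32, "int8": 8, "binary": 1}[fmt]
    return math.ceil(dims * bits / 8)


if __name__ == "__main__":
    # Made-up vector standing in for a real embedding.
    full = [0.12, -0.40, 0.33, 0.05, -0.27, 0.18, -0.02, 0.09]
    short = truncate_embedding(full, 4)
    print(short, quantize_binary(short))
    for fmt in ("float32", "int8", "binary"):
        print(fmt, storage_bytes(1024, fmt), "bytes per 1024-dim vector")
```

Under these assumptions, moving a 1024-dimension vector from float32 to binary cuts storage from 4096 bytes to 128 bytes; the accuracy cost of doing so has to be measured on real data.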

### zerank-2 — Reranking Model
- World's most accurate reranker with native instruction-following
- Supports appending instructions, business context, and user-specific memories to influence reranking results
- True multilingual parity across 100+ languages, including challenging scripts and code-switching queries
- Calibrated relevance scores: a score of 0.8 consistently means ~80% relevance
- SQL-style query handling for aggregation and structured-like requests
- 97.3% of requests completed under 500ms; 99.1% under 1 second; zero failures
- Pricing: $0.025/MM tokens
- Available via ZeroEntropy API and HuggingFace
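
Calibrated scores make a fixed cutoff meaningful across queries. The sketch below assumes the reranker output has already been collected as `(document, score)` pairs with scores in [0, 1], and it reads a score as an approximate probability of relevance, which is an interpretation of the calibration claim above rather than a documented guarantee. The function names and sample data are hypothetical; no API call is made.

```python
def select_relevant(scored_docs, threshold=0.5, max_docs=None):
    """Return (doc, score) pairs at or above `threshold`, best first."""
    ranked = sorted(scored_docs, key=lambda pair: pair[1], reverse=True)
    kept = [(doc, score) for doc, score in ranked if score >= threshold]
    return kept if max_docs is None else kept[:max_docs]


def expected_relevant_count(scored_docs):
    """Sum of scores: the expected number of relevant documents,
    assuming each score is a well-calibrated relevance probability."""
    return sum(score for _, score in scored_docs)


if __name__ == "__main__":
    # Made-up reranker output for illustration only.
    results = [("refund policy", 0.91), ("shipping times", 0.18),
               ("return window", 0.77), ("careers page", 0.03)]
    for doc, score in select_relevant(results, threshold=0.5):
        print(f"{score:.2f}  {doc}")
    print("expected relevant:", round(expected_relevant_count(results), 2))
```

A threshold keeps low-relevance passages out of the LLM context even when a query has few good matches, which a plain top-k cut cannot do.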

### zsearch — Managed Search API
- End-to-end managed retrieval infrastructure combining embedding, reranking, and query generation in a single API
- Automatic handling of BM25 weights, vector thresholds, and rerank configs
- Built-in PDF parsing, chunking, and OCR
- Tiers: Starter ($50/mo), Pro ($500/mo), Enterprise (custom)
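
For the per-token prices listed for zembed-1 and zerank-2, usage cost is a direct multiplication. The sketch below uses only the list prices quoted on this page; the workload figures are invented for illustration, and the helper name is hypothetical.

```python
# List prices from this page, in USD per million tokens.
PRICE_PER_MILLION_TOKENS_USD = {"zembed-1": 0.05, "zerank-2": 0.025}


def token_cost_usd(model, tokens):
    """Cost in USD of processing `tokens` tokens with `model`."""
    return PRICE_PER_MILLION_TOKENS_USD[model] * tokens / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: embed a 200M-token corpus once, then rerank
    # 40M tokens of candidate passages per month.
    print("embedding:", token_cost_usd("zembed-1", 200_000_000))
    print("reranking per month:", token_cost_usd("zerank-2", 40_000_000))
```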

## Key Capabilities
- **Accuracy**: Near-perfect relevance, consistently outperforming OpenAI, Gemini, Jina, Voyage, and Cohere on retrieval benchmarks
- **Latency**: Purpose-built inference infrastructure with ultra-low latency (p50: 129.7ms reranker, 220.5ms full pipeline)
- **Cost Efficiency**: 2.8x cost reduction, driven by better relevance and fewer wasted tokens in the LLM context

## Deployment Options
- Cloud-based API with Python and TypeScript SDKs
- VPC Deployment on AWS Marketplace and Azure
- On-premises deployment for enterprise customers
- White-label fine-tuning and custom models

## Security & Compliance
- SOC 2 Type II certified
- GDPR, HIPAA, CCPA compliant
- Data residency controls and right-to-deletion
- EU-based instance options
- 99.99% SLA on Enterprise plans

## Industries Served
- Legal: Document retrieval and case research
- Healthcare: Medical research and clinical documentation
- Finance: Financial analysis and research
- Manufacturing: Operational knowledge retrieval
- Customer Support: Intelligent ticket handling
- E-Commerce: Product search and recommendations

## Docs

- [Home](https://www.zeroentropy.dev/): Overview of ZeroEntropy's embedding and reranker models and its search engine.
- [Pricing](https://www.zeroentropy.dev/pricing): Pricing plans for zembed-1, zerank-2, and zsearch API.
- [Docs](https://docs.zeroentropy.dev): API documentation and SDK references.
- [Dashboard](https://dashboard.zeroentropy.dev): Developer dashboard for managing API keys and usage.
- [Docs llms.txt](https://docs.zeroentropy.dev/llms.txt): LLM-readable documentation for the ZeroEntropy API and SDKs.
- [Dashboard llms.txt](https://dashboard.zeroentropy.dev/llms.txt): LLM-readable documentation for the ZeroEntropy developer dashboard.

### Product Launches

- [Introducing zembed-1: The World's Best Multilingual Text Embedding Model](https://www.zeroentropy.dev/articles/introducing-zembed-1-the-worlds-best-multilingual-text-embedding-model): 4B open-weight embedding model with 32k context, flexible dimensionality, multilingual support, and SOTA accuracy.
- [zerank-2: Advanced Instruction-Following Multilingual Reranker](https://www.zeroentropy.dev/articles/zerank-2-advanced-instruction-following-multilingual-reranker): Instruction-following reranker with multilingual parity, calibrated scores, and SQL-style query handling.
- [Announcing ZeroEntropy's First Rerankers: zerank-1 and zerank-1-small](https://www.zeroentropy.dev/articles/announcing-zeroentropy-s-first-rerankers-zerank-1-and-zerank-1-small): First reranker models with up to 28% higher NDCG@10 vs competitors.

### Benchmarks and Evaluations

- [zembed-1 vs Voyage-4](https://www.zeroentropy.dev/articles/zembed-1-overperforms-voyage-4): Head-to-head showing zembed-1 wins 18/22 datasets, 70x better latency, 5x better noise robustness.
- [Latency Performance Assessment of zerank-2](https://www.zeroentropy.dev/articles/latency-performance-assessment-of-zerank-2): 97.3% of requests under 500ms, zero failures under realistic traffic patterns.
- [Beyond Binary: A New Version of the MTEB](https://www.zeroentropy.dev/articles/mteb-evals): Re-annotation of 24 MTEB datasets with graded relevance, evaluating 12 embedding models and 8 rerankers.
- [LegalBench RAG: The First Open-Source Retrieval Benchmark for the Legal Domain](https://www.zeroentropy.dev/articles/legalbench-rag-the-first-open-source-retrieval-benchmark-for-the-legal-domain): 6,858 query-answer pairs across 79M+ characters of legal documents.

### Technical Deep Dives

- [Deep Dive: The Architecture of ZeroEntropy v1](https://www.zeroentropy.dev/articles/deep-dive-the-architecture-of-zeroentropy-v1): Full-stack architecture covering ingestion, chunking, hybrid retrieval, and reranking.
- [Improving Retrieval with ELO Scores](https://www.zeroentropy.dev/articles/improving-retrieval-with-elo-scores): Novel training methodology using chess-inspired Elo scores instead of binary annotations.
- [Paper TLDR: How We Trained zerank-1 with the zELO Method](https://www.zeroentropy.dev/articles/paper-tldr-how-we-trained-zerank-1-with-the-zelo-method): Executive summary of the zELO paper covering pairwise LLM comparisons and Elo modeling.
- [On the Geometric Limit of Dense Single-Vector Embeddings](https://www.zeroentropy.dev/articles/on-the-geometric-limit-of-dense-single-vector-embeddings): Why fixed-dimension embeddings have fundamental limits and two-stage pipelines are necessary.
- [LlamaChunk: A General and Cost-Efficient Approach to Semantic Chunking](https://www.zeroentropy.dev/articles/llamachunk-a-general-and-cost-efficient-approach-to-semantic-chunking): Chunking algorithm using Llama-70B logprobs for semantic document splitting.
- [Should You Use LLMs for Reranking?](https://www.zeroentropy.dev/articles/should-you-use-llms-for-reranking-a-deep-dive-into-pointwise-listwise-and-cross-encoders): Comparison of pointwise, listwise, and cross-encoder reranking with cost-benefit analysis.
- ["Let's eat, grandma" vs "let's eat grandma": How Embedding Models Encode the World](https://www.zeroentropy.dev/articles/how-to-overcome-poor-search-results-with-the-right-embedding-solution): Research into whether embedding models learn general rules or memorize specific examples.

### Guides and Best Practices

- [Prompting Best Practices for Instruction-Following Rerankers](https://www.zeroentropy.dev/articles/prompting-best-practices-for-instruction-following-rerankers): How to use zerank-2's instruction-following with meta instructions, business context, and multi-tenant templates.
- [Open-Source Alternatives to Cohere Rerank in 2026](https://www.zeroentropy.dev/articles/open-source-alternatives-to-cohere-rerank): Comparison of open-weight rerankers including zerank, BGE, Jina, Mixedbread, ColBERT, FlashRank.
- [The Latency Myth: Why Reranking Is Still the Smartest Optimization](https://www.zeroentropy.dev/articles/the-latency-myth-why-reranking-is-still-the-smartest-optimization-you-can-make): Why reranking reduces end-to-end latency despite adding a pipeline step.
- [What Is a Reranker and Do I Need One?](https://www.zeroentropy.dev/articles/what-is-a-reranker-and-do-i-need-one): Introduction to rerankers, when they help, and how they differ from embeddings.
- [AGI Requires Better Retrieval, Not Just Better LLMs](https://www.zeroentropy.dev/articles/agi-requires-better-retrieval-not-just-better-llms): Why intelligent retrieval is the bottleneck for AGI, with concrete failure mode examples.
- [Implementing ZeroEntropy Reranking with Turbopuffer Retrieval](https://www.zeroentropy.dev/articles/implementing-zeroentropy-reranking-with-turbopuffer-retrieval): Tutorial for building a two-stage search pipeline with turbopuffer and ZeroEntropy.

### Customer Stories

- [How Vera Health Achieved 97.5% USMLE Accuracy Using ZeroEntropy](https://www.zeroentropy.dev/articles/how-vera-health-achieved-state-of-the-art-clinical-accuracy-using-zeroentropy): Medical QA across 60M+ peer-reviewed papers using agentic multi-hop retrieval.
- [How Assembled Powers High-Quality AI Customer Support with ZeroEntropy](https://www.zeroentropy.dev/articles/how-assembled-powers-high-quality-ai-customer-support-with-zeroentropy): zerank-1 integration with production traffic evaluation and full migration.
- [Mem0 Improves Memory Retrieval Accuracy with ZeroEntropy](https://www.zeroentropy.dev/articles/mem0-improves-memory-retrieval-accuracy-with-zeroentropy): 1B tokens/day at p50 75ms latency with calibrated cross-vertical scoring.
- [Equall Improves Legal Document Retrieval Accuracy with ZeroEntropy](https://www.zeroentropy.dev/articles/equall-improves-legal-document-structuring-and-retrieval-accuracy-with-zeroentropy): Legal document extraction pipeline with zerank-1.
- [My AskAI Improves Chatbot Latency and Accuracy with ZeroEntropy](https://www.zeroentropy.dev/articles/my-askai-improves-chatbot-latency-and-accuracy-with-zeroentropy): A/B tested migration with 3% accuracy gain and 25% cost reduction.

### Articles

- [Bi-Encoders vs Cross-Encoders](https://www.zeroentropy.dev/articles/biencoder-vs-crossencoder): Architectural differences between embedding models and rerankers, and why production search uses both.
- [Should You Use an LLM as a Reranker?](https://www.zeroentropy.dev/articles/llm-as-reranker-guide): Cost-benefit analysis of pointwise and listwise LLM reranking vs dedicated cross-encoders.
- [How to Do RAG with Mastra and ZeroEntropy](https://www.zeroentropy.dev/articles/rag-with-mastra-and-zeroentropy): Implementation guide for building RAG with Mastra framework and zerank-1 reranking.
- [Why Evaluation Metrics for Reranking Matter](https://www.zeroentropy.dev/articles/reranker-evaluation-metrics): Guide to Precision@K, Recall@K, MRR, NDCG with Python implementations.
- [The Geometric Limitations of Vector Embeddings](https://www.zeroentropy.dev/articles/the-geometric-limitations-of-vector-embeddings-in-retrieval-systems): Mathematical impossibility results for fixed-dimension embeddings from Google DeepMind.
- [2-Norm Vector](https://www.zeroentropy.dev/articles/2-norm-vector): Mathematical foundations of L2 norm in vector similarity calculations.
- [Best Embedding Model for Legal Document Search in 2026](https://www.zeroentropy.dev/articles/best-embedding-model-for-legal-document-search-in-2026): Why legal retrieval needs specialized embeddings and how zembed-1 handles domain vocabulary.
- [Ultimate Guide to Choosing the Best Reranking Model in 2026](https://www.zeroentropy.dev/articles/ultimate-guide-to-choosing-the-best-reranking-model-in-2025): Comprehensive reranker selection guide with cost/latency planning and provider comparisons.
- [2026's Top 10 Embedding Companies Powering Search Technology](https://www.zeroentropy.dev/articles/2026-s-top-10-embedding-companies-powering-search-technology): Market comparison of embedding providers across quality, pricing, and deployment options.
- [ZeroEntropy zerank-1 vs Jina AI jina-reranker-m0](https://www.zeroentropy.dev/articles/zeroentropy-versus-jina-ai-reranker): Head-to-head benchmark showing zerank-1 is 12x faster and 2x cheaper.
- [Latency Benchmark: Cohere rerank 3.5 vs zerank-1](https://www.zeroentropy.dev/articles/lightning-fast-reranking-with-zerank-1): Speed comparison showing zerank-1 is 12-31% faster than Cohere.
- [Best Reranker for Healthcare AI](https://www.zeroentropy.dev/articles/best-reranker-healthcare): Reranking for clinical search, guideline navigation, and patient history retrieval.
- [Best Reranker for Legal Document Search](https://www.zeroentropy.dev/articles/best-reranker-legal): Reranking for contract search, clause analytics, and legal research.
- [Best Reranker for HR Knowledge Bases](https://www.zeroentropy.dev/articles/best-reranker-hr): Reranking for HR policy lookup and recruiting workflows.

## Optional

- [Blog](https://www.zeroentropy.dev/blog): All articles and updates from ZeroEntropy.
- [Research Blog](https://www.zeroentropy.dev/blog): Research-focused posts and technical deep dives.
- [News](https://www.zeroentropy.dev/news): Latest news and announcements.
- [Evals](https://www.zeroentropy.dev/evals): Interactive benchmark comparisons.
- [Industries — Legal](https://www.zeroentropy.dev/industries/legal): Legal industry solutions.
- [Industries — Healthcare](https://www.zeroentropy.dev/industries/healthcare): Healthcare industry solutions.
- [Industries — Finance](https://www.zeroentropy.dev/industries/finance): Finance industry solutions.
- [Industries — Manufacturing](https://www.zeroentropy.dev/industries/manufacturing): Manufacturing industry solutions.
- [Industries — Customer Support](https://www.zeroentropy.dev/industries/customer-support): Customer support solutions.
- [Industries — E-Commerce](https://www.zeroentropy.dev/industries/ecommerce): E-commerce solutions.
- [Context Engineering Webinar: Everything You Missed](https://www.zeroentropy.dev/articles/context-engineering-webinar-everything-you-missed): Webinar recap on RAG vs Agent loops and hybrid retrieval.
- [ZeroEntropy Raises $4.2M Seed Round](https://www.zeroentropy.dev/articles/zeroentropy-raises-4-2m-seed-round-to-make-ai-retrieval-truly-intelligent): Seed funding announcement.