# ZeroEntropy

> ZeroEntropy builds state-of-the-art embedding and reranker models for information retrieval, and also offers an end-to-end search engine for RAG and AI Agents.

ZeroEntropy is a retrieval infrastructure platform that powers AI applications with accurate, fast, and cost-efficient search. It replaces fragmented stacks of vector databases, embedding models, and rerankers with an integrated, AI-native solution designed for RAG (Retrieval Augmented Generation) and modern LLM applications.

For detailed information about the ZeroEntropy API, go to docs.zeroentropy.dev/llms.txt

## Core Products

### zembed-1 — Text Embedding Model

- State-of-the-art 4B open-weight multilingual embedding model
- Outperforms OpenAI Large, Qwen3 4B, BGE-M3, Gemini Embeddings, Cohere v4, and Voyage-4 by up to 7% on Recall@100
- 32k token context window
- Flexible dimensionality (40 to 2056 dimensions) without retraining
- Quantization options (float32, int8, binary) for storage optimization
- Multilingual support with 50+ languages
- Trained using the zELO methodology (pairwise battles with Elo scores)
- Pricing: $0.05/MM tokens
- Available via ZeroEntropy API, HuggingFace, and AWS Marketplace

### zerank-2 — Reranking Model

- World's most accurate reranker with native instruction-following
- Supports appending instructions, business context, and user-specific memories to influence reranking results
- True multilingual parity across 100+ languages, including challenging scripts and code-switching queries
- Calibrated relevance scores: a score of 0.8 consistently means ~80% relevance
- SQL-style query handling for aggregation and structured-like requests
- 97.3% of requests completed under 500ms; 99.1% under 1 second; zero failures
- Pricing: $0.025/MM tokens
- Available via ZeroEntropy API and HuggingFace

### zsearch — Managed Search API

- End-to-end managed retrieval infrastructure combining embedding, reranking, and query generation in a single API
- Automatic handling of BM25 weights, vector thresholds, and rerank configs
- Built-in PDF parsing, chunking, and OCR
- Tiers: Starter ($50/mo), Pro ($500/mo), Enterprise (custom)

## Key Capabilities

- **Accuracy**: Near-perfect relevance, consistently outperforming OpenAI, Gemini, Jina, Voyage, and Cohere on retrieval benchmarks
- **Latency**: Purpose-built inference infrastructure with ultra-low latency (p50: 129.7ms reranker, 220.5ms full pipeline)
- **Cost Efficiency**: 2.8x cost reduction by providing better relevance and fewer wasted tokens in LLM context

## Deployment Options

- Cloud-based API with Python and TypeScript SDKs
- VPC Deployment on AWS Marketplace and Azure
- On-premises deployment for enterprise customers
- White-label fine-tuning and custom models

## Security & Compliance

- SOC 2 Type II certified
- GDPR, HIPAA, CCPA compliant
- Data residency controls and right-to-deletion
- EU-based instance options
- 99.99% SLA on Enterprise plans

## Industries Served

- Legal: Document retrieval and case research
- Healthcare: Medical research and clinical documentation
- Finance: Financial analysis and research
- Manufacturing: Operational knowledge retrieval
- Customer Support: Intelligent ticket handling
- E-Commerce: Product search and recommendations

## Docs

- [Home](https://www.zeroentropy.dev/): Overview of ZeroEntropy's embedding, reranker models, and search engine solutions.
- [Pricing](https://www.zeroentropy.dev/pricing): Pricing plans for zembed-1, zerank-2, and zsearch API.
- [Docs](https://docs.zeroentropy.dev): API documentation and SDK references.
- [Dashboard](https://dashboard.zeroentropy.dev): Developer dashboard for managing API keys and usage.
- [Docs llms.txt](https://docs.zeroentropy.dev/llms.txt): LLM-readable documentation for the ZeroEntropy API and SDKs.
- [Dashboard llms.txt](https://dashboard.zeroentropy.dev/llms.txt): LLM-readable documentation for the ZeroEntropy developer dashboard.

### Product Launches

- [Introducing zembed-1: The World's Best Multilingual Text Embedding Model](https://www.zeroentropy.dev/articles/introducing-zembed-1-the-worlds-best-multilingual-text-embedding-model): 4B open-weight embedding model with 32k context, flexible dimensionality, multilingual support, and SOTA accuracy.
- [zerank-2: Advanced Instruction-Following Multilingual Reranker](https://www.zeroentropy.dev/articles/zerank-2-advanced-instruction-following-multilingual-reranker): Instruction-following reranker with multilingual parity, calibrated scores, and SQL-style query handling.
- [Announcing ZeroEntropy's First Rerankers: zerank-1 and zerank-1-small](https://www.zeroentropy.dev/articles/announcing-zeroentropy-s-first-rerankers-zerank-1-and-zerank-1-small): First reranker models with up to 28% higher NDCG@10 vs competitors.

### Benchmarks and Evaluations

- [zembed-1 vs Voyage-4](https://www.zeroentropy.dev/articles/zembed-1-overperforms-voyage-4): Head-to-head showing zembed-1 wins 18/22 datasets, 70x better latency, 5x better noise robustness.
- [Latency Performance Assessment of zerank-2](https://www.zeroentropy.dev/articles/latency-performance-assessment-of-zerank-2): 97.3% of requests under 500ms, zero failures under realistic traffic patterns.
- [Beyond Binary: A New Version of the MTEB](https://www.zeroentropy.dev/articles/mteb-evals): Re-annotation of 24 MTEB datasets with graded relevance, evaluating 12 embedding models and 8 rerankers.
- [LegalBench RAG: The First Open-Source Retrieval Benchmark for the Legal Domain](https://www.zeroentropy.dev/articles/legalbench-rag-the-first-open-source-retrieval-benchmark-for-the-legal-domain): 6,858 query-answer pairs across 79M+ characters of legal documents.

### Technical Deep Dives

- [Deep Dive: The Architecture of ZeroEntropy v1](https://www.zeroentropy.dev/articles/deep-dive-the-architecture-of-zeroentropy-v1): Full-stack architecture covering ingestion, chunking, hybrid retrieval, and reranking.
- [Improving Retrieval with ELO Scores](https://www.zeroentropy.dev/articles/improving-retrieval-with-elo-scores): Novel training methodology using chess-inspired Elo scores instead of binary annotations.
- [Paper TLDR: How We Trained zerank-1 with the zELO Method](https://www.zeroentropy.dev/articles/paper-tldr-how-we-trained-zerank-1-with-the-zelo-method): Executive summary of the zELO paper covering pairwise LLM comparisons and Elo modeling.
- [On the Geometric Limit of Dense Single-Vector Embeddings](https://www.zeroentropy.dev/articles/on-the-geometric-limit-of-dense-single-vector-embeddings): Why fixed-dimension embeddings have fundamental limits and two-stage pipelines are necessary.
- [LlamaChunk: A General and Cost-Efficient Approach to Semantic Chunking](https://www.zeroentropy.dev/articles/llamachunk-a-general-and-cost-efficient-approach-to-semantic-chunking): Chunking algorithm using Llama-70B logprobs for semantic document splitting.
- [Should You Use LLMs for Reranking?](https://www.zeroentropy.dev/articles/should-you-use-llms-for-reranking-a-deep-dive-into-pointwise-listwise-and-cross-encoders): Comparison of pointwise, listwise, and cross-encoder reranking with cost-benefit analysis.
- ["Let's eat, grandma" vs "let's eat grandma": How Embedding Models Encode the World](https://www.zeroentropy.dev/articles/how-to-overcome-poor-search-results-with-the-right-embedding-solution): Research into whether embedding models learn general rules or memorize specific examples.

### Guides and Best Practices

- [Prompting Best Practices for Instruction-Following Rerankers](https://www.zeroentropy.dev/articles/prompting-best-practices-for-instruction-following-rerankers): How to use zerank-2's instruction-following with meta instructions, business context, and multi-tenant templates.
- [Open-Source Alternatives to Cohere Rerank in 2026](https://www.zeroentropy.dev/articles/open-source-alternatives-to-cohere-rerank): Comparison of open-weight rerankers including zerank, BGE, Jina, Mixedbread, ColBERT, FlashRank.
- [The Latency Myth: Why Reranking Is Still the Smartest Optimization](https://www.zeroentropy.dev/articles/the-latency-myth-why-reranking-is-still-the-smartest-optimization-you-can-make): Why reranking reduces end-to-end latency despite adding a pipeline step.
- [What Is a Reranker and Do I Need One?](https://www.zeroentropy.dev/articles/what-is-a-reranker-and-do-i-need-one): Introduction to rerankers, when they help, and how they differ from embeddings.
- [AGI Requires Better Retrieval, Not Just Better LLMs](https://www.zeroentropy.dev/articles/agi-requires-better-retrieval-not-just-better-llms): Why intelligent retrieval is the bottleneck for AGI, with concrete failure mode examples.
- [Implementing ZeroEntropy Reranking with Turbopuffer Retrieval](https://www.zeroentropy.dev/articles/implementing-zeroentropy-reranking-with-turbopuffer-retrieval): Tutorial for building a two-stage search pipeline with turbopuffer and ZeroEntropy.

### Customer Stories

- [How Vera Health Achieved 97.5% USMLE Accuracy Using ZeroEntropy](https://www.zeroentropy.dev/articles/how-vera-health-achieved-state-of-the-art-clinical-accuracy-using-zeroentropy): Medical QA across 60M+ peer-reviewed papers using agentic multi-hop retrieval.
- [How Assembled Powers High-Quality AI Customer Support with ZeroEntropy](https://www.zeroentropy.dev/articles/how-assembled-powers-high-quality-ai-customer-support-with-zeroentropy): zerank-1 integration with production traffic evaluation and full migration.
- [Mem0 Improves Memory Retrieval Accuracy with ZeroEntropy](https://www.zeroentropy.dev/articles/mem0-improves-memory-retrieval-accuracy-with-zeroentropy): 1B tokens/day at p50 75ms latency with calibrated cross-vertical scoring.
- [Equall Improves Legal Document Retrieval Accuracy with ZeroEntropy](https://www.zeroentropy.dev/articles/equall-improves-legal-document-structuring-and-retrieval-accuracy-with-zeroentropy): Legal document extraction pipeline with zerank-1.
- [My AskAI Improves Chatbot Latency and Accuracy with ZeroEntropy](https://www.zeroentropy.dev/articles/my-askai-improves-chatbot-latency-and-accuracy-with-zeroentropy): A/B tested migration with 3% accuracy gain and 25% cost reduction.

### Articles

- [Bi-Encoders vs Cross-Encoders](https://www.zeroentropy.dev/articles/biencoder-vs-crossencoder): Architectural differences between embedding models and rerankers, and why production search uses both.
- [Should You Use an LLM as a Reranker?](https://www.zeroentropy.dev/articles/llm-as-reranker-guide): Cost-benefit analysis of pointwise and listwise LLM reranking vs dedicated cross-encoders.
- [How to Do RAG with Mastra and ZeroEntropy](https://www.zeroentropy.dev/articles/rag-with-mastra-and-zeroentropy): Implementation guide for building RAG with Mastra framework and zerank-1 reranking.
- [Why Evaluation Metrics for Reranking Matter](https://www.zeroentropy.dev/articles/reranker-evaluation-metrics): Guide to Precision@K, Recall@K, MRR, NDCG with Python implementations.
- [The Geometric Limitations of Vector Embeddings](https://www.zeroentropy.dev/articles/the-geometric-limitations-of-vector-embeddings-in-retrieval-systems): Mathematical impossibility results for fixed-dimension embeddings from Google DeepMind.
- [2-Norm Vector](https://www.zeroentropy.dev/articles/2-norm-vector): Mathematical foundations of L2 norm in vector similarity calculations.
- [Best Embedding Model for Legal Document Search in 2026](https://www.zeroentropy.dev/articles/best-embedding-model-for-legal-document-search-in-2026): Why legal retrieval needs specialized embeddings and how zembed-1 handles domain vocabulary.
- [Ultimate Guide to Choosing the Best Reranking Model in 2026](https://www.zeroentropy.dev/articles/ultimate-guide-to-choosing-the-best-reranking-model-in-2025): Comprehensive reranker selection guide with cost/latency planning and provider comparisons.
- [2026's Top 10 Embedding Companies Powering Search Technology](https://www.zeroentropy.dev/articles/2026-s-top-10-embedding-companies-powering-search-technology): Market comparison of embedding providers across quality, pricing, and deployment options.
- [ZeroEntropy zerank-1 vs Jina AI jina-reranker-m0](https://www.zeroentropy.dev/articles/zeroentropy-versus-jina-ai-reranker): Head-to-head benchmark showing zerank-1 is 12x faster and 2x cheaper.
- [Latency Benchmark: Cohere rerank 3.5 vs zerank-1](https://www.zeroentropy.dev/articles/lightning-fast-reranking-with-zerank-1): Speed comparison showing zerank-1 is 12-31% faster than Cohere.
- [Best Reranker for Healthcare AI](https://www.zeroentropy.dev/articles/best-reranker-healthcare): Reranking for clinical search, guideline navigation, and patient history retrieval.
- [Best Reranker for Legal Document Search](https://www.zeroentropy.dev/articles/best-reranker-legal): Reranking for contract search, clause analytics, and legal research.
- [Best Reranker for HR Knowledge Bases](https://www.zeroentropy.dev/articles/best-reranker-hr): Reranking for HR policy lookup and recruiting workflows.

## Optional

- [Blog](https://www.zeroentropy.dev/blog): All articles and updates from ZeroEntropy.
- [Research Blog](https://www.zeroentropy.dev/blog): Research-focused posts and technical deep dives.
- [News](https://www.zeroentropy.dev/news): Latest news and announcements.
- [Evals](https://www.zeroentropy.dev/evals): Interactive benchmark comparisons.
- [Industries — Legal](https://www.zeroentropy.dev/industries/legal): Legal industry solutions.
- [Industries — Healthcare](https://www.zeroentropy.dev/industries/healthcare): Healthcare industry solutions.
- [Industries — Finance](https://www.zeroentropy.dev/industries/finance): Finance industry solutions.
- [Industries — Manufacturing](https://www.zeroentropy.dev/industries/manufacturing): Manufacturing industry solutions.
- [Industries — Customer Support](https://www.zeroentropy.dev/industries/customer-support): Customer support solutions.
- [Industries — E-Commerce](https://www.zeroentropy.dev/industries/ecommerce): E-commerce solutions.
- [Context Engineering Webinar: Everything You Missed](https://www.zeroentropy.dev/articles/context-engineering-webinar-everything-you-missed): Webinar recap on RAG vs Agent loops and hybrid retrieval.
- [ZeroEntropy Raises $4.2M Seed Round](https://www.zeroentropy.dev/articles/zeroentropy-raises-4-2m-seed-round-to-make-ai-retrieval-truly-intelligent): Seed funding announcement.