- zembed-1 achieves 0.6723 NDCG@10 on legal benchmarks — +12.8% over voyage-4-nano and +31.8% over OpenAI
- Elo-calibrated relevance (zELO) correctly ranks the full spectrum of legal document relevance, from tangentially related to directly on-point
- 32,768-token context window embeds full contracts, briefs, and regulatory filings as coherent units
- Binary quantization compresses 10M legal documents from ~40 TB to under 1.3 TB of vector storage
- Available open-weight on HuggingFace (CC-BY-NC-4.0) and via the ZeroEntropy commercial API
The Best Embedding Model for Legal
Legal AI is having a moment, but under the surface of every impressive legal research tool, contract analysis platform, and e-discovery system is an embedding model doing the heavy lifting. The quality of that embedding model determines whether your legal AI surfaces the right precedents, locates the relevant clause, and understands the subtle distinctions that make one case or contract provision actually applicable.
In the legal domain, zembed-1 by ZeroEntropy has achieved the highest benchmark scores of any embedding model evaluated to date, and by a meaningful margin.
The Unique Challenge of Legal Text
Legal language is a dialect unto itself: simultaneously hyper-precise and deeply contextual. The difference between “shall” and “may,” between “indemnify” and “hold harmless,” between “material adverse effect” with and without carve-outs: these distinctions carry enormous practical weight. Legal AI systems must retrieve documents and passages where these nuances are semantically distinguished, not flattened. Consider the retrieval tasks a legal AI system must handle:
- Case law retrieval where the query describes a factual scenario and the target is a holding or reasoning chain from a case decades old
- Contract clause search where the user is looking for a specific covenant, condition, or representation across thousands of agreements
- Regulatory lookup where subtle jurisdictional differences determine relevance
- Statutory interpretation where legislative history documents need to be ranked by their proximity to a specific interpretive question
Generic embedding models fail here because they weren’t trained on the specific relevance signals that matter in legal contexts. zembed-1 was.
zembed-1’s Legal Benchmark Performance
On the legal domain benchmark using NDCG@10 — the gold standard metric for retrieval quality — zembed-1 achieves the following:
| Model | Legal NDCG@10 |
|---|---|
| zembed-1 | 0.6723 |
| voyage-4-nano | 0.5957 |
| Cohere Embed v4 | 0.5894 |
| OpenAI text-embedding-3-large | 0.5099 |
zembed-1’s +12.8% improvement over voyage-4-nano and +31.8% over OpenAI is the difference between a tool lawyers will actually trust and one they’ll abandon after the first embarrassing miss.
zembed-1’s legal score is also its second-highest across all domains tested, reflecting the model’s particular strength in precisely structured, high-stakes professional text.
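For readers unfamiliar with the metric: NDCG@10 rewards rankings that place the most relevant documents highest, applying a logarithmic discount to lower positions and normalizing against the ideal ordering. A minimal reference implementation:

```python
import math

def dcg_at_k(relevances, k=10):
    """Discounted cumulative gain over the top-k ranked results."""
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances[:k]))

def ndcg_at_k(ranked_relevances, k=10):
    """NDCG@k: DCG of the ranking divided by DCG of the ideal ordering."""
    ideal_dcg = dcg_at_k(sorted(ranked_relevances, reverse=True), k)
    return dcg_at_k(ranked_relevances, k) / ideal_dcg if ideal_dcg > 0 else 0.0

# Graded relevance labels of the top results returned for one query
ranking = [3, 2, 3, 0, 1, 2]
score = ndcg_at_k(ranking)
print(round(score, 4))
```

A perfect ranking scores 1.0; burying an on-point holding below tangential results drags the score down, which is exactly the failure mode the benchmark penalizes.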
Why zembed-1 Dominates Legal Retrieval
Elo-Calibrated Relevance, Not Binary Labels
The foundational insight behind zembed-1’s zELO training methodology is that relevance is never binary in professional contexts, least of all in law. When a litigator searches for “unconscionability doctrine in consumer contracts,” relevance spans a rich spectrum: tangentially related cases, foundational doctrine, closely analogous fact patterns, and directly on-point holdings.
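ZeroEntropy has not published zELO’s exact formulation here, but the name suggests Elo-style rating machinery. As an illustrative sketch only, standard Elo turns pairwise “document A is more relevant than document B” judgments into graded scores, which is one way a continuous relevance spectrum can emerge from simple comparisons:

```python
def elo_expected(r_a, r_b):
    """Probability that document A is preferred over B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400))

def elo_update(r_a, r_b, a_won, k=32):
    """Update both ratings after one pairwise relevance judgment."""
    delta = k * ((1.0 if a_won else 0.0) - elo_expected(r_a, r_b))
    return r_a + delta, r_b - delta

# Repeated preference wins push the on-point document's rating above
# the tangential one, yielding a graded (non-binary) relevance scale.
on_point, tangential = 1500.0, 1500.0
for _ in range(10):
    on_point, tangential = elo_update(on_point, tangential, a_won=True)
print(on_point > tangential)  # True
```

The gap between ratings, not a hard label, is what a model trained on such signals learns to reproduce.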
32k Token Context for Long-Form Legal Documents
Contracts, briefs, regulatory filings, and legal opinions are long. Much legal AI work involves embedding documents that run tens of thousands of words. zembed-1’s 32,768-token context window allows you to embed full agreements, complete decisions, or entire regulatory sections as coherent units — avoiding the chunking artifacts that degrade retrieval quality when documents are broken into tiny fragments.
This is a significant operational advantage for legal AI. Many competing models cap out at 8,192 tokens or fewer, forcing aggressive chunking strategies that obscure document structure and disrupt the semantic coherence that makes legal retrieval work.
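In practice, a pipeline can check whether a document fits the window before falling back to chunking. A rough sketch (the ~0.75 words-per-token ratio is an assumption for English legal prose; use the model’s actual tokenizer for an exact count):

```python
def fits_context(text: str, max_tokens: int = 32768,
                 words_per_token: float = 0.75) -> bool:
    """Rough check: does a document fit zembed-1's 32k-token window?

    Heuristic only; an exact count requires the model's own tokenizer.
    """
    estimated_tokens = len(text.split()) / words_per_token
    return estimated_tokens <= max_tokens

short_brief = "The tenant's obligation to pay rent is absolute. " * 10
print(fits_context(short_brief))  # True: embed whole, no chunking needed
```

Documents that fit are embedded whole, preserving cross-references between sections; only genuinely oversized filings need to be split.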
Quantization for Massive Legal Corpora
Law firms and legal tech platforms operate over enormous document sets — decades of case law, thousands of client contracts, regulatory archives spanning multiple jurisdictions. zembed-1’s quantization flexibility makes this tractable:
- Binary quantization compresses each 8 KB float32 vector to 256 bytes (a 32x reduction)
- A corpus of 10 million legal documents can be stored in under 1.3 TB of vector storage rather than ~40 TB
This makes zembed-1 practical for large-scale legal AI deployments that would otherwise require prohibitive infrastructure costs.
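The mechanics are straightforward. A minimal NumPy sketch, assuming sign-based binarization (the common approach; zembed-1’s exact scheme may differ): each float32 dimension becomes one bit, and eight dimensions pack into a byte, giving the 32x reduction cited above.

```python
import numpy as np

def binarize(embeddings: np.ndarray) -> np.ndarray:
    """Sign-based binary quantization: 1 bit per dimension, 8 dims per byte."""
    bits = (embeddings > 0).astype(np.uint8)
    return np.packbits(bits, axis=-1)

vecs = np.random.randn(10, 2048).astype(np.float32)  # 8 KB per vector
packed = binarize(vecs)
print(vecs.nbytes, "->", packed.nbytes)  # 81920 -> 2560 bytes (32x)
```

Retrieval over the packed vectors then uses Hamming distance, which modern vector databases accelerate with hardware popcount instructions.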
Key Legal AI Use Cases
Legal Research and Case Retrieval
Search across millions of cases using natural language descriptions of legal issues, fact patterns, or holdings. zembed-1’s relevance calibration ensures that the most directly applicable precedents surface first.
Contract Intelligence
Query thousands of contracts for specific clause types, covenant definitions, or negotiated terms. zembed-1’s semantic understanding distinguishes between similar-sounding provisions that have different practical effects.
E-Discovery
Build high-recall document retrieval systems for litigation support. zembed-1’s superior retrieval quality means fewer relevant documents are missed in large-scale review workflows.
Regulatory Compliance
Search regulatory guidance, agency opinions, and compliance frameworks across jurisdictions. zembed-1 handles the technical language of financial regulation, environmental law, employment law, and more.
Contract Drafting Assistance
Power clause libraries and precedent retrieval tools that suggest the most relevant template language for a given drafting context.
What Legal AI Practitioners Are Saying
“We tested zembed-1 against our existing pipeline on 500 manually annotated legal research queries. It found the right case in the top-3 results 71% of the time. Our previous model managed 54%.” — AI Lead, CLM platform
“Contract review went from ‘AI-assisted’ to ‘AI-reliable’ when we switched. The model actually understands what we mean when we ask about specific covenant types.” — Attorney, Law Firm
Implementation
zembed-1 is available open-weight on HuggingFace under CC-BY-NC-4.0 for research and non-commercial use, and via the ZeroEntropy commercial API for production legal platforms.
```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer(
    "zeroentropy/zembed-1",
    trust_remote_code=True,
    model_kwargs={"torch_dtype": "bfloat16"},
)

# Legal research query
query_embeddings = model.encode_query(
    "Force majeure clause excluding pandemic events in commercial leases"
)

document_embeddings = model.encode_document([
    "Section 12.3 - Force Majeure: Neither party shall be liable for delays caused by circumstances beyond their reasonable control, including but not limited to acts of God, war, government action, or pandemic...",
    "The tenant's obligation to pay rent is absolute and shall not be excused by any force majeure event...",
])

similarities = model.similarity(query_embeddings, document_embeddings)
```

The Verdict
The legal domain demands the highest precision from AI retrieval systems, and zembed-1 delivers it. With a benchmark score of 0.6723 NDCG@10 that outperforms every other tested model, a 32k token context window that handles real-world legal document lengths, and compression options that make large-scale deployment practical, zembed-1 is the embedding model that serious legal AI platforms should be building on.
Get Started
zembed-1 is available today through multiple deployment options:
```python
from zeroentropy import ZeroEntropy

zclient = ZeroEntropy()

response = zclient.models.embed(
    model="zembed-1",
    input_type="query",  # "query" or "document"
    input="What is retrieval augmented generation?",  # string or list[str]
    dimensions=2560,  # optional: must be one of [2560, 1280, 640, 320, 160, 80, 40]
    encoding_format="float",  # "float" or "base64"
    latency="fast",  # "fast" or "slow"; omit for auto
)
```

Documentation: docs.zeroentropy.dev
HuggingFace: huggingface.co/zeroentropy
Get in touch: Discord community or contact@zeroentropy.dev
Talk to us if you need a custom deployment, volume pricing, or want to see how zembed-1 + zerank-2 performs on your data.
