The Best Embedding Model for Code in 2026: zembed-1 Tops the Leaderboard

Apr 10, 2026
TL;DR
  • zembed-1 achieves 0.6452 NDCG@10 on code benchmarks, topping voyage-4-nano (0.6415) and OpenAI text-embedding-3-large (0.6155), a 4.8% relative gain over OpenAI
  • Bridges the semantic gap between natural language descriptions and actual code implementations
  • 32,768-token context window embeds full functions, classes, and modules as coherent units
  • Over 50% non-English training data enables code search from queries in any major language
  • The only model that leads every benchmarked domain simultaneously — no tradeoff between code and general performance

The Best Embedding Model for Code

Code search is a deceptively hard problem. Unlike natural language retrieval, where you’re mapping semantically similar text to similar text, code retrieval requires mapping between fundamentally different modalities: a natural language description of what you want to do and code that actually does it. The semantic gap is wide. Generic embedding models fail to bridge it reliably. Specialized code models often lack the breadth to handle the full diversity of real-world codebases.

zembed-1 by ZeroEntropy has achieved the highest benchmark score of any embedding model tested in the code domain — and it does so without sacrificing performance in any other domain.

What Makes Code Retrieval Hard

Developers building code search systems quickly discover that the problem has layers:

Code Retrieval Challenges
  • The vocabulary mismatch problem: A developer searching for “how to parse JSON from an HTTP response in Python” needs to find code that uses requests, json.loads(), and response.json() — none of which appear in the query
  • The intent problem: Code can accomplish the same task in vastly different ways. A relevant snippet might implement the exact function described, or it might be a related utility, a dependency, or a test that validates the behavior. Relevance isn’t binary
  • The multi-language problem: Real codebases are polyglot. A search over a monorepo might need to retrieve relevant code in Python, TypeScript, Go, Rust, and SQL — all from a single natural language query
  • The documentation problem: Code retrieval systems often need to retrieve both code and the documentation, comments, and README content that explains it. The embedding model must handle both modalities
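The vocabulary mismatch is easy to make concrete. The snippet below (a hypothetical query/code pair, not from any benchmark) shows that a perfectly relevant function can share zero meaningful tokens with the query that should retrieve it, which is exactly where keyword search breaks down:

```python
import re

# Hypothetical query/snippet pair illustrating the vocabulary-mismatch
# problem: the code is a perfect answer to the query, yet they share no
# meaningful tokens, so lexical (keyword) retrieval scores zero overlap.
query = "retry logic with exponential backoff"
code = '''
import time, random

def call_with_retries(fn, attempts=5):
    for i in range(attempts):
        try:
            return fn()
        except Exception:
            time.sleep(min(2 ** i + random.random(), 30))
    raise RuntimeError("all attempts failed")
'''

def tokens(text: str) -> set[str]:
    # Lowercased word tokens; underscores keep identifiers whole,
    # so "call_with_retries" stays a single token
    return set(re.findall(r"[a-z_]+", text.lower()))

stopwords = {"with", "the", "a", "in", "for", "and"}
overlap = (tokens(query) - stopwords) & (tokens(code) - stopwords)
print(overlap)  # prints set()
```

A semantic embedding model has to bridge this gap by mapping "retry", "exponential", and "backoff" onto the behavior of the loop, the `2 ** i` delay, and the exception handling, none of which is visible to a keyword matcher.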

zembed-1 was designed to handle all of these — and the benchmarks confirm it.

Code Benchmark Performance

  Model                              Code NDCG@10
  zembed-1                           0.6452
  voyage-4-nano                      0.6415
  Cohere Embed v4                    0.6277
  OpenAI text-embedding-3-large      0.6155

The code domain shows the tightest competitive spread across all benchmarked categories, reflecting that this is a domain where top models have invested significant effort. Even here, zembed-1 claims the top position.
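The tightness of that spread can be quantified directly from the table. The few lines below recompute each competitor's relative margin from the NDCG@10 scores above, which is also where the TL;DR's +4.8% figure for OpenAI comes from:

```python
# Relative margins implied by the NDCG@10 table above
scores = {
    "zembed-1": 0.6452,
    "voyage-4-nano": 0.6415,
    "Cohere Embed v4": 0.6277,
    "OpenAI text-embedding-3-large": 0.6155,
}
top = scores["zembed-1"]
for name, s in scores.items():
    margin = 100 * (top - s) / s  # zembed-1's relative advantage
    print(f"{name}: +{margin:.1f}%")
# voyage-4-nano: +0.6%, Cohere Embed v4: +2.8%, OpenAI: +4.8%
```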

zembed-1 is the only model that is best-in-class in code while simultaneously leading every other domain — finance, healthcare, legal, conversational, manufacturing, and STEM.

Why zembed-1 Excels at Code Retrieval

zELO: Understanding Code Relevance, Not Just Code Similarity

The central challenge in code retrieval is that two snippets can be textually very different yet semantically equivalent, or textually similar yet semantically unrelated. A function that sorts a list with bubble sort and one that uses mergesort look nothing alike but accomplish the same thing; meanwhile, a function named process_data may be completely irrelevant to what the user is looking for. zembed-1's zELO training objective targets exactly this distinction: relevance to the query's intent rather than surface-level similarity.
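The bubble sort versus mergesort point is easy to demonstrate. The two implementations below (written here purely for illustration) share almost no surface structure, yet a quick property check confirms they compute the same function, which is the equivalence a relevance-trained model has to capture:

```python
# Two textually very different sorts that are semantically equivalent.

def bubble_sort(items):
    # Repeatedly swap adjacent out-of-order pairs
    out = list(items)
    for i in range(len(out)):
        for j in range(len(out) - 1 - i):
            if out[j] > out[j + 1]:
                out[j], out[j + 1] = out[j + 1], out[j]
    return out

def merge_sort(items):
    # Recursively split, then merge sorted halves
    if len(items) <= 1:
        return list(items)
    mid = len(items) // 2
    left, right = merge_sort(items[:mid]), merge_sort(items[mid:])
    merged = []
    while left and right:
        merged.append(left.pop(0) if left[0] <= right[0] else right.pop(0))
    return merged + left + right

data = [5, 3, 8, 1, 9, 2, 7, 3]
assert bubble_sort(data) == merge_sort(data) == sorted(data)
```

A lexical or shallow-similarity model sees two unrelated blobs of text; a relevance-trained model should map both to "sorts a list".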

32k Token Context for Full-Function Embedding

Code snippets vary enormously in length. A relevant function might be 10 lines or 500 lines. zembed-1’s 32,768-token context window allows full functions, classes, and even small modules to be embedded as coherent units — preserving the structural and logical context that makes code retrieval work.

Models with shorter context windows force chunking strategies that break functions mid-logic, lose the relationship between functions and their docstrings, and separate a method from its class context. zembed-1 can handle full code units.
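To see why chunking hurts, consider a naive fixed-size chunker, a common workaround when the context window is small. In the toy class below (invented for this sketch), an 80-character chunk size strands the method's docstring in a different chunk from its signature and class:

```python
# A naive fixed-size chunker separating a method from its docstring
source = (
    "class PaymentProcessor:\n"
    '    """Handles card payments."""\n'
    "\n"
    "    def charge(self, amount):\n"
    '        """Charge the customer, retrying on transient failures."""\n'
    "        for attempt in range(3):\n"
    "            ...\n"
)

def fixed_chunks(text: str, size: int) -> list[str]:
    # Split into consecutive character windows, ignoring code structure
    return [text[i:i + size] for i in range(0, len(text), size)]

chunks = fixed_chunks(source, 80)
# The signature of charge() lands in chunks[0], but its docstring lands
# in chunks[1], so neither chunk carries the full unit of meaning.
print(len(chunks), "chunks")
```

A 32k-token window sidesteps the problem entirely for all but the largest files: the class, its methods, and their docstrings are embedded as one coherent unit.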

Multilingual Code, Multilingual Queries

zembed-1 was trained with more than 50% non-English data, making it capable of retrieving code from natural language queries in any major language. A developer working in Japanese can search a Python codebase in Japanese and retrieve relevant results — without a translation layer.

Developer AI Use Cases

01

Code Search in IDEs and Developer Tools

Power semantic code search so developers can describe what they’re looking for in natural language and get back the most relevant functions, classes, and snippets from their codebase. zembed-1’s code benchmark performance ensures these results are actually the most relevant — not just the ones with matching keywords.

02

RAG-Powered Code Assistants

Build retrieval-augmented code assistants that fetch relevant internal code, documentation, and examples before generating suggestions. zembed-1 retrieves the right context — the correct API usage patterns, the right internal utility functions — so the generative model produces better output.

03

Documentation and API Search

Retrieve relevant API documentation, README content, and code comments in response to natural language questions. zembed-1 handles the hybrid text-and-code nature of technical documentation naturally.

04

Codebase Onboarding Tools

Build tools that help new developers find relevant examples, understand existing patterns, and locate the right code to extend or modify. zembed-1’s cross-domain capability means it handles the mix of code and prose documentation that most codebases contain.

05

Technical Debt and Refactoring Discovery

Search for all implementations of a particular pattern or anti-pattern across a large codebase. zembed-1’s semantic understanding catches conceptually similar implementations even when they use different naming conventions.

06

Code Review Assistance

Retrieve similar past code reviews, style guidelines, and relevant precedents for a given diff — giving reviewers context for their feedback and consistency in applying standards.

Getting Started

zembed-1 is available open-weight on HuggingFace (CC-BY-NC-4.0) and via the ZeroEntropy API for production deployments.

from sentence_transformers import SentenceTransformer

model = SentenceTransformer(
    "zeroentropy/zembed-1",
    trust_remote_code=True,
    model_kwargs={"torch_dtype": "bfloat16"},
)

# Code search: natural language query against code corpus
query_embeddings = model.encode_query(
    "Function that validates an email address using regex"
)

document_embeddings = model.encode_document([
    """def validate_email(email: str) -> bool:
    import re
    pattern = r'^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$'
    return bool(re.match(pattern, email))""",

    """def check_user_input(data: dict) -> bool:
    # Validates user registration form
    if not data.get('username') or len(data['username']) < 3:
        return False
    return True""",
])

similarities = model.similarity(query_embeddings, document_embeddings)
# zembed-1 correctly ranks the email validation function highest

To index a whole codebase, one workable pattern is to split each source file into per-function documents before embedding:

import ast
import textwrap

def extract_functions(source_code: str) -> list[dict]:
    """Parse Python source and extract individual functions with docstrings."""
    tree = ast.parse(source_code)
    functions = []
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            docstring = ast.get_docstring(node) or ""
            body = textwrap.dedent(ast.unparse(node))
            functions.append({
                "name": node.name,
                "text": f"Function: {node.name}\n{docstring}\n{body}",
            })
    return functions

# Index your codebase function-by-function
all_functions = []
from pathlib import Path

for filepath in Path("./src").rglob("*.py"):
    all_functions.extend(extract_functions(filepath.read_text()))

func_texts = [fn["text"] for fn in all_functions]
func_embeddings = model.encode_document(func_texts, batch_size=32, show_progress_bar=True)

# Search at query time
def search_codebase(query: str, top_k: int = 5):
    q_emb = model.encode_query(query)
    scores = model.similarity(q_emb, func_embeddings)[0]
    top = scores.argsort(descending=True)[:top_k]
    return [(all_functions[i]["name"], float(scores[i])) for i in top]

results = search_codebase("retry logic with exponential backoff")
# Returns the most semantically relevant functions, even without matching keywords

The Verdict for Engineering Teams

If you’re building developer tooling — code search, AI coding assistants, documentation retrieval, or codebase intelligence platforms — zembed-1 gives you the best available code retrieval performance, period. Its top score on code benchmarks, combined with leadership across all other domains, means you don’t have to choose between a model that’s great for code and one that’s great for everything else.

With zembed-1, you get both.

Get Started

zembed-1 is available today through multiple deployment options:

from zeroentropy import ZeroEntropy

zclient = ZeroEntropy()
response = zclient.models.embed(
    model="zembed-1",
    input_type="query",        # "query" or "document"
    input="What is retrieval augmented generation?",  # string or list[str]
    dimensions=2560,           # optional: one of [2560, 1280, 640, 320, 160, 80, 40]
    encoding_format="float",   # "float" or "base64"
    latency="fast",            # "fast" or "slow"; omit for auto
)
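The halving ladder of allowed dimensions (2560 down to 40) is the pattern typically produced by Matryoshka-style embeddings, where a lower-dimensional vector is a prefix of the full one, renormalized to unit length. Whether zembed-1 implements exactly this is an assumption here, not something stated above, but the sketch shows what consuming the smaller variants client-side usually looks like:

```python
import math
import random

# ASSUMPTION for illustration: the dimension ladder behaves like
# Matryoshka-style embeddings (prefix truncation plus renormalization).
def truncate_embedding(vec: list[float], k: int) -> list[float]:
    head = vec[:k]
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head]

random.seed(0)
full = [random.gauss(0, 1) for _ in range(2560)]  # stand-in for a real embedding

for k in [2560, 1280, 640, 320, 160, 80, 40]:
    small = truncate_embedding(full, k)
    assert len(small) == k
    # Each variant stays unit-length, so cosine similarity still works
    assert abs(math.sqrt(sum(x * x for x in small)) - 1.0) < 1e-9
```

Smaller dimensions trade a little retrieval quality for much cheaper storage and faster nearest-neighbor search, which is why a single ladder of sizes is useful in production.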

Documentation: docs.zeroentropy.dev

HuggingFace: huggingface.co/zeroentropy

Get in touch: Discord community or contact@zeroentropy.dev

Talk to us if you need a custom deployment, volume pricing, or want to see how zembed-1 + zerank-2 performs on your data.
