Hybrid Search: MCP server + CLI for AI assistants (BM25 + semantic, Obsidian-native)

Obsidian Hybrid Search is an MCP server and CLI that combines BM25, fuzzy title matching, and vector embeddings into one ranked result set.

The problem

Most vault search tools make you choose between two failure modes:

  • Semantic-only (Smart Connections, similar-notes plugins): great for conceptual queries, but unreliable when you’re looking for a specific term, a docker container name, or a person you documented months ago
  • BM25-only (Omnisearch): fast and exact, but it can’t surface notes that are conceptually related without sharing vocabulary

When your vault is large enough, both approaches leave something on the table. You need exact matches and conceptual proximity — simultaneously.

What this does differently

Obsidian Hybrid Search runs three parallel retrieval paths:

  1. BM25 full-text (FTS5) with weighted columns: title 10×, aliases 5×, content 1×
  2. Fuzzy trigram matching on titles and aliases, tolerant of typos
  3. Semantic vector search: cosine similarity over embeddings
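
To make the fuzzy path concrete, here is a minimal Python sketch of trigram matching on titles. It is an illustration, not the project's code: the padding scheme, the Jaccard similarity measure, the 0.2 threshold, and the function names are all assumptions chosen for the example.

```python
def trigrams(text: str) -> set[str]:
    """Return the character trigrams of a lowercased, space-padded string."""
    cleaned = text.lower().strip()
    if not cleaned:
        return set()
    padded = f"  {cleaned} "  # padding lets word boundaries contribute trigrams
    return {padded[i:i + 3] for i in range(len(padded) - 2)}


def trigram_similarity(a: str, b: str) -> float:
    """Jaccard similarity between the trigram sets of two strings (assumed measure)."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def fuzzy_rank(query: str, titles: list[str], threshold: float = 0.2) -> list[tuple[str, float]]:
    """Score every title against the query and keep those above the threshold."""
    scored = [(title, trigram_similarity(query, title)) for title in titles]
    kept = [pair for pair in scored if pair[1] >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)
```

With this sketch, a transposed query such as "dokcer" still shares enough trigrams with the title "Docker" to be matched, while unrelated titles fall below the threshold.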

All three are fused via Reciprocal Rank Fusion (RRF). A note that ranks well across multiple signals floats to the top — no single method dominates.
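
For readers unfamiliar with RRF, here is a minimal Python sketch. Each ranked list contributes 1 / (k + rank) to a document's score, so only rank positions matter and the raw scores of the three retrievers never need to be normalized against each other. The constant k = 60 is the value commonly used in the RRF literature; whether this project uses the same value is an assumption, and the file names are made up.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Fuse several ranked lists of document IDs into one ranking.

    k = 60 is a conventional default, assumed here for illustration.
    """
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] += 1.0 / (k + rank)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical results from the three retrieval paths.
bm25 = ["docker.md", "compose.md", "review.md"]
fuzzy = ["docker.md", "dockerfile.md"]
semantic = ["containers.md", "docker.md", "compose.md"]

fused = reciprocal_rank_fusion([bm25, fuzzy, semantic])
```

In this example docker.md appears in all three lists and compose.md in two, so they end up ahead of containers.md even though containers.md was the top semantic hit.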

Optionally, a cross-encoder (bge-reranker-v2-m3) reranks the fused results for higher precision on ambiguous queries.

Obsidian-specific additions

  • Aliases are indexed and searchable: alias matches are boosted in both BM25 and fuzzy scoring
  • Tag and folder filtering: --tag pkm, --scope notes/projects/, exclusions with -
  • Graph traversal: --path note.md --related shows linked notes at configurable depth, with backlinks (-1/-2) and outgoing links (+1/+2) separated
  • Real-time incremental indexing: chokidar watches for file changes in the background (MCP mode)
  • --open flag
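
The graph traversal item above can be sketched as a depth-limited breadth-first search run twice, once over outgoing links and once over backlinks, with signed depths to keep the two directions apart. This is only an illustration of the idea: the in-memory `links` dictionary, the function name, and the output shape are assumptions, not the tool's actual data model.

```python
from collections import deque


def related(links: dict[str, list[str]], start: str, depth: int = 2) -> dict[str, dict[str, int]]:
    """Return notes reachable from `start`, labelled +n (outgoing) or -n (backlinks).

    `links` maps each note to the notes it links to (hypothetical structure).
    """
    # Invert the link map once to get backlinks.
    backlinks: dict[str, list[str]] = {}
    for source, targets in links.items():
        for target in targets:
            backlinks.setdefault(target, []).append(source)

    def walk(adjacency: dict[str, list[str]], sign: int) -> dict[str, int]:
        seen = {start}
        found: dict[str, int] = {}
        queue = deque([(start, 0)])
        while queue:
            note, dist = queue.popleft()
            if dist == depth:
                continue  # do not expand beyond the configured depth
            for neighbour in adjacency.get(note, []):
                if neighbour not in seen:
                    seen.add(neighbour)
                    found[neighbour] = sign * (dist + 1)
                    queue.append((neighbour, dist + 1))
        return found

    return {"outgoing": walk(links, +1), "backlinks": walk(backlinks, -1)}
```

Because breadth-first search visits nearer notes first, each note is labelled with its shortest distance from the starting note.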

How it stores data

One SQLite file (.obsidian-hybrid-search.db) at your vault root. No external server, no Docker, no binary dependencies beyond Node. You can open it with any SQLite viewer.
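
Since the index is a plain SQLite file, it can also be inspected from a script. The sketch below opens a database read-only and lists its tables and views. The schema is not documented in this post, so the code makes no assumption about table names and simply reports whatever it finds; the helper name `list_tables` is made up for the example.

```python
import sqlite3
from pathlib import Path


def list_tables(db_path: str) -> list[str]:
    """List tables and views in a SQLite file, opened read-only."""
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"  # read-only: never modifies the index
    conn = sqlite3.connect(uri, uri=True)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type IN ('table', 'view') ORDER BY name"
        ).fetchall()
        return [row[0] for row in rows]
    finally:
        conn.close()


if __name__ == "__main__":
    db = Path(".obsidian-hybrid-search.db")  # run from the vault root
    if db.exists():
        print(list_tables(str(db)))
```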

Quick start (no API key needed)

Works fully offline using a bundled multilingual-e5-small model (~117 MB, 100+ languages).

CLI:

npm install -g obsidian-hybrid-search
cd /path/to/your/vault
obsidian-hybrid-search reindex
obsidian-hybrid-search "how to build a consistent daily review"

MCP (Claude Desktop / Claude Code / any MCP client):

{
  "mcpServers": {
    "obsidian-hybrid-search": {
      "command": "npx",
      "args": ["-y", "-p", "obsidian-hybrid-search@latest", "obsidian-hybrid-search-mcp"],
      "env": {
        "OBSIDIAN_VAULT_PATH": "/path/to/your/vault"
      }
    }
  }
}

For better embedding quality, swap in any OpenAI-compatible API (OpenRouter, Ollama, LM Studio) via OPENAI_BASE_URL + OPENAI_EMBEDDING_MODEL.
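
As an illustration, the Python sketch below builds the MCP config from above with the two embedding variables added. It assumes the variables are read from the same env block as OBSIDIAN_VAULT_PATH. The Ollama URL and the model name are example values only, and some providers will also need an API key, which is not covered here.

```python
import json
from typing import Optional


def mcp_config(vault_path: str,
               base_url: Optional[str] = None,
               embedding_model: Optional[str] = None) -> dict:
    """Build the MCP client config, optionally pointing at an OpenAI-compatible API."""
    env = {"OBSIDIAN_VAULT_PATH": vault_path}
    if base_url:
        env["OPENAI_BASE_URL"] = base_url
    if embedding_model:
        env["OPENAI_EMBEDDING_MODEL"] = embedding_model
    return {
        "mcpServers": {
            "obsidian-hybrid-search": {
                "command": "npx",
                "args": ["-y", "-p", "obsidian-hybrid-search@latest",
                         "obsidian-hybrid-search-mcp"],
                "env": env,
            }
        }
    }


# Example values (assumptions): a local Ollama endpoint and an embedding model pulled there.
config = mcp_config("/path/to/your/vault",
                    base_url="http://localhost:11434/v1",
                    embedding_model="nomic-embed-text")
print(json.dumps(config, indent=2))
```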

Once connected, the assistant queries your vault the same way it calls any other tool — ranked results with snippets, tags, links, and backlinks. No copy-pasting notes into context. Ask it to find everything you’ve written about spaced repetition and surface the contradictions — it will.

Comparison with QMD

The closest prior work is QMD. It’s a well-engineered tool, and I learned from reading its code and PRs. The differences:

  • QMD generates hypothetical documents (HyDE), dense retrieval sentences, and BM25 keywords before issuing any search — all via a fine-tuned 1.7B LLM. That’s three GGUF models loaded on every start (~2 GB+). Whether the quality improvement over a well-tuned hybrid retriever justifies this overhead in a typical PKM scenario is an open question.

  • QMD is a general markdown search engine, not an Obsidian tool. It has no concept of aliases, frontmatter tags, wikilink graphs, or vault-relative paths — the abstractions that make Obsidian vaults structurally different from a flat notes directory.

  • OHS is built specifically for Obsidian: alias indexing with BM25 boosting, tag/folder filtering, graph traversal, --open integration.

The result is a tool that fits how Obsidian vaults are actually structured — and stays out of the way when you just need a fast search.

Hello. Obsidian is a trademarked name, and you cannot use Obsidian as the name of your project.

Developer policies - Developer Documentation.

Thanks for the note. Looking at the developer policies you linked, the trademark restriction applies to “community plugins and themes.” This project is neither — it’s an external MCP server and CLI tool that works independently of Obsidian’s plugin system. Happy to discuss if there’s a different policy that applies here.

The trademark applies everywhere, not just to plugins. I linked to the plugin developer docs only because plugins are the most frequent offenders.

Understood. I don’t seem to have the ability to edit the topic title — could you update it to “Hybrid Search MCP server + CLI for AI assistants (BM25 + semantic, RRF)”?

You should update the repo name, the npm package, etc.

Thanks for updating the title. I’ll take a look at the repo and package naming.