Local Semantic “Knowledge Indexer” (Opt-In Core Layer) + Optional MCP Service

Use case or problem

As vaults grow, the challenge shifts from storing knowledge to understanding relationships between notes. Obsidian currently excels at representing explicit links, but it cannot surface semantic relationships dynamically (shared meaning, themes, overlapping concepts).

Users and plugin developers are already trying to solve this, but in fragmented ways:

  • Multiple plugins each generate their own embedding indexes → duplicated computation and inconsistent similarity scoring
  • Some users manually paste notes into external AI tools → this leaks private vault content
  • Local LLM / CLI workflows interact directly with the vault's .md files → but every tool must re-implement its own semantic reasoning

The absence of a shared local semantic foundation has become a practical limitation, not a philosophical advantage.


Proposed solution

Introduce an opt-in, local-only, privacy-preserving semantic indexing service in core:

  • Runs entirely offline

  • Uses a small local embedding model (e.g., ONNX / sentence-transformers)

  • Maintains a vector similarity index (SQLite + sqlite-vss, LanceDB, etc.)

  • Re-embeds only notes that change

  • Exposes a small, stable API (example):

    ai.getRelatedNotes(notePath, { topK: 5 })
    ai.searchContext(query, { topK: 10 })
    ai.getEmbedding(notePath)

This is a foundational layer, not a UI feature and not an AI assistant.
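The "re-embeds only notes that change" point is the part that keeps this cheap on large vaults, so here is a rough sketch of one way it could work. Everything in it is an assumption for illustration: `IncrementalIndexer`, `sync`, and the `Embedder` callback are hypothetical, and the small FNV-1a-style string hash is just a dependency-free stand-in for whatever change detection a real indexer would use (modification time, a cryptographic hash, etc.).

```typescript
// FNV-1a-style 32-bit hash over UTF-16 code units. Illustrative only.
function fnv1a(text: string): number {
  let h = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    h ^= text.charCodeAt(i);
    h = Math.imul(h, 0x01000193);
  }
  return h >>> 0;
}

// Hypothetical callback wrapping the local embedding model.
type Embedder = (text: string) => number[];

class IncrementalIndexer {
  private hashes = new Map<string, number>();
  private vectors = new Map<string, number[]>();
  private embed: Embedder;

  constructor(embed: Embedder) {
    this.embed = embed;
  }

  // Takes a snapshot of the vault (path -> content) and returns the
  // paths that were embedded or re-embedded during this pass.
  sync(notes: Record<string, string>): string[] {
    const updated: string[] = [];
    for (const path of Object.keys(notes)) {
      const hash = fnv1a(notes[path]);
      if (this.hashes.get(path) === hash) continue; // unchanged, skip
      this.vectors.set(path, this.embed(notes[path]));
      this.hashes.set(path, hash);
      updated.push(path);
    }
    // Forget notes that no longer exist in the vault.
    const removed: string[] = [];
    this.hashes.forEach((_hash, path) => {
      if (!(path in notes)) removed.push(path);
    });
    for (const path of removed) {
      this.hashes.delete(path);
      this.vectors.delete(path);
    }
    return updated;
  }

  getEmbedding(path: string): number[] | undefined {
    return this.vectors.get(path);
  }
}
```

With something like this, editing one note in a vault of thousands costs one embedding call rather than a full rebuild.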

Design principle:

  • Core determines which notes are related.
  • Plugins determine what to do with those relationships.

This enables the community plugin ecosystem to build meaning-based functionality rather than repeatedly reinventing embedding/storage logic.

Examples of plugin workflows this unlocks:

  • Suggested backlinks and missing link discovery
  • Semantic graph overlays (graph shows why notes relate)
  • Concept / topic / theme clustering
  • Knowledge synthesis and research insight tools
  • Journaling reflection & idea evolution summaries
  • Study and spaced-repetition tools guided by conceptual proximity
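As a rough idea of how little plugin code the first workflow (missing link discovery) might need once core provides the scores, here is a sketch. It assumes the result shape `{ path, score }` from the example API; `suggestMissingLinks` and the `minScore` threshold of 0.75 are hypothetical and chosen only for illustration.

```typescript
// Assumed result shape of ai.getRelatedNotes (hypothetical).
interface RelatedNote {
  path: string;
  score: number;
}

// Keep only notes that are semantically close but not yet linked.
function suggestMissingLinks(
  related: RelatedNote[],
  existingLinks: string[],
  minScore: number = 0.75
): RelatedNote[] {
  const linked = new Set(existingLinks);
  return related.filter((r) => r.score >= minScore && !linked.has(r.path));
}

// Example: b.md is related enough and not linked yet, so it is suggested.
const suggestions = suggestMissingLinks(
  [
    { path: "a.md", score: 0.9 },
    { path: "b.md", score: 0.8 },
    { path: "c.md", score: 0.4 },
  ],
  ["a.md"]
);
```

The plugin decides the threshold and the UI; it never touches embeddings or storage.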

All while keeping:

  • The vault as plain Markdown files
  • All computation local
  • Privacy fully intact

Optional extension

Expose the semantic index through a local MCP service (obsidian://mcp), so local LLMs and agent frameworks (e.g., Claude Desktop MCP, LM Studio, Ollama, Speckit) can reason over the vault without exporting notes.
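For illustration, this is roughly what a request from a local MCP client could look like. MCP uses JSON-RPC 2.0 and invokes tools through the `tools/call` method; the tool name `get_related_notes` and its arguments are purely hypothetical, since no such Obsidian tool exists yet.

```typescript
// Minimal JSON-RPC 2.0 request shape for an MCP tool call.
interface McpToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: {
    name: string;
    arguments: Record<string, unknown>;
  };
}

// Builds a request for a hypothetical "get_related_notes" tool that
// would mirror ai.getRelatedNotes(notePath, { topK }).
function buildRelatedNotesCall(
  id: number,
  notePath: string,
  topK: number
): McpToolCallRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: {
      name: "get_related_notes", // hypothetical tool name
      arguments: { notePath, topK },
    },
  };
}

const request = buildRelatedNotesCall(1, "Projects/Idea.md", 5);
```

The response would carry only the paths and scores (or snippets) the user has allowed, so note contents never need to leave the machine.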


Current workaround (optional)

  • Existing AI and smart-linking plugins each maintain their own embedding cache
  • Users manually copy/paste note text into cloud AI tools
  • Developer scripts and CLIs already treat the vault as a dataset, but without any shared semantic index

These workarounds are:

  • Redundant
  • Inconsistent
  • Often less private than a shared, local semantic layer would be