Use case or problem
As vaults grow, the challenge shifts from storing knowledge to understanding relationships between notes. Obsidian currently excels at representing explicit links, but it cannot surface semantic relationships dynamically (shared meaning, themes, overlapping concepts).
Users and plugin developers are already trying to solve this, but in fragmented ways:
- Multiple plugins each generate their own embedding indexes → duplicated computation and inconsistent similarity scoring
- Some users manually paste notes into external AI tools → this leaks private vault content
- Local LLM / CLI workflows interact directly with the vault's .md files → but every tool must re-implement its own semantic reasoning
The absence of a shared local semantic foundation has become a practical limitation, not a philosophical advantage.
Proposed solution
Introduce an opt-in, local-only, privacy-preserving semantic indexing service in core:
- Runs entirely offline
- Uses a small local embedding model (e.g., ONNX / sentence-transformers)
- Maintains a vector similarity index (SQLite + sqlite-vss, LanceDB, etc.)
- Re-embeds only notes that change
- Exposes a small, stable API (example):
ai.getRelatedNotes(notePath, { topK: 5 })
ai.searchContext(query, { topK: 10 })
ai.getEmbedding(notePath)
This is a foundational layer, not a UI feature and not an AI assistant.
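To make the moving parts concrete, here is a minimal sketch of how such a layer could behave. Everything in it is an assumption for illustration: the class name `SemanticIndex`, the `upsert` method, the `{ path, score }` result shape, and the synchronous `Embedder` function are hypothetical and not part of Obsidian. A brute-force cosine scan over an in-memory map stands in for a real vector index such as sqlite-vss or LanceDB, and a simple FNV-1a hash stands in for whatever change detection core would actually use.

```typescript
// Hypothetical sketch: none of these names are part of Obsidian's actual API.
type Embedder = (text: string) => number[]; // stand-in for a local embedding model

interface IndexEntry {
  hash: number; // content hash recorded when the note was last embedded
  vector: number[];
}

interface RelatedNote {
  path: string;
  score: number; // cosine similarity between two embeddings
}

// FNV-1a 32-bit hash, used only to detect content changes cheaply.
function fnv1a(text: string): number {
  let h = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    h ^= text.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h;
}

function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return na === 0 || nb === 0 ? 0 : dot / Math.sqrt(na * nb);
}

class SemanticIndex {
  private entries = new Map<string, IndexEntry>();
  private embed: Embedder;

  constructor(embed: Embedder) {
    this.embed = embed;
  }

  // Re-embeds a note only when its content hash differs from the stored one.
  // Returns true if an embedding was (re)computed.
  upsert(path: string, content: string): boolean {
    const hash = fnv1a(content);
    const existing = this.entries.get(path);
    if (existing && existing.hash === hash) return false;
    this.entries.set(path, { hash, vector: this.embed(content) });
    return true;
  }

  getEmbedding(path: string): number[] | undefined {
    const entry = this.entries.get(path);
    return entry ? entry.vector : undefined;
  }

  getRelatedNotes(path: string, opts: { topK: number }): RelatedNote[] {
    const source = this.entries.get(path);
    return source ? this.rank(source.vector, opts.topK, path) : [];
  }

  searchContext(query: string, opts: { topK: number }): RelatedNote[] {
    return this.rank(this.embed(query), opts.topK);
  }

  // Brute-force scan; a real index would delegate this to a vector store.
  private rank(vector: number[], topK: number, exclude?: string): RelatedNote[] {
    const scored: RelatedNote[] = [];
    this.entries.forEach((entry, path) => {
      if (path !== exclude) scored.push({ path, score: cosine(vector, entry.vector) });
    });
    return scored.sort((a, b) => b.score - a.score).slice(0, topK);
  }
}
```

In this sketch, calling `upsert` on every file-modified event is cheap: unchanged notes are skipped after a single hash comparison, so only edited notes hit the embedding model. A real embedder would most likely be asynchronous, which would make these methods return promises.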
Design principle:
- Core determines which notes are related.
- Plugins determine what to do with those relationships.
This enables the community plugin ecosystem to build meaning-based functionality rather than repeatedly reinventing embedding/storage logic.
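From a plugin's side, that split might look like the following sketch. The `SemanticService` interface mirrors the example API above but is hypothetical, as are the `{ path, score }` result shape, the synchronous call style, and the `minScore` threshold. Core only answers "which notes are related"; the plugin decides that related-but-unlinked notes are worth suggesting.

```typescript
// Hypothetical shapes: this service mirrors the example API in this proposal
// and does not exist in Obsidian today.
interface RelatedNote {
  path: string;
  score: number; // assumed similarity score, higher means more related
}

interface SemanticService {
  getRelatedNotes(notePath: string, opts: { topK: number }): RelatedNote[];
}

// Plugin-side policy: suggest semantically related notes that are not yet linked.
function suggestMissingLinks(
  ai: SemanticService,
  notePath: string,
  existingLinks: Set<string>,
  minScore: number = 0.5, // assumed threshold; a real plugin would make this a setting
): string[] {
  return ai
    .getRelatedNotes(notePath, { topK: 5 })
    .filter((r) => r.score >= minScore && !existingLinks.has(r.path))
    .map((r) => r.path);
}
```

A plugin would build `existingLinks` from the links it already knows about for the note, then render the returned paths however it likes (a sidebar, inline hints, a graph overlay). None of that UI logic needs to know how embeddings are computed or stored.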
Examples of plugin workflows this unlocks:
- Suggested backlinks and missing link discovery
- Semantic graph overlays (graph shows why notes relate)
- Concept / topic / theme clustering
- Knowledge synthesis and research insight tools
- Journaling reflection & idea evolution summaries
- Study and spaced-repetition tools guided by conceptual proximity
All while keeping:
- The vault as plain Markdown files
- All computation local
- Privacy fully intact
Optional extension
Expose the semantic index through a local MCP service (obsidian://mcp), so local LLMs and agent frameworks (e.g., Claude Desktop MCP, LM Studio, Ollama, Speckit) can reason over the vault without exporting notes.
Current workaround (optional)
- Existing AI and smart-linking plugins each maintain their own embedding cache
- Users manually copy/paste note text into cloud AI tools
- Developer scripts and CLIs already treat the vault as a dataset, but without any shared semantic index
These workarounds are:
- Redundant
- Inconsistent
- Often less private than a shared, local semantic layer