Looking for a reliable solution to find PDF content

ChrisG_HKG · October 1, 2025, 1:48pm

What I’m trying to do

Looking for an OCR solution to include PDFs in the search

Things I have tried

I tried Omnisearch and Text Extractor, but they do not work

Sunnaq445 · October 1, 2025, 10:31pm

I use Zotero to index PDFs, then I symlink to another folder and point the CursorAI text editor at that folder (actually, one level above it). This way I can search with regex and do not strain my Obsidian vault.

ChrisG_HKG · October 2, 2025, 9:42am

Thank you for sharing your workflow. The symlink approach with Zotero and CursorAI is really interesting! For my case though, I’m mainly looking for a way to do “full-text PDF search directly inside Obsidian” without necessarily setting up a parallel structure. But I appreciate the idea. Thank you.

Sunnaq445 · October 2, 2025, 10:42am

You can place the symlinked folder inside your Obsidian vault as well. But if you have many 5-10MB md files for large docs, it may hinder your ability to search for your own stuff in the long run.
The trick is to export Better BibTeX (JSON) from Zotero and have a Python script create md files from the txt files created by Zotero, based on the citationKeys.
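
A minimal sketch of what such a script could look like, not the poster's actual script. It rests on several assumptions: that the JSON export has a top-level "items" list, that each item carries a "citationKey" and an "attachments" list with a "path", and that Zotero's extracted text sits next to the attachment in a file named ".zotero-ft-cache". The function name build_notes and the cache_name parameter are made up for illustration; check your own export and storage folder before relying on any of this.

```python
import json
from pathlib import Path


def build_notes(export_json: Path, out_dir: Path,
                cache_name: str = ".zotero-ft-cache") -> list[Path]:
    """Write one <citationKey>.md per exported item that has extracted text.

    Assumed export layout (verify against your own file):
    {"items": [{"citationKey": ..., "title": ..., "attachments": [{"path": ...}]}]}
    """
    data = json.loads(export_json.read_text(encoding="utf-8"))
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for item in data.get("items", []):
        key = item.get("citationKey")
        if not key:
            continue  # nothing to name the note after
        for att in item.get("attachments", []):
            path = att.get("path")
            if not path:
                continue
            # Assumption: the extracted text lives beside the attachment.
            txt = Path(path).parent / cache_name
            if txt.is_file():
                title = item.get("title", key)
                body = txt.read_text(encoding="utf-8", errors="replace")
                note = out_dir / f"{key}.md"
                note.write_text(f"# {title}\n\n{body}\n", encoding="utf-8")
                written.append(note)
                break  # one note per item; first attachment with text wins
    return written
```

Calling something like build_notes(Path("export.json"), Path("MyVault/zotero-text")) (both paths hypothetical) would then drop the notes into a folder that Obsidian's built-in search can index. Keeping them in a dedicated folder makes it easier to exclude them later if the large files start to crowd out your own notes in search results.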