Mistral recently announced a SOTA OCR model that converts PDFs into markdown. It works pretty well and even crops out the images automatically. I wanted to use this in Obsidian, so I slightly modified the code from their documentation, mainly so that the images work with wikilinks. By default the images were encoded directly in the markdown document, and that made my notes very slow.
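To give an idea of the image part, here is a minimal sketch of the conversion. It is not the exact code from the notebook: it assumes the markdown contains images as base64 data URIs (`![alt](data:image/png;base64,...)`), and the function name `embed_to_wikilinks`, the `prefix` parameter and the file naming scheme are made up for illustration.

```python
import base64
import re
from pathlib import Path

# Matches markdown images whose target is a base64 data URI, e.g.
# ![img-0](data:image/jpeg;base64,AAAA...)
DATA_URI_IMAGE = re.compile(
    r"!\[[^\]]*\]"                               # ![alt text]
    r"\(data:image/(?P<ext>[A-Za-z0-9]+);base64,"  # (data:image/<type>;base64,
    r"(?P<data>[A-Za-z0-9+/=\s]+)\)"             # <payload>)
)


def embed_to_wikilinks(markdown: str, attachments_dir: Path, prefix: str = "img") -> str:
    """Save each embedded image to attachments_dir and link it as ![[file]].

    Hypothetical helper: the naming scheme (<prefix>-<n>.<ext>) is an assumption.
    """
    attachments_dir.mkdir(parents=True, exist_ok=True)
    counter = 0

    def replace(match: re.Match) -> str:
        nonlocal counter
        ext = match.group("ext").lower()
        if ext == "jpeg":
            ext = "jpg"
        filename = f"{prefix}-{counter}.{ext}"
        counter += 1
        # Decode the payload and write it next to the notes as a regular file.
        image_bytes = base64.b64decode(match.group("data"))
        (attachments_dir / filename).write_bytes(image_bytes)
        # Obsidian-style embed instead of the inline data URI.
        return f"![[{filename}]]"

    return DATA_URI_IMAGE.sub(replace, markdown)
```

You would call it with something like `embed_to_wikilinks(ocr_markdown, Path("MyVault/attachments"), prefix="paper")`, where the vault path is just a placeholder. The note then only contains short `![[paper-0.jpg]]` links, and the heavy image data lives in separate files.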
I found it very useful for LaTeX formulas. Before, this was difficult: I was sending images of each page to ChatGPT, which was clunky.
Here is the repository: pdf-ocr-obsidian, where I put a Python notebook you can all explore. I'm open to improvements, so feel free to open pull requests. It would be great if this could work inside Obsidian at some point, like the new web-browser plugin does with webpages, but with PDFs…
Here is an example of the results: