Before Obsidian, I used Docear to manage my PDF annotations.
Docear can scan and retrieve highlights and annotations like zotfile. It can also remember each annotation's position in the PDF file. Unlike zotfile, though, it can retrieve annotations from multiple files at once, and it does so faster.
Currently, @argentum has written mdnotes to bridge Obsidian and Zotero. However, I think it would be more efficient to retrieve annotations straight from the PDF into Obsidian. Each PDF would have its own md file, with each block linking to its respective PDF file (basically zotfile for Obsidian).
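To make the "one md file per PDF" idea concrete, here is a minimal Python sketch of the output side only. It assumes the highlights have already been extracted by some other tool; the `Highlight` class, `note_name`, and `render_note` are hypothetical names for illustration, not part of Docear, zotfile, or mdnotes. The `#page=N` link fragment is a common PDF open parameter, but whether a particular viewer honors it is an assumption.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath
from urllib.parse import quote


@dataclass
class Highlight:
    page: int          # 1-based page number where the highlight sits
    text: str          # the highlighted text
    comment: str = ""  # optional note attached to the highlight


def note_name(pdf_path: str) -> str:
    """One Markdown note per PDF, named after the PDF file."""
    return PurePosixPath(pdf_path).stem + ".md"


def render_note(pdf_path: str, highlights: list[Highlight]) -> str:
    """Render highlights as quotes, each linking back to its page in the PDF."""
    url = quote(pdf_path)  # percent-encode spaces so the Markdown link stays valid
    lines = [f"# {PurePosixPath(pdf_path).stem}", ""]
    for h in sorted(highlights, key=lambda h: h.page):
        # Assumption: the viewer understands the '#page=N' fragment.
        lines.append(f"> {h.text} ([p. {h.page}]({url}#page={h.page}))")
        lines.append("")
        if h.comment:
            lines.append(f"Note: {h.comment}")
            lines.append("")
    return "\n".join(lines)


if __name__ == "__main__":
    demo = [Highlight(5, "Second passage"), Highlight(2, "First passage", "follow up")]
    print(note_name("papers/Smith 2020.pdf"))
    print(render_note("papers/Smith 2020.pdf", demo))
```

The actual extraction step is left out on purpose, since that is the hard part and depends on the PDF library used.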
Docear is open source, and they uploaded all their code to their website.
Additional request: it would be even better if this plugin also linked with Zotero (as we still need Zotero to cite in MS docx).
For the Keypoints app I’m currently developing (it’s not available yet), I’ve implemented something similar based on individual plaintext notes. Goals are to facilitate an academic reading workflow & knowledge management. See this thread for more info and a little screencast.
In the future, I hope to integrate this with knowledge management apps such as Obsidian. This would allow you to directly push your literature notes from Keypoints to Obsidian (ideally in a customizable template-based format).
I use an app called Highlights that automatically saves your PDF highlights in markdown. It’s not ideal in several respects (it used to be quite buggy, though that is much better now, and it’s not very customizable), but it works really well for this kind of thing. Skim, which is open source afaik, also has some useful scripts for exporting annotations into md.
That said, I’ll definitely try Keypoints once it’s out :).
Hi there! I created a plugin in December that, like zotfile, takes an annotated PDF and extracts the text highlights into markdown.
Extracting PDF annotations is surprisingly, shockingly hard – much harder than I had expected. It took me weeks to get it working. Unfortunately, since Obsidian upgraded their internal PDF library a few weeks back, the plugin has to be completely re-written as well to match their version again.
Anyways, I don’t have time at the moment to do this full refactor, but the code and commits are there in case anyone wants to help out.
Hey akaalias, would it be possible to extract the highlights as atomic quotes and reference them in a parent page?
It would be amazing if the atomic quotes or blocks would then also have the metadata included.
automatically generate empty page for the PDF (metadata of PDF saved in this file)
manually highlight text in a PDF editor
highlights automatically extracted as atomic blocks which have the same metadata plus page reference
excerpts listed as block references under the generated “empty” page
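The steps above could be sketched roughly as follows. This is only an illustration of the file layout, assuming the highlights are already available as (page, text) pairs; `build_vault_files` and the metadata keys are hypothetical and not taken from the existing plugin. It relies on Obsidian's `^id` block identifiers and `![[note#^id]]` embeds, and writes metadata as YAML front matter.

```python
import json
from pathlib import PurePosixPath


def front_matter(fields: dict) -> str:
    """Serialize metadata as YAML front matter (JSON scalars are valid YAML)."""
    body = "\n".join(f"{key}: {json.dumps(value)}" for key, value in fields.items())
    return f"---\n{body}\n---\n"


def build_vault_files(pdf_name: str, metadata: dict,
                      highlights: list[tuple[int, str]]) -> dict[str, str]:
    """Return {filename: markdown}: one parent page plus one note per quote."""
    stem = PurePosixPath(pdf_name).stem
    files: dict[str, str] = {}
    # Parent page: carries the PDF metadata and lists every excerpt.
    parent_lines = [front_matter(metadata), f"# {stem}", ""]
    for i, (page, text) in enumerate(highlights, start=1):
        block_id = f"q{i}"
        quote_name = f"{stem} - quote {i}"
        # Atomic note: same metadata plus the page reference, one quote block.
        quote_meta = {**metadata, "page": page}
        files[quote_name + ".md"] = front_matter(quote_meta) + f"\n{text} ^{block_id}\n"
        # Embed the quote's block on the parent page.
        parent_lines.append(f"- ![[{quote_name}#^{block_id}]] (p. {page})")
    files[stem + ".md"] = "\n".join(parent_lines) + "\n"
    return files


if __name__ == "__main__":
    demo = build_vault_files(
        "Smith 2020.pdf",
        {"title": "Example paper", "author": "Smith"},  # made-up metadata
        [(3, "First quote"), (7, "Second quote")],
    )
    for name, content in demo.items():
        print(f"== {name} ==\n{content}")
```

Because each quote lives in its own note, it shows up as its own node in the graph, while the parent page stays a thin index of embeds.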
Benefits:
I would have a page of highlights for every PDF
Metadata would be given for every quote
Every quote could have extra comments in their metadata or on the summary page
I could link every block in the graph since they have their own page
Does that make sense?
I can imagine that this would be a lot of work