PDF - Zotero - Obsidian: Current state and collaboration for the ONE plugin?

Hi all,

@EleanorKonik requested help in collating information about the many different plugins dealing with workflows from PDF to Obsidian (often via Zotero), and there has been a Discord thread on pooling resources and developing one plugin instead of maintaining and developing a multitude of similar but different plugins according to a very wide variety of preferences.

So, here is a start in collating information on the current state of things with regards to how to get information out of a PDF and into Obsidian, with a focus on the path via Zotero. Since Zotero 6 has just gone live, a lot of workflows have broken, as it was a MAJOR release that effectively made Zotfile and MDnotes unusable (see @Cat’s wonderful workflow description here in the forum).

What follows is by necessity limited to my own experience and requirements, so it really is a working document that needs input from users who’ve used (or developed) plugins and workflows that I have not (or am not even aware of). It’s supposed to be a central collection point for ideas, requirements, and questions - as well as gratitude to the folks who actually develop these amazing plugins!

so here goes…:

Notes on PDF Zotero Obsidian workflows

The plugins seem to all solve slightly different problems or the same problem in slightly different ways…but the basic premise is this:

A researcher reads a PDF and highlights passages. The reference data gets entered into a reference manager (e.g. Zotero). From here the researcher would like to:

  • extract the annotated passages and have them available in Obsidian for future use (e.g. quoting, processing, studying, etc)
  • have the reference data available to them (extract) to
    • create a note dedicated to the source
    • cite the source

This doesn’t seem like too much trouble, but the differences in approaching these tasks are multifaceted and influenced by:

  • the researchers’ level of technological proficiency
  • the discipline of the researcher
  • the level of academic rigour required/desired by the researcher (e.g. academic/non-academic/hobby researcher/student/etc)
  • the intended use of the references and annotations
  • the individual preference/style/habits of annotating or working with annotations
  • the individual online/offline needs of the individual
  • whether the goal is to finish most of the work in Obsidian, or get it ready enough for export into a ‘normal’ word processor for sharing with colleagues, etc
  • the interest in modification of the workflow (set-and-forget, or tweak-and-perfect)
  • and so on…

These can also collide in the one researcher, who may have particular needs for one project, but different ones for another (eg. research publication vs lecture). As these are endless, the main issue will be to keep it as simple as possible for the set-and-forget crowd, but offer the tweak-and-perfect people the option to modify, personalise, tweak (perhaps in ‘advanced settings’?). A key to all of this is CLEAR and SIMPLE documentation that assumes no knowledge on the part of the user with regards to settings in Zotero or templating in Obsidian.

Non-plugin or adjacent workflows:

  • mdnotes workflow by @Cat (mdnotes Zotero plugin by @argentum )
  • Alfred workflow by @pseudometa (there’s two I think)

Obsidian PDF Plugins

Note on PDF plugins:
This needs more details - perhaps users and developers could fill in what their benefits are?

Obsidian Zotero Plugins:

Bibnotes Formatter

This plugin works by exporting a bibtex json file (which can be set to be automatically updated in Zotero), which can then be called upon from within Obsidian. The plugin is quite flexible and allows a lot of customisation.

  • pros
    • fast
    • works well with Zotero 6 (at least the beta)
    • once set-up, very easy to use
    • extremely customisable
    • access to all fields available in Zotero
    • some automation
    • modification of annotations possible, but limited
  • cons
    • requires quite a bit of setting up and tweaking
    • complexity can overwhelm users - a visual walk-through of all the features and set-up would be great
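
To make the mechanism described above more concrete, here is a minimal sketch in Python of what a plugin of this kind does conceptually: read the automatically exported JSON file and fill a note template from one item. This is not Bibnotes Formatter's actual code. The field names (`items`, `citationKey`, `creators`, `title`, `date`) are assumptions about the shape of the export, the sample data is made up, and `find_item` and `render_note` are hypothetical helpers.

```python
import json
from string import Template

# Hypothetical stand-in for the file that Zotero keeps up to date.
# The field names are assumptions about the export, not a guaranteed schema.
EXPORT = json.dumps({
    "items": [
        {
            "citationKey": "doe2020",
            "title": "An Example Title",
            "date": "2020",
            "creators": [
                {"creatorType": "author", "firstName": "Jane", "lastName": "Doe"}
            ],
        }
    ]
})

# A user-editable template; the "key:: value" lines are plain note metadata.
NOTE_TEMPLATE = Template(
    "# $title\n\nauthors:: $authors\nyear:: $year\ncitekey:: @$citekey\n"
)


def find_item(export_text: str, citekey: str) -> dict:
    """Look up one item by citation key in the exported JSON text."""
    for item in json.loads(export_text)["items"]:
        if item.get("citationKey") == citekey:
            return item
    raise KeyError(citekey)


def render_note(item: dict) -> str:
    """Fill the note template from one exported item."""
    authors = "; ".join(
        f"{c.get('lastName', '')}, {c.get('firstName', '')}".strip(", ")
        for c in item.get("creators", [])
        if c.get("creatorType") == "author"
    )
    return NOTE_TEMPLATE.substitute(
        title=item.get("title", "Untitled"),
        authors=authors,
        year=item.get("date", "")[:4],
        citekey=item.get("citationKey", ""),
    )


note = render_note(find_item(EXPORT, "doe2020"))
print(note)
```

The set-up effort listed under "cons" is essentially the work of writing and maintaining a template like `NOTE_TEMPLATE` for every kind of note.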

Citations

It appears to be a simpler version of Bibnotes Formatter (but was available earlier) without the ability to import annotations into Obsidian. Looks great for getting Zotero metadata into Obsidian and citing references.

Zotero Desktop Connector

This one only partially works with Zotero 6, so my experience is limited. The best thing about it is the very easy set-up and inserting of citations (it currently doesn’t work with exporting the annotations, but this does seem to work with an older version of Zotero). So far, no customisation. If the ease of set-up was married with the customisability of Bibnotes, that would be awesome!

Obsidian Zotero Plugin

Seems in planning, I haven’t been able to try it yet.

How to pick a plugin for your workflow

Here’s a handy flowchart… :smirk:

Wishlist

In general, the main points seem to be easy customisation in the form of templates for the extracted metadata and annotations, including some automations (e.g. sort by annotation colour), and potentially larger automations (keep everything up-to-date without having to manually update a changed Zotero item).

See also:

As inspiration, I would add that ZoteroRoam does just about all of the things on the wish list, and more, including downloading and adding references parsed from the reference list inside the PDF(!!!). See here: 2c. On-page menu - zoteroRoam. I haven’t used it in some time, so who knows what other magic it can now do. I will say that the templates (‘funcmaps’) are extremely novice unfriendly.

Wish list (in plain English):

  • Insert a full reference in a particular reference style (e.g. Chicago 16th full note)
  • easy templating function that allows for multiple transformations (this seems to be a problem according to Stefano and Micha for Bibnotes, e.g. I can add a prefix, but not a newline after the prefix)
  • fine-tuning pre-pending and appending things (Zotero fields, or general text) to the extracted highlights (e.g. headers, page numbers, etc)
  • creating field types (not sure this is the right term): similar to assigning a colour to become a ‘keyword::’ in Bibnotes, it would be great to be able to use different colours to create things like ‘primary sources::’ or ‘related to::’ or ‘to read::’ where the highlighted text doesn’t end up in the annotations, but in the metadata section (just like keywords currently do in Bibnotes)
  • allowing for multiple templates in use
    • sometimes my research template doesn’t make sense for things I read for lectures, or just fun things I came across…so, commands like ‘extract to research template’ (assign a hotkey?) or ‘extract to lecture template’ might be an option
  • I personally like the idea of connecting to semantic scholar or scite.ai or researchrabbit a la ZoteroRoam…but it’s probably more on the ‘fun’ than the ‘necessary’ side…
  • add option to import annotations as individual notes (see PDF - Zotero - Obsidian: Current state and collaboration for the ONE plugin? - #4 by erazlogo)
  • be able to add archive field from Zotero (see PDF - Zotero - Obsidian: Current state and collaboration for the ONE plugin? - #4 by erazlogo)
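
Two of the wishes above (a prefix that can be followed by a newline, and colours that create fields instead of annotations) can be sketched as follows. This is only an illustration in Python under stated assumptions: the hex colour values, the rule format in `COLOUR_RULES`, and the function names are all hypothetical and do not describe how Bibnotes or any other plugin works.

```python
from collections import defaultdict

# Hypothetical colour-to-role mapping; the hex values and field names are
# illustrative only and would be user settings in a real plugin.
COLOUR_RULES = {
    "#ffd400": {"kind": "quote", "prefix": "> ", "suffix": "\n"},
    "#ff6666": {"kind": "field", "name": "to read"},
    "#5fb236": {"kind": "field", "name": "primary sources"},
}
DEFAULT_RULE = {"kind": "quote", "prefix": "> ", "suffix": "\n"}


def route_annotations(annotations):
    """Split highlights into metadata fields and body text by colour."""
    fields = defaultdict(list)
    body = []
    for ann in annotations:
        rule = COLOUR_RULES.get(ann["colour"], DEFAULT_RULE)
        if rule["kind"] == "field":
            # The highlighted text becomes a metadata value, not an annotation.
            fields[rule["name"]].append(ann["text"])
        else:
            # Prefix and suffix are free text, so a newline is allowed.
            body.append(
                f'{rule["prefix"]}{ann["text"]} (p. {ann["page"]}){rule["suffix"]}'
            )
    return dict(fields), body


def render(annotations):
    """Metadata section first, then the remaining highlights."""
    fields, body = route_annotations(annotations)
    meta = "".join(
        f"{name}:: {', '.join(values)}\n" for name, values in fields.items()
    )
    return meta + "\n" + "\n".join(body)


sample = [
    {"colour": "#ffd400", "text": "A key claim.", "page": 12},
    {"colour": "#ff6666", "text": "Smith 1999", "page": 13},
]
rendered = render(sample)
print(rendered)
```

Keeping the rules as plain data is one way to serve both crowds: set-and-forget users keep the defaults, while tweak-and-perfect users edit the mapping.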

Alright, over to you - discuss! :wink:

EDIT: Ps. mods, feel free to move this topic if you feel it would fit better elsewhere…

This plugin has not worked well for my needs, primarily because the extracted highlights have spaces removed between words. This is visible in the demo video the developer made, was raised in GitHub issues, and there has been no response or development in over a year.

A major obstacle to this new version of Zotero integrating with Obsidian (or anything) is that they’ve added a PDF reader, which will benefit many users. The problem is that unlike every other PDF reader that I’ve ever seen, annotations are not stored in the PDF, but in the app’s database as metadata about the PDF. This is especially a problem for folks that want to have their Zotero items linked to the PDF files stored locally, rather than synced through Zotero’s paid service. And if you want to read PDF files with the new mobile Zotero, you must use their file sync service, as linked files aren’t supported.

It is possible to have Zotero write the annotations into the PDF file, but this must be done manually, has no keyboard shortcut, and cannot be done to multiple files at once.

It is possible in Zotero 6 to attach a note to an item (like a PDF) that contains the annotations (both the PDF standard annotations contained in the PDF file and the non-standard new Zotero 6 “annotations”), but if you do this with multiple items selected, it fails to operate on all of the items with no error message, and if you do it more than once, it just adds a new note, leaving the older note of extracted annotations. Bibnotes formatter does a great job of capturing these, but the problem is having to manually update the extracted annotation notes in Zotero one by one.

There doesn’t seem to be any way to use the new Zotero PDF reader to annotate and have those show up in Obsidian in any kind of automated manner.

One more use case: Option to export each annotation as a separate note with all source metadata (customizable) appended to each annotation. As I mentioned in Discord, the archive and archiveLocation Zotero fields have to be included for historians (they are included in the JSON export but not in the BetterBibTex export).
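
A minimal sketch of this use case in Python, assuming the item and its annotations are already available as dictionaries. Only `archive` and `archiveLocation` are taken from the post above; the other keys (`citekey`, `text`, `page`), the file naming scheme, and the sample data are hypothetical.

```python
# Which source fields to append to every annotation note (user-customisable).
# "archive" and "archiveLocation" are the Zotero fields mentioned above;
# everything else here is made-up sample data.
METADATA_FIELDS = ["title", "date", "archive", "archiveLocation"]


def annotation_notes(item, annotations, fields=METADATA_FIELDS):
    """Return {filename: note text}, one entry per annotation."""
    # Skip fields that are empty or missing for this item.
    meta_lines = [f"{name}:: {item[name]}" for name in fields if item.get(name)]
    notes = {}
    for number, ann in enumerate(annotations, start=1):
        filename = f"{item['citekey']}-{number:03d}.md"
        body = f"> {ann['text']}\n\npage:: {ann['page']}\n"
        notes[filename] = body + "\n".join(meta_lines) + "\n"
    return notes


item = {
    "citekey": "letters1887",
    "title": "Letter to the Governor",
    "date": "1887",
    "archive": "State Archives",
    "archiveLocation": "Box 4, Folder 2",
}
annotations = [
    {"text": "We arrived in spring.", "page": 1},
    {"text": "The harvest failed.", "page": 2},
]
notes = annotation_notes(item, annotations)
for name, text in notes.items():
    print(name)
    print(text)
```

Because the field list is a plain setting, a historian could keep `archive` and `archiveLocation` while someone else drops them.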

Good to know - it does seem that it is not in active development/support anymore.

Many good points about the difficulties with the new Zotero.

I will note, though, that there is no compulsion to use the new Zotero PDF Reader. You can choose any PDF reader you like on your system under settings>general>open PDFs with…

This should hopefully alleviate at least some of the syncing problems. I don’t ever do mass extractions of PDF annotations, so have not encountered any issues there. I usually extract annotations when I am done with reading the PDF, but as noted above, there are many different workflows, and I appreciate that Zotero 6 makes some of the ones we are used to much more complex.

Perhaps it would be useful to have a dedicated forum post on “how to deal with changes to Zotero 6”? Or “What are the implications of upgrading to Zotero 6”? Or something along those lines to deal with the immediate problems that are happening due to the automatic upgrade breaking lots of people’s workflow?

Thanks, noted for the “Wishlist”!

Thanks for this great list. I do hope that plugins like mdnotes, or something new to replace it, will get a zotero 6 version soon. I did notice that zotfile does seem to still at least work for sending files to tablets, but that does not alleviate the extract annotations issues. I had seen in forums over on zotero’s site that the zotfile people did not want to create something that zotero would natively support and now that zotero has the built in extractor, hopefully that gets better (being able to define your own template like parameters using config editor was something I relied heavily on to prep my notes in zotero from my pdfs before bringing them to obsidian).

I think, as @knewman noted, the pdf reader is helpful (I do like that there is an iPad app that works between my devices), but the fact that it does not actually add the annotations to the pdf file is strange!

I am looking forward to seeing what other people say here! Oy, if only this change did not happen while I am in the middle of my PhD exams and now need to completely redefine my entire workflow!

thanks for your reply!

Perhaps you can port some (all?) of your config templates to the new Annotations templates, or split the duty between the Zotero templates and Bibnotes templates (which is what I am doing).

And as noted before, you can still use any old PDF reader, and still have Zotero extract those annotations for you.

Good luck with your PhD exams - it’s always frustrating when technology tackles you sideways when you least need it… :wink:

Many thanks, @Kabo, for this compilation and the wish list. This also helps me trying to develop an independent third-party solution which has many similar goals. There are many individual efforts happening redundantly…

Ideally, we’d have a common (i.e. widely accepted & usable) exchange format for annotation and reference data. The latter has at least some (not really standardized but common) formats (like BibTeX, or even better, Citeproc/CSL JSON).

Then there could be a tool solely dedicated to converting these data into desired output formats using a common template syntax. There could be a public repository of templates where templates could be shared easily.
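
A rough sketch of that idea in Python, assuming reference data arrives as CSL JSON and annotations arrive in some agreed shape. The CSL JSON keys used here (`id`, `title`, `author`, `issued`) are standard; the annotation record layout, the `$placeholder` template syntax, and the template names are assumptions made purely for illustration.

```python
from string import Template

# One reference in CSL JSON shape (sample data).
CSL_ITEM = {
    "id": "doe2020",
    "type": "article-journal",
    "title": "An Example Title",
    "author": [{"family": "Doe", "given": "Jane"}],
    "issued": {"date-parts": [[2020, 5, 1]]},
}

# Hypothetical annotation records; no common format for these exists yet.
ANNOTATIONS = [
    {"text": "A highlighted passage.", "page": "12", "comment": "follow up"},
]

# Two interchangeable templates over the same data: this is the part that
# could live in a shared public repository.
TEMPLATES = {
    "research": Template("> $text ([@$id, p. $page])\n$comment\n"),
    "lecture": Template("- $text ($authors $year)\n"),
}


def flatten(item, annotation):
    """Merge reference fields and annotation fields into one flat mapping."""
    authors = " & ".join(a["family"] for a in item.get("author", []))
    year = item["issued"]["date-parts"][0][0]
    return {"id": item["id"], "authors": authors, "year": year, **annotation}


def convert(item, annotations, template_name):
    """Render every annotation with the chosen template."""
    template = TEMPLATES[template_name]
    return "".join(template.substitute(flatten(item, a)) for a in annotations)


research_note = convert(CSL_ITEM, ANNOTATIONS, "research")
lecture_note = convert(CSL_ITEM, ANNOTATIONS, "lecture")
print(research_note)
print(lecture_note)
```

The converter itself stays tiny; all individual preferences move into the templates, which is why a shared template syntax may be more realistic than a shared note format.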

When I started my own endeavours years ago, I tried to get people on board to agree on a common plaintext/Markdown format for annotation notes. That never worked out since, as you noted above, individual needs are way too different. But common data formats together with a common templating syntax may be (slightly) more realistic.

Thanks!

Where can you find the Zotero templates? I think I’ll do that too. I have been using Bibnotes to get details on the texts into obsidian and then zotfile to extract annotations and add labels based on color coding and then export them using MDnotes.

@jadoff For the new Zotero 6 PDF Reader, see the note templates documentation.

As far as I am concerned, the only major missing piece is automating the export of highlighting done in the Zotero PDF reader, which presumably requires a Zotero plugin if the Zotero devs aren’t going to do it themselves. Better onboarding for the Obsidian side of the equation certainly wouldn’t hurt, but in terms of functionality I’m not sure what there is to gain by trying to create One Plugin to Rule Them All.

Conversely, some of the things on the wish list are already handled by non-Obsidian software and trying to bring them within Obsidian seems pointless to me. As an example, inserting a full reference in a particular reference style—Pandoc already does this, and it’s the kind of task I’d expect only to be done when exporting something to share with others anyway, probably in a format other than Markdown. If one really wants it in the Obsidian version, Zotero’s quick copy function also does this.

More generally, I would be disinclined to use a plugin with a surplus of features like the noted features in ZoteroRoam. First, it seems antithetical to the idea of not being locked into Obsidian. I don’t want to have to recreate every aspect of my workflow if, for whatever reason, I stop using Obsidian, and I don’t want to feel locked into Obsidian because I am so dependent on it for my entire workflow. Second, we’re talking about plugins rather than the Obsidian core, and plugins function at the whim and the mercy of the Obsidian devs. The more a plugin tries to do, the more fragile it becomes when there are Obsidian updates. Third (really kind of a 2b), I worry about an effort that seems to verge on recreating a reference manager as a plugin to software that is in no way designed to be a reference manager.

Ideally, I want an Obsidian plugin to ingest citations and annotations generated elsewhere for use in taking notes on those sources. The wish list items around refining templates/customization and improving documentation seem great to me. I’m leery of an Obsidian plugin that’s doing much more than that.

I love this thread and look forward to seeing some of these various workflows merged and simplified in a way that makes them more immediately accessible!

That said, I was shocked and amazed at how awesome the new Zotero 6 is. I didn’t even know it was coming, but have been using 5.0.xx a fair bit over the past few weeks. Two days ago, I double-clicked on a PDF expecting it to open in a browser, and up it popped in a Zotero tab. With full highlighting and annotation tools. Hmmm, I don’t remember this doing that yesterday… quickly confirmed that yes, indeed, it had updated itself to 6.0.

Created a bunch of highlights, selected “Add note from Annotations” and boom, all my highlights are pulled. Selected the note, right-clicked “Export to:” markdown, named it, dropped it in my vault. All the notes in Obsidian now point back to the source paper in Zotero with two links, one to the record of the paper in the Collection browser, and a second which opens the PDF to exactly the highlight. This is awesome!
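
For anyone who wants to generate the same two kinds of links themselves, here is a small Python sketch. The `zotero://select` and `zotero://open-pdf` URL shapes are written from memory of what Zotero 6 puts into exported notes, so treat them as an assumption and compare them with a real export; the eight-character keys are placeholders, and the function names are hypothetical.

```python
from urllib.parse import urlencode


def select_link(item_key):
    """Link that selects the item in the Zotero library pane."""
    return f"zotero://select/library/items/{item_key}"


def open_pdf_link(attachment_key, page=None, annotation_key=None):
    """Link that opens the PDF attachment, optionally at a page/annotation."""
    params = {}
    if page is not None:
        params["page"] = page
    if annotation_key is not None:
        params["annotation"] = annotation_key
    query = f"?{urlencode(params)}" if params else ""
    return f"zotero://open-pdf/library/items/{attachment_key}{query}"


def highlight_line(text, item_key, attachment_key, page, annotation_key):
    """One Markdown line: the quote plus both links back to Zotero."""
    source = select_link(item_key)
    pdf = open_pdf_link(attachment_key, page, annotation_key)
    return f"> {text} ([item]({source}), [p. {page}]({pdf}))"


# The keys below are placeholders, not real library keys.
line = highlight_line("A highlighted passage.", "ABCD1234", "EFGH5678", 3, "IJKL9012")
print(line)
```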

Could the day get better? Yes! Zotero for iOS released (finally)! Sync is perfect. Highlight a passage while reading on my iPad and the highlight shows up on my laptop ~20 seconds later.

I suppose the live updating of the exported annotations note in Obsidian if I change something in the source paper is the only thing I haven’t tackled yet, but I think Better BibTex is supposed to handle that. I’ll save that for tomorrow…

Just a couple of details to clarify:

  1. As far as I know, the developer of Zotfile has not updated the plugin for some time. It is the Zotero developers themselves who have been keeping it “alive” for a few years now. For the same reason and to avoid problems and incompatibility issues when extracting annotations with Zotero 6, they deactivated that part of the plugin.

  2. I personally appreciate that the annotations are not added directly to the PDF file, but I recognize that there are other people for whom this is strange. I think that’s what the options to “save” and “extract” the annotations in/from the PDF are for. With one click you can switch from one system to the other.

Thanks! I did not know that Zotfile has been maintained by Zotero and not the developer of Zotfile. I saw a conversation between multiple people on the github for Zotfile. Either way, good to know.

It makes sense that they deactivated that part of the plugin, I just wish that the functionality was a bit more equal between the two. But hopefully things will continue as time goes on, it did just leave beta this past week.

And I am still unsure how I feel about the annotations not being in the PDF themselves versus being in them. There are some ways that this is great, and others that are a bit challenging. I do know in terms of sharing PDFs for a reading with a classmate or student, it does help a lot because I no longer need to delete my annotations before sharing. I do like the export annotations to file option, I wish there was a way to do this for multiple files at once.

Great summary OP! Here are some comments.

ZoteroRoam is awesome and the dev is wonderful and responsive. ZoteroRoam currently can import Zotero item Notes directly into Roam and dev is working on getting this to work for Z6. Anyone interested in figuring out a way to do something similar with an Obsidian plug-in should contact them (@AlixLahuec on Discord, seen them on the Academia Roamana server). The plug-in now can also do TWO-WAY sync, meaning it can list sources that are cited by a paper and that reference a paper and ADD them to your Zotero library from within Roam, and then create a Roam page with the citekey, and download metadata and notes and add them to the page. This plug-in is what is keeping me in Roam for the time being despite wanting to leave for a while.

Options for searching for Zotero items and putting their citekeys into any text file are: Zotpick (Mac) and Zotero Citation Picker (Win).

It surprises me that the OP doesn’t list Pandoc conversion/output as a key step. If people are going to write actual drafts in Obsidian it seems that having a seamless Pandoc output to HTML and MS Word would be a priority for plug-in development. Zettlr, which is very similar to Obsidian, has this and I actually use it for final output for this reason.
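
As a hedged illustration of what such an output step involves, the Python sketch below only builds a Pandoc command line and prints it; it does not run anything. `--citeproc`, `--bibliography`, `--csl` and `-o` are Pandoc options (`--citeproc` needs Pandoc 2.11 or newer). The file names are placeholders, and `pandoc_command` is a hypothetical helper, not part of any plugin.

```python
import shlex


def pandoc_command(source, output, bibliography, csl=None):
    """Build (not run) a Pandoc command that resolves [@citekey] citations."""
    args = ["pandoc", source, "--citeproc", f"--bibliography={bibliography}"]
    if csl:
        # A CSL style file decides the reference style of the output.
        args.append(f"--csl={csl}")
    # Pandoc infers the output format from the file extension.
    args += ["-o", output]
    return args


# Placeholder file names; the .bib file would be an export from Zotero.
cmd = pandoc_command(
    "draft.md",
    "draft.docx",
    "library.bib",
    csl="chicago-fullnote-bibliography.csl",
)
print(shlex.join(cmd))
```

To actually convert a file, the list could be passed to `subprocess.run(cmd, check=True)` on a machine where Pandoc is installed. Changing the output name to `draft.html` would give HTML instead of Word.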

Do people have opinions on whether it is better to keep and annotate pdfs in Zotero or import them and annotate them in Obsidian?

Another workflow option I have considered is working with pdf text rather than the pdf itself. A plug-in could potentially import pdf full-text (where applicable) into Obsidian, perhaps using pdftotext (which is what Zotero uses for building its index). If it is customizable and works well, then it might actually work better for me than annotating the pdf, since text files are so much easier to work with in general. Rather than highlighting and commenting on the pdf, I just import the whole paper formatted as block quotes and annotate/comment the full text as markdown. I know this has limitations for articles with lots of images and figures, but maybe there are workarounds for those cases.
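
A minimal sketch of this text-based workflow in Python, assuming the `pdftotext` command-line tool is installed and on the PATH. `extract_text` and `to_block_quotes` are hypothetical helper names, and treating every blank line as a paragraph break is a simplification that real PDFs will often violate.

```python
import subprocess


def extract_text(pdf_path):
    """Run pdftotext and return the text; '-' sends the output to stdout."""
    result = subprocess.run(
        ["pdftotext", pdf_path, "-"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def to_block_quotes(text):
    """Turn plain text into Markdown block quotes, one per paragraph."""
    # pdftotext separates pages with form feeds; treat them as paragraph breaks.
    text = text.replace("\f", "\n\n")
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    quoted = []
    for paragraph in paragraphs:
        # Re-join hard-wrapped lines so each paragraph is one quote.
        joined = " ".join(line.strip() for line in paragraph.splitlines())
        quoted.append("> " + joined)
    return "\n\n".join(quoted) + "\n"


sample_text = "First line\nwrapped.\n\nSecond paragraph.\n"
quoted_text = to_block_quotes(sample_text)
print(quoted_text)
```

Comments can then be written as ordinary paragraphs between the quotes, which keeps the source text and the notes visually separate.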

The reason that some people care a lot about this is that they want to be able to annotate PDFs across systems and platforms, including mobile, and this will require being able to do it outside Zotero. Zotero isn’t going to have non-iOS mobile for a long time, I suspect. The save and extract features don’t work smoothly with Zotero (e.g. it would create duplicates with every annotation, and since Zotero annotations aren’t in the pdf, they can’t be read by other programs). Zotero devs don’t seem to care enough about this issue to accommodate non-iOS mobile users or people who just want to be able to annotate with programs other than Zotero.

good point about the danger of ‘feature bloat’! the One Plugin to Rule Them All was very much tongue in cheek and had more to do with the various points @msteffens makes, in particular

Since this is directly about one of my favourite things:

…let me explain. I would like to have a full citation imported with the metadata section so that I can quickly copy it without having to go back to Zotero or another program to get it. Not in the sense of a reference manager, but because I might be composing an email and want to share the full details of the reference, or as part of a lecture draft. Having to go through another program for this feels tedious, when it seems to me that if I can extract author, title, etc to Obsidian without problems, why not also the full, styled reference? It’s really just for convenience :slight_smile:

hope that clarifies the use of that, but I think it also highlights the difficulties - we all want slightly different things :smiley:

Oh I would LOVE seamless output to Word! I have badgered and begged a variety of people on this point! As I see it, the lack of easy integration with the world’s most used word processor is one of the major, if not THE, obstacle to wider adoption of Roam and Obsidian and the like…

however, I don’t think that’s the job of a plugin for ‘how to get your annotations and references into Obsidian’ - as it’s basically at the other end of that (how to get stuff out of Obsidian).

Pandoc is a book with 7 seals (in other words, a mystery) to me and not friendly to non-technically inclined users.

Have a look at this plugin, which imports the text of the PDFs (and processes them to do magic things), and I believe it has recently been updated to also import some metadata

Really great to have so many thoughtful responses and constructive issues raised!