PDF - Zotero - Obsidian: Current state and collaboration for the ONE plugin?

Hi all,

@EleanorKonik requested help in collating information about the many different plugins dealing with workflows from PDF to Obsidian (often via Zotero), and there has been a Discord thread on pooling resources and developing one plugin instead of maintaining and developing a multitude of similar but different plugins according to a (very wide) variety of preferences.

So, here is a start at collating information on the current state of things with regard to getting information out of a PDF and into Obsidian, with a focus on the path via Zotero. Since Zotero 6 has just gone live, a lot of workflows have broken, as it was a MAJOR release that effectively made Zotfile and MDnotes unusable (see @Cat’s wonderful workflow description here in the forum).

What follows is by necessity limited to my own experience and requirements, so it really is a working document that needs input from users who’ve used (or developed) plugins and workflows that I have not (or am not even aware of). It’s supposed to be a central collection point for ideas, requirements, and questions - as well as gratitude to the folks who actually develop these amazing plugins!

So here goes…

Notes on PDF Zotero Obsidian workflows

The plugins seem to all solve slightly different problems or the same problem in slightly different ways…but the basic premise is this:

A researcher reads a PDF and highlights passages. The reference data gets entered into a reference manager (e.g. Zotero). From here the researcher would like to:

  • extract the annotated passages and have them available in Obsidian for future use (e.g. quoting, processing, studying, etc)
  • have the reference data available to them (extract) to
    • create a note dedicated to the source
    • cite the source

This doesn’t seem like too much trouble, but the approaches to these tasks differ in many ways and are influenced by:

  • the researchers’ level of technological proficiency
  • the discipline of the researcher
  • the level of academic rigour required/desired by the researcher (e.g. academic/non-academic/hobby researcher/student/etc)
  • the intended use of the references and annotations
  • the individual preference/style/habits of annotating or working with annotations
  • the individual online/offline needs of the individual
  • whether the goal is to finish most of the work in Obsidian, or get it ready enough for export into a ‘normal’ word processor for sharing with colleagues, etc
  • the interest in modification of the workflow (set-and-forget, or tweak-and-perfect)
  • and so on…

These can also collide in the one researcher, who may have particular needs for one project, but different ones for another (e.g. research publication vs lecture). As the combinations are endless, the main issue will be to keep it as simple as possible for the set-and-forget crowd, but offer the tweak-and-perfect people the option to modify, personalise, and tweak (perhaps in ‘advanced settings’?). A key to all of this is CLEAR and SIMPLE documentation that assumes no knowledge on the part of the user with regard to settings in Zotero or templating in Obsidian.

Non-plugin or adjacent workflows:

Obsidian PDF Plugins

Note on PDF plugins:
This needs more details - perhaps users and developers could fill in what their benefits are?

Obsidian Zotero Plugins:

Bibnotes Formatter

This plugin works by exporting a bibtex json file from Zotero (the export can be set to update automatically), which can then be called upon from within Obsidian. The plugin is quite flexible and allows a lot of customisation.

  • pros
    • fast
    • works well with Zotero 6 (at least the beta)
    • once set-up, very easy to use
    • extremely customisable
    • access to all fields available in Zotero
    • some automation
    • modification of annotations possible, but limited
  • cons
    • requires quite a bit of setting up and tweaking
    • complexity can overwhelm users - a visual walk-through of all the features and set-up would be great

Citations

It appears to be a simpler version of Bibnotes Formatter (but was available earlier) without the ability to import annotations into Obsidian. Looks great for getting Zotero metadata into Obsidian and citing references.

Zotero Desktop Connector

This one only partially works with Zotero 6, so my experience is limited. The best thing about it is the very easy set-up and inserting of citations (it currently doesn’t work with exporting the annotations, but this does seem to work with an older version of Zotero). So far, no customisation. If the ease of set-up was married with the customisability of Bibnotes, that would be awesome!

Obsidian Zotero Plugin

Seems in planning, I haven’t been able to try it yet.

How to pick a plugin for your workflow

Here’s a handy flowchart… :smirk:

Wishlist

In general, the main points seem to be easy customisation in the form of templates for the extracted metadata and annotations, including some automations (e.g. sort by annotation colour), and potentially larger automations (keep everything up-to-date without having to manually update a changed Zotero item).

See also:

As inspiration, I would add that ZoteroRoam does just about all of the things on the wish list, and more, including downloading and adding references parsed from the reference list inside the PDF(!!!). See here: 2c. On-page menu - zoteroRoam. I haven’t used it in some time, so who knows what other magic it can now do. I will say that the templates (‘funcmaps’) are extremely novice-unfriendly.

Wish list (in plain English):

  • Insert a full reference in a particular reference style (e.g. Chicago 16th full note)
  • easy templating function that allows for multiple transformations (this seems to be a problem according to Stefano and Micha for Bibnotes, e.g. I can add a prefix, but not a newline after the prefix)
  • fine-tuning pre-pending and appending things (Zotero fields, or general text) to the extracted highlights (e.g. headers, page numbers, etc)
  • creating field types (not sure this is the right term): similar to assigning a colour to become a ‘keyword::’ in Bibnotes, it would be great to be able to use different colours to create things like ‘primary sources::’ or ‘related to::’ or ‘to read::’ where the highlighted text doesn’t end up in the annotations, but in the metadata section (just like keywords currently do in Bibnotes)
  • allowing for multiple templates in use
    • sometimes my research template doesn’t make sense for things I read for lectures, or just fun things I came across…so, commands like ‘extract to research template’ (assign a hotkey?) or ‘extract to lecture template’ might be an option
  • I personally like the idea of connecting to semantic scholar or scite.ai or researchrabbit a la ZoteroRoam…but it’s probably more on the ‘fun’ than the ‘necessary’ side…
  • add option to import annotations as individual notes (see PDF - Zotero - Obsidian: Current state and collaboration for the ONE plugin? - #4 by erazlogo)
  • be able to add archive field from Zotero (see PDF - Zotero - Obsidian: Current state and collaboration for the ONE plugin? - #4 by erazlogo)
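
To illustrate the colour-based wishes above (sorting by annotation colour, and colours that feed a metadata field instead of the annotation list), here is a rough Python sketch. It is not how Bibnotes or any other plugin actually works: the colour codes, the field names, and the `colour`/`text`/`page` keys of each annotation are all made up for the example.

```python
from collections import defaultdict

# Hypothetical colour assignments; hex values and names are illustrative only.
COLOUR_TO_FIELD = {
    "#ff6666": "keyword",   # text goes into the metadata section
    "#5fb236": "to read",
}
COLOUR_TO_HEADING = {
    "#ffd400": "Key points",  # text stays in the annotation section
    "#2ea8e5": "Methods",
}


def split_annotations(annotations):
    """Separate annotations into metadata fields and grouped note sections."""
    metadata = defaultdict(list)
    sections = defaultdict(list)
    for ann in annotations:
        colour = ann["colour"].lower()
        if colour in COLOUR_TO_FIELD:
            metadata[COLOUR_TO_FIELD[colour]].append(ann["text"])
        else:
            heading = COLOUR_TO_HEADING.get(colour, "Other")
            sections[heading].append(ann)
    return dict(metadata), dict(sections)


def render(annotations):
    """Render 'field:: value' lines first, then one heading per colour group."""
    metadata, sections = split_annotations(annotations)
    out = [f"{field}:: {', '.join(values)}" for field, values in metadata.items()]
    for heading, anns in sections.items():
        out.append("")
        out.append(f"## {heading}")
        for ann in sorted(anns, key=lambda a: a["page"]):
            out.append(f"- {ann['text']} (p. {ann['page']})")
    return "\n".join(out)


sample = [
    {"colour": "#ffd400", "text": "Main claim", "page": 3},
    {"colour": "#ff6666", "text": "note-taking", "page": 1},
    {"colour": "#FFD400", "text": "Earlier claim", "page": 2},
]
print(render(sample))
```

Supporting multiple templates (research vs lecture) could then simply mean keeping several such colour maps and picking one per command.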

Alright, over to you - discuss! :wink:

EDIT: Ps. mods, feel free to move this topic if you feel it would fit better elsewhere…

41 Likes

This plugin has not worked well for my needs, primarily because the extracted highlights have spaces removed between words. This is visible in the demo video the developer made, was raised in GitHub issues, and there has been no response or development in over a year.

1 Like

A major obstacle to this new version of Zotero integrating with Obsidian (or anything) is that they’ve added a PDF reader, which will benefit many users. The problem is that unlike every other PDF reader that I’ve ever seen, annotations are not stored in the PDF, but in the app’s database as metadata about the PDF. This is especially a problem for folks that want to have their Zotero items linked to the PDF files stored locally, rather than synced through Zotero’s paid service. And if you want to read PDF files with the new mobile Zotero, you must use their file sync service, as linked files aren’t supported.

It is possible to have Zotero write the annotations into the PDF file, but this must be done manually, has no keyboard shortcut, and cannot be done to multiple files at once.

It is possible in Zotero 6 to attach a note to an item (like a PDF) that contains the annotations (both the PDF standard annotations contained in the PDF file and the non-standard new Zotero 6 “annotations”), but if you do this with multiple items selected, it fails to operate on all of the items with no error message, and if you do it more than once, it just adds a new note, leaving the older note of extracted annotations. Bibnotes formatter does a great job of capturing these, but the problem is having to manually update the extracted annotation notes in Zotero one by one.

There doesn’t seem to be any way to use the new Zotero PDF reader to annotate and have those show up in Obsidian in any kind of automated manner.

7 Likes

Many thanks, @Kabo, for this compilation and the wish list. This also helps me trying to develop an independent third-party solution which has many similar goals. There are many individual efforts happening redundantly…

Ideally, we’d have a common (i.e. widely accepted & usable) exchange format for annotation and reference data. The latter has at least some (not really standardized but common) formats (like BibTeX, or even better, Citeproc/CSL JSON).

Then there could be a tool solely dedicated to converting these data into desired output formats using a common template syntax. There could be a public repository of templates where templates could be shared easily.
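
To make the idea a bit more concrete, here is a minimal Python sketch of such a converter, assuming the reference data arrives as CSL JSON and the annotations as a plain list of dicts. The template text, the `text`/`page` annotation keys, and the function names are invented for illustration; Python's built-in `string.Template` merely stands in for whatever common template syntax might eventually be agreed on.

```python
import json
from string import Template

# Hypothetical note template; placeholder names are illustrative only.
NOTE_TEMPLATE = Template(
    "# $title\n\n"
    "authors:: $authors\n"
    "year:: $year\n"
    "citekey:: $citekey\n\n"
    "## Annotations\n\n"
    "$annotations\n"
)


def format_authors(item):
    """Join CSL JSON author names as 'Given Family', comma-separated."""
    names = []
    for a in item.get("author", []):
        if "family" in a:
            names.append(" ".join(p for p in (a.get("given"), a["family"]) if p))
        elif "literal" in a:
            names.append(a["literal"])
    return ", ".join(names)


def issued_year(item):
    """Read the year from the CSL JSON 'issued' date-parts, if present."""
    parts = item.get("issued", {}).get("date-parts", [])
    return str(parts[0][0]) if parts and parts[0] else "n.d."


def render_note(item, annotations, template=NOTE_TEMPLATE):
    """Fill the template with reference metadata and quoted annotations."""
    lines = [f"> {a['text']} (p. {a['page']})" for a in annotations]
    return template.substitute(
        title=item.get("title", "Untitled"),
        authors=format_authors(item),
        year=issued_year(item),
        citekey=item.get("id", ""),
        annotations="\n\n".join(lines),
    )


# Made-up sample data in CSL JSON shape.
csl_items = json.loads("""
[
  {
    "id": "doe2020",
    "type": "article-journal",
    "title": "An Example Article",
    "author": [{"family": "Doe", "given": "Jane"}],
    "issued": {"date-parts": [[2020, 5]]}
  }
]
""")
annotations = [{"text": "A highlighted passage", "page": "12"}]
note = render_note(csl_items[0], annotations)
print(note)
```

Sharing a template would then amount to sharing the template text alone, since the data side stays the same for everyone.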

When I started my own endeavours years ago, I tried to get people on board to agree on a common plaintext/Markdown format for annotation notes. That never worked out since, as you noted above, individual needs are way too different. But common data formats together with a common templating syntax may be (slightly) more realistic.

5 Likes

Oh I would LOVE seamless output to Word! I have badgered and begged a variety of people on this point! As I see it, the lack of easy integration with the world’s most used word processor is one of the major, if not THE, obstacle to wider adoption of Roam and Obsidian and the like…

however, I don’t think that’s the job of a plugin for ‘how to get your annotations and references into Obsidian’ - as it’s basically at the other end of that (how to get stuff out of Obsidian).

Pandoc is a book with 7 seals (in other words, a mystery) to me, and it is not friendly to users who aren’t technically inclined.

Have a look at this plugin, which imports the text of the PDFs (and processes them to do magic things), and I believe it has recently been updated to also import some metadata

Really great to have so many thoughtful responses and constructive issues raised!

Oh, I didn’t realize that the conversation was aimed only at getting things into Obsidian or any particular plug-in. I was thinking about the more general issue of “can I use Zotero and Obsidian to write academic articles and output them to Word?”

I also find Pandoc too technical and I don’t use command-line stuff. I currently use Zettlr for Pandoc conversion from Markdown to HTML or Word because it has Pandoc built in.

2 Likes

This is my current capture workflow for PDF into Obsidian through Zotero.

For a non-research workflow where I just want to replicate my standard media->instapaper->readwise->obsidian workflow, this is needlessly complex.

5 Likes

This once occurred to me as well. The reason for me was that I used ZotFile to manage PDF attachments, instead of the Zotero data folder. This means that those files were stored outside the Zotero data folder, which is not supported for syncing PDF files to other devices yet, whether the Zotero official service or WebDAV is used to sync those files.

To convert linked files that are managed by ZotFile, one can use the following method:
Tools > Manage Attachments > Convert Linked Files to Stored Files...

After this, WebDAV can be used to sync PDF files to other devices, e.g., an iOS device.

2 Likes

There is a way to add the right click menu item back via another Zotero plugin called Zutilo, and I believe there will soon be an update with Mdnotes, or the workflow description… (@QuarkQuartet )

I just came across this topic and am very thankful for that. I’m an Obsidian newbie, so I may have some things wrong. But here are some random observations:

  1. I’m a great believer in the Unix principle that a tool should do one thing, and only one thing, well.

  2. With this in mind, IMHO this discussion conflates too many functions. The most obvious to me is (1) extracting metadata from a Zotero item to create a “source note” in Obsidian versus (2) extracting annotations from a pdf attached to a Zotero item.

  3. The discussion of extracting annotations glosses over the wide variety of ways people annotate pdfs.

  4. One example is a pdf that began life as a physical book or printed journal article and then was marked up with pen, pencil, and highlighter. At a later date, the printed & annotated document was scanned and converted into a pdf. On this electronic document, annotations are intelligible as such to humans but not necessarily to software that expects annotations to have a distinct digital format.

  5. Another example is how some software can only handle some kinds of annotations, even when they are all done electronically. For example, for many years I have drawn straight lines in the margin, parallel to the page’s left & right edges, to identify important points in the document; I use one line for passages that are somewhat important and five parallel lines for extremely important passages. And I do this all on my iPad. Yet I’ve not found anything that can extract the passage as text and the lines as markup. For example, Matthew Meyer’s Obsidian-Zotero-Integration plugin extracts only: highlights, underlines, strikethroughs, notes, and rectangles. In turn, this limited set results from using an external pdf utility.

  6. Since not all annotation styles are alike, any comparison of different software or workflows should include a list of exactly what kinds of annotations and annotation styles are supported.

  7. Cobblepot mentions pandoc, and several other posters mentioned converting Obsidian documents to Word. Again, this presumes what, according to the “do 1 thing well” principle, should not be presumed. For example, I do lots of technical work, and therefore use LaTeX for writing. (Often with the LyX front end.) In this workflow, it’s actually a feature, not a bug, to use an intermediate .bib file rather than import directly from Zotero to Obsidian.

  8. Closely related to this is the issue of the size of the Zotero library. I currently have over 17,000 items in mine. Sure, it’s a convenience to be able to access the entire library, but not at the cost of degrading performance. If I’m writing a paper, I may use only 100 sources to write it, with the final draft having perhaps 50; even a book will typically have at most a few hundred sources. Hence, if one uses a project-specific collection in Zotero along with the Bib(La)TeX plugin, instead of making an Obsidian import plugin search the entire library, one can restrict its searches to the project’s collection.

  9. My workflow is typically (a) initialize a Zotero bibliographic entry, (b) attach a searchable pdf to it, (c) use Zotfile to store the document on my NAS drive (mounted via WebDAV), (d) use PDF Expert on my iPad to download the document from the NAS drive, (e) read, annotate, and markup the document on the iPad, (f) change the name of the annotated document by adding “(marked)” to its end, (g) upload the marked document back to the NAS drive in the same folder as the original, and (h) use Zotfile to link to this second version of the document. This allows me to keep clean and marked copies of the original document, in case, e.g., I want to use one for a class or handout.

  10. This system works quite well, and I see no advantage to using a device that does not allow drawing directly on the screen. If the NAS drive sits behind a firewall and can be accessed over the Internet, the whole issue of synchronization becomes moot. Of course, if a team is working on the project, things become more complicated. But then, isn’t this what git is for?
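
As a rough illustration of point 8, here is a small Python sketch of restricting lookups to a project-specific .bib export instead of the whole library. It assumes the library is available as a dict keyed by citekey, which is a simplification for the example; the regular expression is a naive matcher for entry headers, not a full BibTeX parser.

```python
import re

# Naive pattern for entry headers such as "@article{doe2020,".
ENTRY_RE = re.compile(r"@(\w+)\s*\{\s*([^,\s]+)\s*,")
# BibTeX blocks that are not bibliography entries.
NON_ENTRIES = {"comment", "string", "preamble"}


def project_citekeys(bib_text):
    """Collect the citekeys of all entries in a .bib export."""
    return {
        key
        for entry_type, key in ENTRY_RE.findall(bib_text)
        if entry_type.lower() not in NON_ENTRIES
    }


def restrict_to_project(library, bib_text):
    """Keep only the library items whose citekey appears in the project .bib."""
    keys = project_citekeys(bib_text)
    return {k: v for k, v in library.items() if k in keys}


# Made-up sample data.
project_bib = """
@article{doe2020,
  title = {An Example Article},
  year = {2020}
}
@book{smith2018,
  title = {A Book}
}
"""
library = {
    "doe2020": "An Example Article",
    "smith2018": "A Book",
    "other1999": "Unrelated Item",
}
print(restrict_to_project(library, project_bib))
```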

5 Likes

This thread is great, but, so far, it doesn’t really deliver on the second part of its title: “collaboration for the ONE plugin?” I guess that answers the question mark after “ONE plugin”… it is extremely difficult to build the one plugin that most people will be happy with. And there is nothing wrong with having many plugins around.

What makes the current situation somewhat confusing for anyone trying to figure out their own workflow is that

  • there are two apps involved (obsidian and zotero), each with their own plugins and features, so that it is not the easiest task to wrap your head around which tasks you should best do with which app/ plugin.
  • Zotero 6 has rendered many workflows documented on the net obsolete and it is not always evident which ones. It’s probably a good idea to treat any tutorial that doesn’t mention zotero 6 as obsolete (though, technically this is not necessarily true).
  • as others have mentioned, we are talking about two different (but related) tasks: using zotero references in obsidian and bringing annotations into obsidian. (My own preliminary solution for combining the citations plugin for references with the mdnotes plugin for annotations became obsolete before I even got down to trying it out)

I am not sure what exactly the intention with the (yet unfinished) Obsidian Zotero Plugin is but since it seems to include both a plugin for Zotero and one for Obsidian, it looks like a promising project because maintaining both ends of the bridge in the same project probably makes things a lot easier for the user.

But what about the Zotero Integration plugin? Why hasn’t anybody mentioned it here? It has less than half the user-base of the citations plugin, but it is still significant:

Here are some other topics that mention it:

Another plugin that might be worth mentioning here is Zotero-markDB-connect, which - in zotero - provides links to notes in obsidian.

The final thing I want to mention is a feature of Zotero itself that is easily forgotten when we talk about various plugins: Quick export. You can use it to simply drag and drop an annotation from the Zotero pdf reader into Obsidian and it will give you the annotation text + comments together with a reference to the source. If you hold shift while dragging and dropping, the reference will include a (deep-)link to the page of the pdf which, when clicked will open the annotation for you in zotero.

When I found out about this gem, my first thought was: do I still need any plugin now? Answer: yes, for references, but probably not for annotations. It depends on how you work with annotations. If you simply want all annotations/highlights in obsidian, then a plugin is the way to go. But I’m trying to get away from hoarding highlights that I’ll never look at again, and instead distill from those highlights what I really want to keep in obsidian. And for that, dragging and dropping notes into obsidian seems like the best way forward.

Zotero is, then, a repository for notes and annotations that I made while reading and I can bring them over into obsidian once I actually want to do something with them.

7 Likes

I would like to have a full citation imported with the metadata section so that I can quickly copy it without having to go back to Zotero or another program to get it. Not in the sense of a reference manager, but because I might be composing an email and want to share the full details of the reference, or as part of a lecture draft.

I recently noticed the Pandoc Reference List plugin which is really great for having full citations at hand inside Obsidian while working with much shorter citations and links inside the markdown documents themselves.
It does rely on pandoc in the background, and kind of creates a bibliography in a separate sidebar pane similar to the local graph of a note.

As long as you do not need to regularly use different reference styles, this plugin makes it very easy to share references directly from Obsidian :slight_smile:

1 Like

Update on the obsidian-zotero project: the first public beta of obsidian-zotero-plugin is available for testing! The documentation isn’t complete yet, but you can find the installation guide here if you want to try it out: Installation | Obsidian Zotero

The plugin is still in beta, so please make a proper backup of your data before and while using this plugin. If you have any questions or suggestions, please feel free to open a discussion or issue on GitHub: GitHub - aidenlx/obsidian-zotero: Obsidian.md integrates with Zotero, create literature notes and insert citations from a Zotero library.

Main features

  • :zap: Performance
    • Reads data directly from the Zotero database, no need to export data in a text-based format
    • Fast fuzzy search for literature right within Obsidian
  • :unlock: Full access to Zotero data
    • All data is available in Obsidian, including annotations, notes, tags, and attachments.
    • Not restricted by the Zotero API or Better BibTeX.
  • :hammer: Highly customizable templates
    • Write your own templates to generate literature notes and import annotations in any format you want.
    • Powered by Eta: write JavaScript code in your templates to handle complex transformations.
    • View all data for a literature item in a structured way by using DetailsView.
  • :mag: Annotation View
    • View annotations within Obsidian, side-by-side with the literature note.
    • Drag an annotation to the literature note to import it.
    • Always up-to-date: auto-syncs whenever you change an annotation in Zotero.
  • :pencil: Annotation import
    • Import image and text annotations from Zotero to Obsidian.
    • Keep your imported annotations up-to-date with Zotero.
  • :writing_hand: Create literature notes with ease
    • Quick switcher to create a literature note and insert a citekey for any item in your Zotero library.
    • Open a literature note in Obsidian from the Zotero item page. (not yet implemented)
16 Likes