Built-in Library for PDF and epub workflow (read, highlight, annotate, extract highlights & annotations to a note, synchronized)

Use case or problem

I’d love to have a ==built-in PDF and epub workflow==. It’s essential to my research and creative work, since I have to go through many books, papers, articles, magazines…
By this I mean:

  • being able to read PDFs and epub files in Obsidian, either uploaded to the vault or, when preferred, embedded from an external URL/source (e.g. Google Drive, Calibre, a website) so the vault storage doesn’t balloon.
  • whether the file is uploaded to the vault or embedded, being able to highlight and annotate the PDF/epub in Obsidian, and to extract those highlights and annotations to a note (1 note per file) that updates itself with each new highlight/annotation. Each highlight/annotation keeps its corresponding position, and each one is clickable, leading to that position in the actual file (optional).
  • It would be awesome to have these highlights and annotations synchronized for embedded PDF/epub files, so if I continue reading the file in the Google Drive app or the Calibre app and keep highlighting/annotating there, the changes show up in the Obsidian note where the file is embedded (because it is the same file) AND its corresponding highlights/annotations in the note get updated too.
  • It would be extra awesome to include a Kindle device as another external source, so whether I’m reading a purchased book or an uploaded PDF/epub file there, I get synchronized highlights and annotations for each file, with their positions/pages (optional).

This way I get to have my entire workflow in Obsidian. And in the exceptional case that I have to keep highlighting or annotating in the external source (Drive, Calibre…) for some time, it gets updated, so no time or work is wasted. ==The main reason for this feature is to have it all in Obsidian, with the infinite benefits that this has,== starting with the obvious one that having it all in one place is at least liberating. Other benefits: powerful, effective and valuable workflow, being able to process and consult a larger database that keeps feeding itself, work with this data in many ways, deep learning, quality research, quality and deep writing, serendipity, value creation, productivity, focus, innovation, deep and exponential thinking, systematization, processes, automation…

Proposed solution

Creating a ==built-in Library== where you can upload PDF and epub files directly, AND also where you can embed PDF and epub files via URL when you don’t want to upload them, so the vault doesn’t get huge. Then, you can read, highlight and annotate these files.

LIBRARY

The Library is organized by Sources (files), Authors and Highlights (Annotations are displayed under their corresponding Highlights when they exist).

When you click on Library, all Sources, Authors and Highlights (with Annotations when they exist) will be listed (like a dataview list that you can consult). Also an option “Add Source” will appear to create a new one.
Each source is listed showing: type (book…), author, title, number of highlights. (Metadata explained below).
Each author is listed showing: name, titles, number of highlights. (Metadata explained below).
Each highlight (and annotation) is listed showing: title. (Metadata explained below).

In Library’s view:

When you click on Sources, you go to a Sources Note, where all Sources are listed with their cover, title, author, type and number of highlights. If you click on one of these Sources you go to its individual note, the “Source Note” (explained below: SOURCE Note - individual).
Also here you have the option “Add Source”.

When you click on Authors, you go to an Authors Note, where all Authors are listed with their source titles and number of highlights. There you can “Add New Author”.
If you click on one of these Authors you go to that author’s individual note, the “Author Note” (explained below: AUTHOR Note - individual).

When you click on Highlights, you go to a Highlights Note, where all Highlights (and their Annotations when they exist) are listed under their corresponding Source or Author. You can group them by Source or by Author, choosing between the two with a toggle somewhere in the view.
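
The grouping described above could be sketched roughly like this. It is only an illustration of the idea, not an existing Obsidian API: `HighlightEntry`, `GroupKey` and `groupHighlights` are hypothetical names made up for this post.

```typescript
// Hypothetical shape of one highlight in the Library (not an Obsidian type).
interface HighlightEntry {
  text: string;
  source: string; // title of the Source the highlight comes from
  author: string; // name of the Author of that Source
}

// The two groupings the Highlights Note would offer.
type GroupKey = "source" | "author";

// Bucket highlights under their Source title or Author name,
// keeping the original order inside each bucket.
function groupHighlights(
  items: HighlightEntry[],
  by: GroupKey
): Record<string, HighlightEntry[]> {
  const groups: Record<string, HighlightEntry[]> = {};
  for (const item of items) {
    const key = item[by];
    const bucket = groups[key] || [];
    bucket.push(item);
    groups[key] = bucket;
  }
  return groups;
}
```

Switching the view between “by Source” and “by Author” would then just mean calling `groupHighlights` again with the other key.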

SOURCE Note - individual

Each Source is a Note and has its metadata (it can be customized or keep the default):

  • title:
  • author/s:
  • type: book, paper, article
  • published: year
  • cover: (image cover)
  • URL:
  • file: PDF, epub…
    Metadata is clickable.
    For example:
    A clickable author goes to the Author Note (explained below: AUTHOR Note - individual).
    A clickable URL or file leads to the previewed file in a “Reader feature” where you can read, highlight and annotate.
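
To make the metadata above more concrete, here is a minimal sketch of how a Source Note’s properties could be modelled and written out as YAML frontmatter. Everything in it is an assumption for illustration: `SourceMetadata` and `toFrontmatter` are hypothetical names, the field names simply mirror the list above, and authors are written as wikilinks only so that they stay clickable.

```typescript
// Hypothetical model of the Source Note metadata (not an Obsidian API).
type SourceType = "book" | "paper" | "article";

interface SourceMetadata {
  title: string;
  authors: string[];
  type: SourceType;
  published?: number; // year
  cover?: string; // path or URL of the cover image
  url?: string; // external location of the file, when embedded
  file?: string; // vault path of the PDF/epub, when uploaded
}

// Render the metadata as a YAML frontmatter block.
// JSON.stringify is used for quoting because JSON strings are valid YAML.
function toFrontmatter(meta: SourceMetadata): string {
  const lines: string[] = ["---"];
  lines.push(`title: ${JSON.stringify(meta.title)}`);
  lines.push("authors:");
  for (const author of meta.authors) {
    // Wikilink, so the author leads to the Author Note.
    lines.push(`  - ${JSON.stringify(`[[${author}]]`)}`);
  }
  lines.push(`type: ${meta.type}`);
  if (meta.published !== undefined) lines.push(`published: ${meta.published}`);
  if (meta.cover) lines.push(`cover: ${JSON.stringify(meta.cover)}`);
  if (meta.url) lines.push(`url: ${JSON.stringify(meta.url)}`);
  if (meta.file) lines.push(`file: ${JSON.stringify(meta.file)}`);
  lines.push("---");
  return lines.join("\n");
}
```

Optional fields are simply left out when empty, which matches the idea that the metadata “can be customized or keep the default”.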

If it’s easier to implement, an alternative is to display the previewed file first on the note, below the metadata, where you can read, highlight and annotate.

Each Highlight and Annotation appears in real time, in the order it appears in the file: below the previewed PDF/epub file if the preview comes first on the note, or at the top of the note if the previewed file opens in the “Reader feature” when clicking on URL or file.
Annotations appear under their corresponding Highlight, like a sub-item or indented item.
Each Highlight in the note has a menu: you can edit it (some files contain mistakes), copy it, annotate it, delete it, or find it in the file (through its position/page). In the Settings of this built-in feature you can choose whether the position/page is displayed in the listed Highlights, and whether the file title is displayed. The latter is useful when All Highlights (with linked Annotations) are listed under Library or under Author.

In this note you can also:

  • add a Highlight and an Annotation manually, by typing it.
  • edit or delete the Source (this note you are in).

If you copy one of these Highlights (and Annotations) and then paste it on any note, it’s pasted as a blockquote with its author and title.
For example:

While Freud contributed important insights regarding the psyche, many of which are compatible with IFS, his drive theory was highly influential and pessimistic about human nature. It asserted that beneath the mind’s surface lies selfish, aggressive, and pleasure-seeking instinctual forces that unconsciously organize our lives.
— Richard Schwartz, Ph. D, No Bad Parts
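
The copy-and-paste behaviour in the example above could be sketched as follows. This is a rough illustration under my own assumptions: `CopiedHighlight` and `toBlockquote` are hypothetical names, and the attribution line just imitates the format of the example.

```typescript
// Hypothetical shape of a copied highlight (not an Obsidian type).
interface CopiedHighlight {
  text: string;
  author: string;
  title: string;
}

// The dash used in the attribution line of the example above.
const ATTRIBUTION_DASH = "\u2014";

// Turn a highlight into a markdown blockquote followed by
// an attribution line with author and title.
function toBlockquote(h: CopiedHighlight): string {
  // Prefix every line of the highlight so multi-line text stays quoted.
  const quoted = h.text.split("\n").map((line) => `> ${line}`);
  quoted.push(`> ${ATTRIBUTION_DASH} ${h.author}, ${h.title}`);
  return quoted.join("\n");
}
```

Pasting would then insert the string returned by `toBlockquote` at the cursor, so the quote always carries its author and title with it.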

AUTHOR Note - individual

Each Author is a Note with its Sources and Highlights (with its Annotations). You can sort them by Sources or by Highlights.
If you click on any Source or Highlight (and Annotation) you go to its “Source Note - individual”.

In the Author Note you can edit (metadata: name, image, description; metadata can be customized or keep the default) or delete the author.
You can “Add New Source” for that author.

Current workaround

I’ve tried all the plugins I could find and none of them has worked out: either they don’t do what I described above or they have stopped working (and even when they did work, they didn’t entirely solve what I want).
I list them here:

  • Kindle Highlights
  • PDF Highlights
  • Extract Highlights
  • Extract PDF Annotations
  • Annotator
  • PDF Annotator
  • epub Reader
  • Epub Importer
  • PDF++
    • From the author: This plugin relies on many private APIs of Obsidian, so there is a relatively high risk that this plugin may break when Obsidian is updated (learn more). For this reason, I hope this plugin’s functionalities will be natively supported by Obsidian itself so that we won’t need this plugin anymore.
    • Also, it still doesn’t solve entirely what I want.

Related feature requests

I wrote these two messages before:

Previously, I found this in the forum under Share & showcase: My pdf and epub workflow. But it doesn’t work anymore, since the plugin no longer works optimally. Also, it didn’t entirely solve what I want.

THANK YOU!!!

Please stick to one clear feature request, thanks. Moved to Help.

Hi Ariehen, it’s just one feature I explained here, and its needed parts.
I deleted the previous topics with the same title because there was no option to edit them, so I had to create a new topic for the updated version. This is the final and only one.
If I’m missing something, please let me know. Thanks.

Well, I understand your concern about relying on third-party plugins in Obsidian and their fragility in the long run. All Obsidian users have their own way of using the application and their own custom workflow. Because of Obsidian’s main USP, that it stores everything offline, it gives its users a lot of power to customize. And plugin support makes it even more powerful.

As for the stability of a plugin against the main application: if you are using a specific plugin whose support has stopped but you still can’t work without it, you have the option to stick to an older version of Obsidian.

I really appreciate you sharing your ideas for integrated functionality in Obsidian itself for managing books, articles, etc. I was recently planning to build a plugin for the same purpose. My requirement was more of a speed reader, or in other words using PDF/epub books as audiobooks, which you can read more about here: Wanted a better approach to manage my offline books library - #6 by Tu2_atmanand

I have planned features like a custom view to show the homepage of the plugin, which will include :
- Recently read books
- The book’s thumbnail will be extracted from the first page of the PDF.
- There will be a progress bar for each book card.
- Each book card will have a button, to open the corresponding note, which will include the notes you are taking from the PDF.
- You can specify the folders from your PC, which might be outside the vault, to see all the pdf/epubs/audiobooks in a gallery format, to explore in a better way.
- You can also sort books based on Authors, year, genre, etc.
- Make collections to manually group books together.

From this post I got a lot more ideas, like having an online source. I originally planned to work with local files only, but your point about local storage filling up space is valid, and it will be easier to sync as well. The following is a rough UI design I had in mind for how the plugin will work, updated after the online source idea:

Now, my plugin will be largely based on the PDF++ plugin itself, since I have found it to be the best plugin for working with PDFs, adding highlights and taking the corresponding notes in a markdown file. In the future, I believe we can come up with better coding practices, so the plugin works better with Obsidian’s core APIs, but for now the plugin has been working without many problems, so it can be used in a daily workflow.

Also, the plugin I’ll be developing is nothing special. It’s just for proper management of all books, articles, audiobooks, etc., and this can also be done using the dataview plugin, but I am not a big fan of dataview and would prefer a custom view rather than seeing the gallery view inside a note. I am not sure if I can do the above using dataview, for example whether we can extract the thumbnails (the first page of the PDF) and show them in the cards. I have found one dataview script to do that, but it’s pretty manual, so it can’t detect the progress of the book and so on.

Edit : Here is the link to my repo for the plugin development : Library - plugin for Obsidian
Feel free to join the development or contribute by suggesting features and improvements for the plugin.

Yes, it can be done.
Also, the core Obsidian developers are friendly toward the community around these plugins (those used for academic work, etc.) and are obviously aware that fundamental API changes would break plugins.
In the case of PDF++, many things could be taken on board for core Obsidian, even.

Agreed. I am actually going through the codebase of PDF++ to understand what bad practices this plugin has incorporated. Because, I feel, we all use private APIs of Obsidian, and some are even unofficial workarounds. But just because a question was raised on the PDF++ Discussions, I think it gave a wrong impression of the stability of this plugin. All third-party plugins are unreliable anyway; only those which are maintained on a regular basis or which have fewer bugs are safer to use.

But I strongly wish the Obsidian team would look into providing better support for important plugins like this. PDF++ specifically has some cool features for annotation, highlighting and taking instant notes on those highlights.