Block references for PDFs and PDF highlighting in Obsidian

I think it would be a great improvement for knowledge workers if you could add a block reference function for PDFs, together with the ability to highlight PDFs in Obsidian. That way, one could use Obsidian as note management software like Citavi (reference management and knowledge organization), but with all the advantages of Markdown syntax.


I think it’s an important request, and most scholars need it, but it may pull the project out of focus. Obsidian is built on Markdown so that users are not locked in to a vendor, and more feature requests like this will slowly push us into vendor lock-in. In my opinion, there is a need for a lightweight text format somewhere between LaTeX and GFM.

Could you please elaborate a bit on that? I’m not an insider, so I still don’t get what the “focus” of Obsidian is, if there is a single focus at all.

In general, fully featured PDF managers are expensive.
Well-featured PDF editors aren’t cheap either. You can try free versions, but they all have disadvantages.

And they’re not markdown.

Highlighting a PDF can be done as an edit (i.e. it changes the readable/printable PDF) or as a layer (where the highlighting is stored within the program and the underlying PDF is unchanged).
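To make the "layer" option concrete, here is a minimal sketch in Python of how highlights could be kept in a sidecar file next to the PDF, so the PDF itself is never modified. The `Highlight` fields, the `.highlights.json` naming, and the function names are all assumptions for illustration, not how any existing tool or plugin stores its data.

```python
import json
from dataclasses import asdict, dataclass
from pathlib import Path


@dataclass
class Highlight:
    """One highlight kept outside the PDF (hypothetical layout)."""
    page: int
    rect: tuple[float, float, float, float]  # region on the page
    note: str = ""


def sidecar_path(pdf_path: Path) -> Path:
    # Assumed naming scheme: "paper.pdf" -> "paper.pdf.highlights.json"
    return pdf_path.with_name(pdf_path.name + ".highlights.json")


def dump_highlights(highlights: list[Highlight]) -> str:
    # Serialise the layer as JSON text; the PDF is not touched.
    return json.dumps([asdict(h) for h in highlights], indent=2)


def load_highlights(text: str) -> list[Highlight]:
    # JSON has no tuples, so rebuild the rect as a tuple when loading.
    return [
        Highlight(page=item["page"], rect=tuple(item["rect"]), note=item["note"])
        for item in json.loads(text)
    ]
```

Writing `dump_highlights(...)` to `sidecar_path(pdf)` would then leave the original file byte-for-byte unchanged, which is the whole point of the layer approach. The trade-off is that other PDF readers will not see these highlights.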

Doing anything with blocks requires text. If the text isn’t already available separately from the page image (as in a scanned PDF), then OCR will be needed. The same goes for line highlighting.

If you’re doing this for PDFs, there will be requests for doc, docx, odt etc (which Citavi also does).

And reference management, citing etc. would be another set of functions on top of all this.

It would take a humungous plugin to do all this.

What might be more feasible would be text scraping from documents (maintaining a link to the original file), converting the text into a markdown format (there would need to be a way to manage images), and applying block identifiers to blocks in the text. The plugin would need simple ways of adding notes and highlights. But I don’t see someone doing this in the very near future.
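As a rough illustration of the last two steps, here is a minimal Python sketch that takes text already scraped from a document, turns it into Markdown paragraphs, keeps a link to the original file, and appends an Obsidian-style `^id` block identifier to each paragraph. The function name and the hash-based id scheme are assumptions made for this example. The text extraction itself and the handling of images are left out.

```python
import hashlib


def add_block_ids(text: str, source_link: str) -> str:
    """Convert extracted plain text into Markdown with block identifiers."""
    # Treat blank lines as paragraph boundaries.
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]

    out = [f"Source: {source_link}", ""]
    for para in paragraphs:
        # Collapse the hard line breaks that extraction tends to leave behind.
        flat = " ".join(para.split())
        # Hypothetical id scheme: first 6 hex digits of a content hash,
        # so the same paragraph always gets the same identifier.
        block_id = hashlib.sha1(flat.encode("utf-8")).hexdigest()[:6]
        out.append(f"{flat} ^{block_id}")
        out.append("")
    return "\n".join(out)
```

Calling `add_block_ids(extracted_text, "[[paper.pdf]]")` would produce a note whose paragraphs can each be referenced from other notes by their identifier. Adding notes and highlights on top of that would still need its own interface.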

I do see this as a plugin suggestion rather than a feature request.

I see your point. I wasn’t asking for an Obsidian version of Citavi, though, but for something (maybe) more humble, like the last thing you mentioned:

What might be more feasible would be text scraping from documents (maintaining a link to the original file), converting the text into a markdown format (there would need to be a way to manage images), and applying block identifiers to blocks in the text. The plugin would need simple ways of adding notes and highlights. But I don’t see someone doing this in the very near future.

I would therefore push to have something like this in the future. I will put something in the plugin suggestions category as well. Thanks for your advice.


I think with the new release, 0.10.8, Obsidian is heading in that direction.

There is still a need for a highlighting function for PDFs, though. I will write a feature request.


I had similar requirements to yours, and since there was no way to meet them in Obsidian, I wrote a plugin for it. The plugin lets you embed individual PDF pages, or even extracts from them, in a note. I use it with the concept of source notes as a bridge between the PDF and my Obsidian notes.

A bridge note could look like this:

Link: [[Fuhren in die postagile Zukunft.pdf#page=110]]
Topics: [[Führung]]
Type: #literatur #schaubild

{
	"url" : [[Fuhren in die postagile Zukunft.pdf]],
	"page": 110,
	"scale" : 1.5,
	"rect": [40,355,270,400]
}

Take a look at the Better PDF plugin.

Maybe you could help out with Keypoints development so that it finally gets released.
That would cover all steps of academic PDF workflows.


I would also appreciate a humble version of PDF highlights and references.

The extra layer could extract the highlights and comments to an HTML file with all the data the tool needs, and then export only the highlights and comments to Markdown, which could be referenced in Obsidian.
Instead of an identifier code as in normal block references, the quotes and comments could have links.
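A minimal Python sketch of that export step, assuming the highlights and comments have already been pulled out of the PDF by some other tool: each quote is written as a Markdown blockquote followed by a link back to the page it came from, using the `[[file.pdf#page=N]]` link form. The `Annotation` class and the function name are made up for this example.

```python
from dataclasses import dataclass


@dataclass
class Annotation:
    """A highlighted quote with an optional comment (hypothetical shape)."""
    page: int
    quote: str
    comment: str = ""


def annotations_to_markdown(pdf_name: str, annotations: list[Annotation]) -> str:
    lines = []
    # Export in page order so the note reads like the document.
    for a in sorted(annotations, key=lambda a: a.page):
        link = f"[[{pdf_name}#page={a.page}]]"
        # The link takes the place of a block identifier.
        lines.append(f"> {a.quote} ({link})")
        if a.comment:
            lines.append("")
            lines.append(a.comment)
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"
```

Because every quote carries its own link, the exported note stays useful even if the intermediate file is thrown away.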

Bumping this thread in case there are other workarounds. I’m currently using DT for PDF management and Obsidian for my text.

Having a workflow like the one implemented in Logseq would be great. See the YouTube video “Using Logseq for PDF note-taking”.


That looks great :open_mouth: