Formatting issues with PDF plugins in Obsidian as well as with copy-pasting

Hi everyone,

Have been loving Obsidian, but am having one big, constant issue that dramatically slows down my workflow. I am writing a PhD, and would ideally be going through like a book a day, and taking lots and lots of stuff from many of them, not just one idea that I can put in my own words.

Copy-pasting into Obsidian is tedious, especially because the formatting gets messed up, and I get a lot of unwanted “enjamb- ments”. Also, you get two linebreaks before each paste, so copy-pasting text over a page break requires lots of extra clicks or key-presses. The community plugins to extract highlights from .pdfs have the same issues with formatting.

Even though I’m an academic, Zotero is still completely beyond me, and I don’t have time to learn it right now, in case that would be a solution. I read mostly books rather than articles, and I obviously pirated most of my thousands of .pdfs, so I don’t have the metadata and have not yet had to face up to the challenge of actually entering it all. God save me in a year or two…

Is there some workaround for the formatting issues I seem to get with every single pdf with both copy-pasting and with the Obsidian plugins? How do other folks deal with this issue?

Thanks!
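
As an interim workaround for the two symptoms described above (words split like “enjamb- ments” and stray line breaks around each paste), the pasted text can be cleaned with a small script before it goes into a note. The following is only a rough sketch, not something taken from Obsidian or any plugin: the function name `clean_pdf_paste` is made up for illustration, and the heuristic assumes that a hyphen followed by a space or line break between two lowercase letters is a broken word, which will occasionally be wrong (for example “pre- and post-war”).

```python
import re


def clean_pdf_paste(text: str) -> str:
    """Tidy text copied from a PDF (hypothetical helper, heuristic only)."""
    # Normalise Windows/old-Mac line endings to "\n".
    text = text.replace("\r\n", "\n").replace("\r", "\n")

    # Rejoin words broken at a line end: "enjamb- ments" or "enjamb-\nments".
    # Assumption: lowercase letter + hyphen + whitespace + lowercase letter
    # is a split word. Hyphens with no whitespace ("well-known") are kept.
    text = re.sub(r"(?<=[a-z])-(?:[ \t]+|[ \t]*\n[ \t]*)(?=[a-z])", "", text)

    # Treat blank lines as paragraph breaks; inside a paragraph, turn single
    # line breaks and runs of whitespace into one space.
    paragraphs = re.split(r"\n\s*\n", text)
    cleaned = [" ".join(p.split()) for p in paragraphs]

    # Drop empty paragraphs (e.g. the leading blank lines of a paste).
    return "\n\n".join(p for p in cleaned if p)


if __name__ == "__main__":
    sample = "\n\nI get a lot of unwanted enjamb-\nments when\nI paste.\n\nNext paragraph."
    print(clean_pdf_paste(sample))
```

Run on its own, the script prints the sample with the broken word rejoined and the hard line breaks removed, while keeping the paragraph break. Any remaining false positives would still need a quick manual check.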

Can only recommend this: PDF management via Zotero. (The retrieve metadata function will save you more time than you’ll invest in learning how to operate it.) They have excellent documentation. Two crucial plugins in Zotero make all the difference: (a) Zotfile, which retrieves all your annotations and highlights from your PDF into Zotero, and (b) mdnotes, which will turn those notes into markdown (and probably remove most of your unwanted enjambments).
I also use a piece of software called Hook to create deep links (in markdown) connecting my PDFs with their respective Obsidian notes. (The app is ingenious and creates links between just about anything that you’re likely to need.)

Getting to grips with the complexities of learning and establishing a workflow with new apps may sound daunting. But if you consider your PhD and the mass of stuff you need to process, excerpt, retrieve, and keep track of, a few solid apps and a clear workflow will be your friends. I practically live in Obsidian and, like you, have a sizable digital library; the above setup – after many other trials – works well. Good luck with your work.


OK, thanks Marc! I guess it wouldn’t be the worst thing in the world to actually get Zotero up and running. I knew at some point I would have to in any case… I’ll continue with my copy-paste ways till after my next deadline, then settle in to some data entry and learning the ropes.

If anyone did happen to have an interim solution, I’m all ears!

Just by the by, when you say “learning to operate” the retrieve metadata function in Zotero, do you mean learning to find ISBN numbers for the books and letting the program do the rest? Seems like that should be the quickest way, right?

Oh, and are you saying that the formatting issues that exist with the other PDF plugins in Obsidian don’t exist with Zotfile?

Thanks again!

Install the Zotero browser extension. In 90% of cases, you’ll just be able to tap that button on the resources you’re using and download all metadata straight into Zotero.

Zotero is also able to generate metadata automatically from many PDFs.

Re metadata: @ryanamurphy stated what I didn’t even mention: the Zotero Connector (a plugin for your browser of choice) can often grab the metadata from one of the many catalogues and publications that expose metadata (WorldCat, Amazon and Google generally do a good job for books). You can also use the built-in “retrieve metadata for pdf” from within Zotero, which generally does a good job.

PDFs: I personally don’t use Obsidian’s plugins for PDFs. Since I manage them with Zotero and use Zotfile to retrieve my annotations and highlights from the actual PDF, I let mdnotes (in Zotero) transform them into Markdown and bring them into Obsidian as markdown files.

Re the Obsidian PDF plugins: Alexis (@akaalias) who made some of them might help if you address your feedback to him.

Does Zotfile do a good job of preserving formatting of highlighted text when it converts to .md? This would for sure be a draw for me.

I spent another 3-4 hours tinkering around with Zotero, and I don’t understand everyone’s optimism about it. The native ‘retrieve metadata’ operation, for example, seems to fail a very large proportion of the time to find anything at all, even on fully-OCRed, great scans of popular texts. The plugin on Firefox is similarly, persistently faulty. For example, it sometimes fails to input ISBN numbers which are clearly visible on the web page, and thus only creates half-filled entries in desktop Zotero. Finally, I started creating items by inputting the ISBN manually, and it still has a tendency to leave out vital bibliographical information, such as co-editors of collections of essays, and the like. Not to mention consistently failing to capitalize titles (though it’s possible that is a settings issue which I haven’t discovered)…

I guess I feel like if I have to manually check every single entry, which I can’t see any way round at the moment, I might as well just do the whole thing manually and at least be sure there aren’t any mistakes?

> Does Zotfile do a good job of preserving formatting of highlighted text when it converts to .md? This would for sure be a draw for me.

This is a random sample: annotated PDF > annotations extracted with Zotfile into Zotero > changed to md via mdnotes within Zotero. The resulting md file is saved to a specific folder in my Obsidian Vault via an alias of that folder in my Downloads target. (Left: md file; right: Zotero note.)

Sorry to hear you’re disappointed with the performance of metadata retrieval. I generally touch up some of the details per entry but overall have good results with the automated retrieval - YMMV. Your best bet may be to find a good catalogue site in your field that exposes high-quality metadata and grab it from there.

Thanks very much for showing me this. The quality looks good, and I guess I hadn’t fully appreciated how incredibly convenient it is to have a link to open the page my extract is taken from… Real game-changer actually.

I will definitely put the time into getting Zotero set up in the coming weeks having seen this. Cheers.
