I am currently writing a plugin that extends the annotation features of the internal PDF viewer. While testing my plugin for performance and memory consumption, I noticed that Obsidian keeps allocating more and more memory when PDF files are opened, closed, and re-opened over and over again (they can be the same files). After checking my plugin for potential memory leaks, I tested this on a sandbox vault. With 20 PDFs opened and closed in each iteration, approximately 20 MB of memory per iteration is not garbage collected. I observe this behaviour both for the pure sandbox vault and for the sandbox vault with my plugin.
Another observation: bigger PDF files leave more memory behind, though I haven’t tested this extensively.
Is this a bug or a feature (some caching)?
I’ve created some heap snapshots, which I can share if this is worth debugging.
I also noticed there are some detached nodes (detached elements in memory tab on dev tools) from the PDF viewer DOM tree. Could this cause the persistent memory overhead?
Did you follow the troubleshooting guide? Yes
Steps to reproduce
1. Use a sandbox vault
2. Get some valid PDF files (the bigger the files, the greater the memory impact)
3. Open either the same PDF file multiple times (in different tabs; they do not need to be visible as split tabs) or different PDF files. For a clear effect, open about 15 small PDFs.
4. Wait some time for async jobs and/or garbage collection to settle (30 sec is usually sufficient; alternatively, trigger garbage collection manually at dev tools → Memory tab → brush symbol)
5. Close all PDF file tabs
6. Wait some time for async jobs and/or garbage collection to settle (30 sec is usually sufficient; alternatively, trigger garbage collection manually at dev tools → Memory tab → brush symbol)
7. Measure memory allocation
8. Repeat steps 3 - 7 at least 3 times
Expected result
Memory allocation after closing all PDF files should fall back to the level before opening.
Actual result
Part of the memory increase after opening the PDF file tabs persists after closing the PDF file tabs. This accumulates on repeats.
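To put numbers on this accumulation, the readings from each open/close cycle can be compared against the baseline. The following TypeScript is only a minimal sketch under assumptions: `CycleSample`, `LeakReport`, and `analyzeCycles` are hypothetical names, not part of Obsidian or PDF.js, and the heap readings are assumed to be noted by hand (for example from the dev tools Memory tab) after garbage collection has settled.

```typescript
// Heap readings in bytes, each taken after garbage collection has settled.
interface CycleSample {
  afterOpen: number;  // reading while all PDF tabs are open
  afterClose: number; // reading after all PDF tabs have been closed
}

interface LeakReport {
  releasedPerCycle: number[]; // memory given back when the tabs were closed
  retainedPerCycle: number[]; // growth of the post-close reading vs. the previous one
  totalRetained: number;      // last post-close reading minus the baseline
  growsEveryCycle: boolean;   // true if every cycle left some memory behind
}

// Hypothetical helper: compares each post-close reading with the one before it.
function analyzeCycles(baseline: number, cycles: CycleSample[]): LeakReport {
  const releasedPerCycle: number[] = [];
  const retainedPerCycle: number[] = [];
  let previous = baseline;
  for (const cycle of cycles) {
    releasedPerCycle.push(cycle.afterOpen - cycle.afterClose);
    retainedPerCycle.push(cycle.afterClose - previous);
    previous = cycle.afterClose;
  }
  return {
    releasedPerCycle,
    retainedPerCycle,
    totalRetained: previous - baseline,
    growsEveryCycle:
      retainedPerCycle.length > 0 && retainedPerCycle.every((d) => d > 0),
  };
}
```

As an illustration with made-up values (not measurements): a baseline of 100 MB followed by post-close readings of 120, 140, and 160 MB would yield 20 MB retained per cycle and 60 MB in total.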
Environment
I tested this on Ubuntu 22.04.5 using some test PDF files from the PDF.js repo (https://github.com/mozilla/pdf.js/tree/master/test/pdfs). Not all of them can be used, as some are deliberately faulty PDFs meant for testing how such faults are caught, but the filenames should indicate that.
Device:
Distributor ID: Ubuntu
Description: Ubuntu 22.04.5 LTS
Release: 22.04
Codename: jammy
Kernel: 6.8.0-52-generic
GNOME Shell: 42.9
Obsidian: 1.8.9 (installed via Snap)
I created the sandbox vault manually (not via help tab):
contains only test PDF files
no third-party plugins (including my plugin) activated
Oh ok. What kind of PDFs did you use? How big were they?
The 20 MB add up with each open/close iteration, and this does happen for the test PDF files from PDF.js. They have sizes < 100 kB. If I do these open/close iterations on bigger files, more memory is left uncollected. At least that’s what I observed.
I am writing a plugin that will add some annotation features to the current version of the internal PDF viewer, as follows:
edit annotations
create/delete annotations
multi view sync (real time)
changes are reflected async to all active viewers of that PDF file
undo/redo functionality
In addition I want to extract annotations to markdown and render the page the annotation resides in. I think I can avoid starting up 20+ (number of annotations) PDFViewerChilds by first checking which PDFs are present in the markdown file, then just render the areas of each annotation and kill the PDF viewer instance afterwards.
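The idea of opening each PDF only once could look roughly like the sketch below. All names here (`AnnotationRef`, `Viewer`, `OpenViewer`, `groupByFile`, `renderAll`) are hypothetical placeholders for illustration, not Obsidian or PDF.js APIs; the real viewer handling would have to be substituted.

```typescript
// Hypothetical reference to one annotation found in a markdown file.
interface AnnotationRef {
  file: string; // vault path of the PDF
  page: number; // page the annotation resides in
  id: string;   // identifier of the annotation
}

// Hypothetical stand-in for a viewer instance; not a real Obsidian type.
interface Viewer {
  renderArea(ref: AnnotationRef): Promise<string>;
  close(): void;
}
type OpenViewer = (file: string) => Promise<Viewer>;

// Collect annotations per PDF so each file needs only one viewer.
function groupByFile(refs: AnnotationRef[]): Map<string, AnnotationRef[]> {
  const groups = new Map<string, AnnotationRef[]>();
  for (const ref of refs) {
    const list = groups.get(ref.file);
    if (list) {
      list.push(ref);
    } else {
      groups.set(ref.file, [ref]);
    }
  }
  return groups;
}

// Open one viewer per file, render all of its annotations, then close it.
async function renderAll(
  refs: AnnotationRef[],
  open: OpenViewer
): Promise<Map<string, string>> {
  const rendered = new Map<string, string>();
  for (const [file, group] of Array.from(groupByFile(refs).entries())) {
    const viewer = await open(file);
    try {
      for (const ref of group) {
        rendered.set(ref.id, await viewer.renderArea(ref));
      }
    } finally {
      viewer.close(); // always release the viewer, even if rendering fails
    }
  }
  return rendered;
}
```

With this grouping, 20+ annotations spread over three PDFs would need three viewer instances rather than one per annotation.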
One use case where this could be a problem:
When I’m screening new papers, I want to open a lot of papers and view them side by side, maybe even the same PDF file more than once (hence the multi view manager). So when I read/annotate papers during this screening process, there will most likely be a lot of open/close actions.
I do not want to restart Obsidian when memory gets clogged up, because that would also kill my undo/redo stack.
TL;DR: When I test this on bigger PDFs, more memory accumulates in each open/close iteration. I had cases where Obsidian was using nearly 2 GB of memory while I had only about 4 PDF files open at that moment (bigger ones, but still around 10 MB). This was caused by open/close cycles that I hadn’t done on purpose in that case; they just happened in my work/plugin test flow.
In the end you are probably right. This happens only sometimes and restarting wouldn’t be a deal breaker for me. But nonetheless I wanted to let you know I see a potential (maybe small) memory leak there, which is worth a fix (if you can actually reproduce it, maybe on bigger PDF files).
I have tried with 30 MB PDFs and I do not see any meaningful leak happening. But anyway, we will double check if there is anything.
If you aren’t aware, there’s already a plugin for PDF annotations (perhaps you may want to collaborate on the already existing plugin) and PDF annotation is a planned core feature of Obsidian.
Ohh ok, how many load/unload cycles have you done with these PDFs?
I will double check on my side as well.
Thanks for pointing out the other PDF annotation plugins. I have tested most of these plugins, but I think they have different features they focus on.
My plugin tries to monkey patch the annotation features (create/edit/delete + undo/redo stack + multi view manager) into the PDF.js viewer directly. It might be useful until you add PDF annotations as a core feature. Are you planning to implement your own custom editors or do you want to use the PDF.js annotation editors?
As a result of the investigation for this bug report, a different load/unload strategy for PDF.js will be implemented in Obsidian v1.9. These changes may or may not solve your problem. Regardless, I consider this issue closed because, again, we were not able to detect a significant memory leak.