I posted a help topic the other day about dealing with the situation where I had a lot of assets that were showing up as unlinked, but were actually in use due to <img> tags being left in files created by the web clipper used to create them. (I had migrated my notes from Joplin to Obsidian.)
While I won’t go into the complete process for cleaning everything up (I managed to go from over 1500 asset files to ~300, and went from using 500M+ to ~75M), I did come up with an interesting use for the recent Bases feature:
I created a base I called Unlinked Resources that I set up as follows:
- Filter set to file in folder _resources - this gave me a list of files.
- Properties set to display:
    - File Name
    - File Size
    - File Backlinks
    - File Embeds
    - File Links
    - File Full Name
This gave me a nice list that I could work with as I set about my cleanup process. The surprising part was that, even after running Clear Unused Images, there were still hundreds of images in my resource folder that didn’t have any links or backlinks.
I copied the list of file names to a separate text file, and grepped my entire vault to make certain they weren’t in use anywhere before removing them.
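For anyone who would rather script that check than grep by hand, here is a minimal sketch in Python. It is not the exact command I ran: the function name `find_unreferenced` is made up for illustration, and it assumes the notes are UTF-8 `.md` files and that a file counts as "in use" if its name (raw or URL-encoded, as often happens inside `<img src="...">`) appears anywhere in a note.

```python
from pathlib import Path
from urllib.parse import quote


def find_unreferenced(names, vault_dir):
    """Return the names that appear in no Markdown file under vault_dir."""
    remaining = set(names)
    for note in Path(vault_dir).rglob("*.md"):
        text = note.read_text(encoding="utf-8", errors="ignore")
        # Check the raw name and its URL-encoded form (spaces become %20),
        # since <img> tags and Markdown links often encode the path.
        remaining = {
            n for n in remaining
            if n not in text and quote(n) not in text
        }
        if not remaining:
            break  # every name was found somewhere, nothing left to check
    return sorted(remaining)
```

Feed it the names copied out of the Base plus the path to the vault; whatever comes back is not mentioned in any note. A plain substring match errs on the side of keeping files, since a short name can match unrelated text, which is the safer direction before deleting anything.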
A few words about my results: it’s unlikely that anyone else would see the drastic results I obtained. I found a set of images in my resources (~200M) that were supposed to be transient files that I forgot to remove for some reason.
The reduction in the number of files was due to web-clipped pages that had a bunch of small images, each stored under a unique name. This resulted in up to 16 copies of the same image being stored in my resources folder. I built scripts to fix all the md files in my vault to point to one copy of the image, and then confirmed that the images were unlinked using the Base table referenced above, before removing the duplicates.
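My scripts were specific to my vault, but the duplicate-finding step can be sketched generically. The following is an illustration rather than what I actually ran: `find_duplicates` is a hypothetical name, and it assumes the copies are byte-for-byte identical, so that hashing the file contents groups them together.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def find_duplicates(folder):
    """Group files under folder by content hash; return groups of 2+ files."""
    by_hash = defaultdict(list)
    for path in sorted(Path(folder).rglob("*")):
        if path.is_file():
            # Identical bytes give an identical SHA-256 digest,
            # regardless of what the file is called.
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            by_hash[digest].append(path)
    return [paths for paths in by_hash.values() if len(paths) > 1]
```

Each returned group is a list of paths with the same contents. One path per group can be kept as the canonical copy and the notes rewritten to point at it; the rest should then show up as unlinked in the Base.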
Finally, the reason this is important to me: I’m on a fairly limited internet connection due to my location. Syncing 700M+ of files between 5 devices is a bit prohibitive. I know the full transfer would only happen the first time each device synced, and sync size was just the primary reason; there are other reasons to undertake this cleanup.
Hopefully these pointers will help others who are checking the health of their vaults.
George