Obsidian rocks! - Really

RikD · September 12, 2020, 11:03am

Here’s something I definitely want to share with you all.
Sorry for the lengthy post, but I wanted to describe everything I am doing.
All the great PKM tools I have already tested and used in the past lacked a few fundamental things (for my use cases): reliability, scalability and, most of all, performance.
Obsidian seems to be doing a great job on these things, so I decided to do some real stress testing…
But what system could I use that was big enough to test? After some thinking I decided to go for my MacBook’s filesystem, with hundreds of thousands of files and directories.
I wrote a small Bash script to traverse the directory structure, create an atomic note for every directory, and store a backward link to the parent directory in each note. The first version of the script used recursion, but I ran into trouble because the call stack grew so big (the traversal goes depth first) that even on my MacBook the processing was too slow.
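For illustration, a non-recursive traversal could look roughly like the sketch below. It is written in Python, not the original Bash, and the script itself is not shown in this post, so `walk_directories`, `write_notes`, and the note layout are all assumptions. Notes are named after the bare directory name here, which is the naive scheme that produces name collisions.

```python
import os
from pathlib import Path


def walk_directories(root, max_depth):
    """Yield (directory, parent, depth) using an explicit stack instead of recursion."""
    stack = [(Path(root), None, 0)]
    while stack:
        directory, parent, depth = stack.pop()
        yield directory, parent, depth
        if depth >= max_depth:
            continue  # do not descend below the requested depth
        try:
            with os.scandir(directory) as entries:
                children = [Path(e.path) for e in entries
                            if e.is_dir(follow_symlinks=False)]
        except OSError:
            continue  # unreadable directory: skip its children
        for child in children:
            stack.append((child, directory, depth + 1))


def write_notes(root, vault, max_depth):
    """Create one note per directory; each note links back to its parent."""
    vault = Path(vault)
    vault.mkdir(parents=True, exist_ok=True)
    written = 0
    for directory, parent, _depth in walk_directories(root, max_depth):
        name = directory.name or "root"  # hypothetical naming: bare directory name
        lines = [f"# {directory}"]
        if parent is not None:
            lines.append(f"Parent: [[{parent.name or 'root'}]]")
        (vault / f"{name}.md").write_text("\n".join(lines) + "\n", encoding="utf-8")
        written += 1
    return written
```

The vault directory should live outside the tree being scanned, otherwise the generated notes folder is traversed as well.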
The resulting graphs also showed some ‘errors’, which should be impossible for a file system. Further investigation of the graphs showed that, in OSX, directories with the same name exist all over the place. So I changed the script and prefixed each name with the level of the directory in the structure… Better results, but still not OK. Yes indeed, in OSX there are directories with the same name on the same level. Next solution: hashing the names at full-path level. It turned out nothing can beat a SHA256 hash ;-). Some testing on 2, 3, 4 and 5 level traversals went great. The graphs are astonishing. They are also superfast.
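The difference between the two naming schemes can be sketched as follows. This is a Python illustration under assumptions, not the original script: `level_prefixed_name` and `note_id` are hypothetical helper names, and the example paths are made up.

```python
import hashlib
from pathlib import PurePosixPath


def level_prefixed_name(full_path):
    """Name a note as '<depth>-<directory name>'; '/' itself is depth 0."""
    path = PurePosixPath(full_path)
    depth = len(path.parts) - 1
    return f"{depth}-{path.name}"


def note_id(full_path):
    """Name a note by the SHA-256 hex digest of the directory's full path."""
    return hashlib.sha256(full_path.encode("utf-8")).hexdigest()


# Two directories with the same name at the same depth still collide
# under the level-prefix scheme ...
print(level_prefixed_name("/usr/lib") == level_prefixed_name("/opt/lib"))
# ... but their full paths differ, so the hashes differ.
print(note_id("/usr/lib") == note_id("/opt/lib"))
```

Because the hash is computed over the full path, two notes can only share a name if two directories share a full path, which a file system does not allow.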
Last night I went for the “kill” (no pun intended) ;-).
I ran the script to go 10 levels deep! This would create more than 800,000 atomic notes. This morning almost everything had been generated: “Processing line 755923 of 830207” was showing on the screen. I aborted the script and will run the rest later, as I wrote the script to be resumable. It was time for the big test!
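How the script resumes is not described here. One simple way to get that behaviour, sketched in Python under assumptions, is to work from a fixed list of directories and store the number of lines already handled in a checkpoint file. `process_resumable`, the checkpoint file, and the `limit` parameter (which only simulates an abort) are all hypothetical.

```python
from pathlib import Path


def process_resumable(paths, checkpoint_file, handle, limit=None):
    """Process paths in order, skipping those finished in an earlier run.

    Returns the total number of paths completed so far.
    """
    checkpoint = Path(checkpoint_file)
    # The checkpoint holds how many lines were completed in earlier runs.
    start = int(checkpoint.read_text()) if checkpoint.exists() else 0
    done = 0
    for index in range(start, len(paths)):
        if limit is not None and done >= limit:
            break  # simulated abort, for demonstration only
        print(f"Processing line {index + 1} of {len(paths)}")
        handle(paths[index])
        # Record progress only after the item has been handled.
        checkpoint.write_text(str(index + 1))
        done += 1
    return start + done
```

Writing the checkpoint after every item keeps the sketch short; for hundreds of thousands of items, writing it every few thousand lines would be cheaper, at the cost of redoing a little work after an abort.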
I started up Obsidian…
10 seconds to build the cache and “hoppa” the start screen showed up with almost 800K notes in the list.
Now I needed to find the root node, which the script had also printed out: I copy-pasted the string of 40 characters into the search field.
I was holding my breath (previous systems would need more than a minute to produce some result - sorry wrongly put - no system would produce a result because they would be unable to ingest the 800K nodes)…
No problemo! In less than 10 seconds the resulting file was shown. Clicking on it revealed the well-known structure of my root directory. I used the [[Hash | Realname]] notation to be able to see something familiar.
Second test: navigating through the files. No problemo at all. Notes are displayed almost instantly. Comfortable navigation back and forth with less than 1 second of waiting time. For almost 800K notes this is AMAZING.
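A note body using that notation could be generated roughly as below. This is a Python sketch under assumptions: `note_id`, `wikilink`, and `note_body` are hypothetical names, the note layout is invented, and the link is written in the compact `[[target|display]]` form without spaces around the bar.

```python
import hashlib
from pathlib import PurePosixPath


def note_id(full_path):
    """Hash of the full path, used as the note's file name."""
    return hashlib.sha256(full_path.encode("utf-8")).hexdigest()


def wikilink(full_path):
    """Link to a directory's note, displaying the readable directory name."""
    display = PurePosixPath(full_path).name or "/"
    return f"[[{note_id(full_path)}|{display}]]"


def note_body(directory, parent, children):
    """Build the text of one note: a title, a parent link, and child links."""
    lines = [f"# {directory}"]
    if parent is not None:
        lines.append(f"Parent: {wikilink(parent)}")
    for child in children:
        lines.append(f"- {wikilink(child)}")
    return "\n".join(lines) + "\n"


print(note_body("/usr", "/", ["/usr/bin", "/usr/lib"]))
```

The file on disk is named after the hash, while the reader only sees names such as `bin` and `lib` when the note is rendered.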
Third test: graphing… (I did local graphs of course. Not going to test 800K notes in a global graph and tell you it isn’t working)
Level 1 - Good. 5-second response time and the graph is usable to navigate through the structure. Something ‘weird’: the graph takes exactly the same time to display the result throughout the whole structure. Is this built-in behaviour?

Back to the root of the structure for Level 2 graphs (depth=2) - What a beauty:

Again the exact same timing to produce the graphs. Doing great, by the way. The system is consistent. Graphs are great.
Back to the root for 3-depth graphs.
By the way, the same consistent timing also occurs when displaying the Linked mentions pane (not using local graphs now). The time is roughly 50% of the time needed with graphs, but it is just as consistent. The atomic note is displayed almost immediately. After the set amount of time the Linked mentions pane is updated, consistently.
Level 3 - Again a beautiful graph. Good responsiveness when zooming in and out. Clicking through becomes slower (20 seconds to generate the graph after a click). The same behaviour of a fixed generation time. The atomic note becomes available very quickly; the Linked mentions pane and the graph update at exactly the same moment. Clicking through in the graph takes somewhat more time than clicking on the links in the atomic notes. Still more than acceptable behaviour when you consider that we are managing lots of nodes here. When using the Linked mentions pane to navigate, the update of the graph and the atomic note page takes a consistent time. The atomic note page also no longer updates immediately but is updated simultaneously with the graph. The time to update is consistent when navigating through the levels.
Picture shows a zoomed out three level deep graph with ‘/’ folder in the middle…

It took almost a minute to produce the 4-level graph… It probably contains 30-40K nodes, so still acceptable. Zooming in and out is OK once the graph is fully ‘ready’, which takes somewhat more time (no explanation for that one though).
First impression is that the clicking behaviour in the Linked mentions pane has changed. When going up and down in the structure the atomic notes seem to update immediately again, which was not the case at the previous depth? Graphing takes a consistent 40-50 seconds to fully build. When using the graph to navigate, the atomic note and the Linked mentions pane update simultaneously, but only after the graph generation time. In my case that is 30-40 seconds, so not really ‘usable’ anymore. Trying to navigate with the Linked mentions pane at this point crashed the system (black screen). -> View-Reload to try that again.
I am now on a node with no parent and it’s not the root node? (Some error in the script, or processing not finished? I will investigate with the completely generated system later.) -> Searching again for the root node -> Graph displays fine after 5 seconds -> Setting depth to 4 levels -> Ready after 30-40 seconds -> Start clicking test again.
The zooming and panning isn’t as fluid as at the lower levels. It takes a lot of time (not usable anymore). That is normal considering the number of nodes to process at this level. Navigating through the Linked mentions section is taking a long time again.

Conclusion
I guess nobody in the world will be working with vaults of 830K nodes. Still wondering about research vaults though (one of mine has 180K nodes).
We can safely say that Obsidian passed the test with flying colours.

GREAT JOB and well-meant kudos to the makers of this magnificent piece of software!

All feedback and remarks more than welcome by the way!

Klaas · September 12, 2020, 5:12pm

Interesting test.
When you say you produced 800k atomic notes, what does that mean in practical terms? How big are those notes? Are they in any way comparable in size (in KB) to regular atomic .md notes?

In your screenshots of the graphs those notes are linked, and it looks like mostly 1 link per note. Normal zettelkasten type atomic notes have more than 1 link.

RikD · September 12, 2020, 7:35pm

Hey Klaas,

It’s a test. The atomic notes are small.
They don’t contain just one link per note.
It depends on the number of subdirectories a given directory has.