Performance of Obsidian

Because Obsidian is a file-system-based app, I suppose there will be a lot of performance issues. It could be painful for someone who uses an HDD instead of an SSD.

Have you got benchmarks for how many files + tags + links the app can manage fluently?

After two weeks of use I already have 200+ files.

At this rate, I can easily reach 10,000 within a year or so. I am worried about whether it will still work and perform well.

PS. Maybe in the future a separate index DB file could be created where all metadata is stored. The main data would still be reusable and portable, but the app would work much faster and more smoothly.
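
To sketch what I mean (just an idea, not something Obsidian does today; the file name and table layout below are my own invention): a small SQLite file next to the vault could cache links and tags so the app doesn't have to rescan every Markdown file on startup.

# Hypothetical metadata cache - a sketch of the idea, not Obsidian's actual design.
import sqlite3

def rebuild_index(db_path, notes):
    """notes: iterable of (path, mtime, tags, outgoing_links) tuples parsed from the vault."""
    con = sqlite3.connect(db_path)  # e.g. ".obsidian/index.db" (made-up path)
    con.execute("CREATE TABLE IF NOT EXISTS note (path TEXT PRIMARY KEY, mtime REAL)")
    con.execute("CREATE TABLE IF NOT EXISTS tag  (path TEXT, tag TEXT)")
    con.execute("CREATE TABLE IF NOT EXISTS link (src TEXT, dst TEXT)")
    for path, mtime, tags, links in notes:
        con.execute("INSERT OR REPLACE INTO note VALUES (?, ?)", (path, mtime))
        con.executemany("INSERT INTO tag VALUES (?, ?)", [(path, t) for t in tags])
        con.executemany("INSERT INTO link VALUES (?, ?)", [(path, d) for d in links])
    con.commit()
    con.close()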

I believe you should find some of the discussion in the topic below valuable.

I'm not really OK with the answers given there:
No issues to expect at all -> I have tested vaults with 85,000 Markdown files and more without any problems.

3 Likes

Thx.

@RikD - Can you share the Vault used for tests?

It would help me and others check how it works on our hardware configurations.
Based on the results we could create a benchmark DB for Obsidian users.

BTW, what storage was used in the tests you mention (where 85,000 files were no problem for the app)? Was the vault saved on an SSD or an HDD drive?
How many CPU cores did it utilize, and how much memory was used?

Sorry - Not possible.
This is one of my Research Vaults.
Not able to disclose.
If you want to do volume tests you can use your own directory structure as a good use case.

You could use this Python script to create the nodes (Markdown files):

import argparse
import os
import pathlib


class ScanDirTree:
    """Walk a directory tree and write one Markdown file per directory,
    each containing [[wiki-links]] to the notes generated for its sub-directories."""

    def recurse(self, parentpath, parentfilename, dirlevel):
        """Write the note for parentpath, then descend into its sub-directories."""
        if dirlevel > self.maxdepth:
            return
        try:
            with open(parentfilename, 'w') as fpointer:
                for scandir_direntry in os.scandir(parentpath):
                    if scandir_direntry.is_dir() and not scandir_direntry.is_symlink():
                        # Prefix the note name with the inode number to keep it unique.
                        direntry_filename = str(scandir_direntry.stat().st_ino) + '_' + scandir_direntry.name + '.md'
                        fpointer.write('[[' + direntry_filename + '|' + scandir_direntry.name + ']]' + '\n')
                        self.recurse(scandir_direntry.path, direntry_filename, dirlevel + 1)
        except OSError as e:
            print(e)

    def __init__(self, arg):
        self.maxdepth = int(arg.maxdepth)
        dir_level = 1

        tmp_path = pathlib.PurePath(arg.root)
        if arg.root == "/":
            # Special case for the file-system root on macOS.
            parent_filename = 'Macintosh HD.md'
        else:
            # Fallback name in case the root is not found in its parent directory.
            parent_filename = (tmp_path.name or 'root') + '.md'
            for scandir_direntry in os.scandir(tmp_path.parent):
                if scandir_direntry.is_dir() and os.path.samefile(scandir_direntry.path, arg.root):
                    parent_filename = str(scandir_direntry.stat().st_ino) + '_' + scandir_direntry.name + '.md'

        self.recurse(arg.root, parent_filename, dir_level)


if __name__ == '__main__':
    aparser = argparse.ArgumentParser()
    aparser.add_argument("-r", "--root", help="Root of directory tree", default=".")
    aparser.add_argument("-md", "--maxdepth", help="Maximum scan depth of directory tree", default="2")
    args = aparser.parse_args()
    ScanDirTree(args)
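
For example, assuming you save it as scan_dir_tree.py (the file name is mine) and run it from inside an otherwise empty vault folder:

python3 scan_dir_tree.py --root ~/Documents --maxdepth 3

It writes one .md file per directory into the current folder, each linking to the notes of its sub-directories, so you can grow the test vault simply by raising --maxdepth or pointing --root at a bigger tree.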

Files are saved on an SSD.
CPU usage during the stress test doesn’t go above 15%.

Little remark:
Don’t think it’s “fair” to state that performance is not OK or under par.
I can run Obsidian on a Raspberry Pi and state here that performance is not OK.
Disk IO is a burden on stress tests.
Guess you’re not doing your tests on a 386 machine :wink: so CPU should be rather OK.

1 Like

Thx.

BTW. There is a discussion about limits and performance issues on Windows.
The problem is listing a folder. Some users mention that:

I use Windows 10, and 25,000 images in one folder (average file size 500 KB) took more than an hour to load completely in the folder. The suggested number of files in one folder is 5,000.

Maybe this is not a big problem on Linux or Mac, but on Windows it should be a consideration.
(BTW, even on Linux we use a structure of sub-folders for cache directories to speed up file access.)

The solution is to create a structure of folders and distribute the files among them.
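
A minimal sketch of that idea (the two-character hash prefix is just one common convention, nothing Obsidian prescribes):

# Spread files from one big folder into sub-folders by a short hash prefix,
# so no single folder ends up with tens of thousands of entries.
import hashlib
import shutil
from pathlib import Path

def bucket_files(src_dir: Path, dst_dir: Path) -> None:
    for f in src_dir.iterdir():
        if f.is_file():
            prefix = hashlib.md5(f.name.encode()).hexdigest()[:2]  # 256 buckets
            target = dst_dir / prefix
            target.mkdir(parents=True, exist_ok=True)
            shutil.move(str(f), str(target / f.name))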

Don’t want to be harsh.

I believe this is not really what a Markdown-based knowledge system is built for.

My total research vault is around 150,000 Markdown files (read: text files), each around 2 to 5 KB. This total vault is not as fast as my Personal Body of Knowledge vault, which has bigger picture files and an attachment folder and does not pose any speed issues - even cloud-based files with <img></img> embedding are not a problem (1 to 2 MB per file).

2 Likes

Thx for the answers. I am happy that this is not an issue :smiley:

PS1: It was a quotation of another user from Stack Overflow.
PS2: Are you using Windows or another OS?

macOS Big Sur Beta 10 with Obsidian 0.9.4

Macs have the fastest SSDs on the market, with enormous IOPS.
That’s why you don’t see any issues with performance.

I am talking about people who still use old HDD drives
(e.g. in emerging countries, people who use computers only for private purposes, people whose companies do not see the necessity to upgrade, people from poor families, etc.).

This is like comparing apples to oranges.

According to this article and tests:

Here we see the HDD can do 176 IOPS, while the SSD gives us 9417 IOPS, or over 53 times more read requests. Since small file, 4K reads are the most common IO done in typical PC usage, this difference reveals how much quicker a SSD can be for a user. The 4K write IOPS show a stunning difference in performance, the HDD 311 IOPS, the SSD 32,933 IOPS, or over 105 times faster.

Note that the mentioned test compares an old first-generation SSD to an HDD. Your SSD is much, much faster than that Samsung.

There is an old 2018 test comparing a MacBook Pro with Windows laptops.

The fastest laptop with an i7 CPU and SSD drive (running Windows) got 400 MB/s while the MacBook Pro got 2519 MB/s.

I suppose that your vault would be very slow on an HDD :frowning:

My proposal is to check and share information about the minimal hardware configuration for this software (according to vault size).

You do realise that this is an article from eleven years ago?

And that this is a Mac magazine saying Mac is faster?

A number of users have run tests with very large vaults. When there’s a huge number of files, they noticed some slowdown, but Obsidian was still usable.

Performance issues discussed recently have been around the speed of the graph and its impact, especially when it’s always kept live. But that’s been constantly tweaked so that it remains functional.

HDD will always be slower than SSD to load. Once loaded, it shouldn’t be an issue because plaintext is small and easily processed. Of course, if you add a huge number of images or videos your experience could be very different.

One thing I would stress is that these are really small files, most likely backed by an SSD. The filesystem cache (in DRAM) can probably handle all of the files from a large repository without any performance hit. IOPS wouldn’t really matter.

If you’re really worried about it, you can always warm the cache on startup by running vmtouch with some periodicity. It might be worthwhile for the Obsidian team to think about running a background process to do something like this, but honestly I expect this may be fixing a problem that won’t actually crop up.
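
A trivial way to do that yourself (assuming vmtouch is installed; the vault path and interval below are placeholders) would be something like:

# Periodically touch every file in the vault so the OS keeps it in the page cache.
import subprocess
import time

VAULT = "/path/to/vault"    # placeholder
INTERVAL = 15 * 60          # seconds, placeholder

while True:
    subprocess.run(["vmtouch", "-t", VAULT], check=False)  # -t reads pages into memory
    time.sleep(INTERVAL)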

As I understand it, when Obsidian opens a vault it reads all files and the OS caches them. Am I right?
If so, the issue I mentioned will only be true at the beginning of using a vault.

BTW, when I search, how does Obsidian manage it? Does it open all files in the vault, or does it have them cached and just operate on an internal structure?

If not, on a slow HDD it will open all files, read them, and do the search. When the files are not in the OS cache it will take some time. In the worst case, when the HDD is fragmented, it can take a long time. This is not a problem on an SSD, but on an HDD it is.
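
To make the worry concrete, here is a rough illustration (purely hypothetical - I don't know Obsidian's internals): the first function re-reads every file from disk for each query, which is what hurts on a cold HDD, while the second reads the vault once and then searches only an in-memory copy.

# Illustrative only: full re-scan per query vs. searching an in-memory cache.
from pathlib import Path

def search_from_disk(vault: Path, term: str) -> list:
    """Re-open and read every .md file on every query (slow on a cold HDD)."""
    term = term.lower()
    return [p for p in vault.rglob("*.md")
            if term in p.read_text(encoding="utf-8", errors="ignore").lower()]

def build_cache(vault: Path) -> dict:
    """Read the vault once; later queries touch only this dict, not the disk."""
    return {p: p.read_text(encoding="utf-8", errors="ignore").lower()
            for p in vault.rglob("*.md")}

def search_from_cache(cache: dict, term: str) -> list:
    term = term.lower()
    return [p for p, text in cache.items() if term in text]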

BTW, I’ll try this as an experiment. I have an HDD (5400 rpm) for my image file archive. I’ll write and run this script and share my results with you.

Rik -

How do you run it on a Raspberry Pi? I have an 8 GB Pi 4B, but was defeated because the Ubuntu download was for the amd64 architecture instead of arm64.

I have not found anyone else here who is running on Raspberry Pi.

I was just being cynical (sorry for that).
It was a metaphor: on weak enough hardware, anyone could state that Obsidian is “not good enough”.
Everything depends on the hardware you run it on.
Furthermore came the response that someone tried to load 25,000+ 500 KB images from a hard disk. IMHO that is asking to be disappointed.
Hence my reason for the (maybe) harsh and cynical remark about a Raspberry Pi.

The performance of Obsidian is something I care about deeply since I have a rather largish Vault. So perhaps I can give a real-world example.

My Vault as of yesterday had 8,641 .md files and 922,334 words. My computer is rather oldish (5+ years): Windows 7 64-bit, an SSD drive, 8 GB RAM, and an AMD A8-5600K at 3.6 GHz.

I use Obsidian every day, including weekends and throughout the day. For me the performance of Obsidian is good. Compared with other apps like Zettlr the performance is great.

(I actually left Zettlr because it could not manage my Vault’s size when I had around half the files I have now.)

Searching goes rather quick for me (<3 seconds). Files open instantaneously, both when navigating and through the file opener (ctrl + o). Typing the name of a file goes quick too, so no issue with the auto-complete window or auto-search.

Creating new files also goes quickly. I don’t experience typing lag (not compared to other programs). The memory usage of Obsidian is also rather lowish (for an Electron app, of course). And Obsidian hasn’t crashed on me or done funny things with my files.

The only performance issue I found is that creating a new file with the file explorer window open is slow, since Obsidian redraws the file explorer window when making a new file. So now I simply create new files with the search window open, which is already my usual workflow.

All in all the performance of Obsidian is great and has remained so even with my Vault increasing every day. I think the Obsidian developers are great experts, because in the last months we also had updates that improved performance. (Which is opposite to what usually happens when more features are added.)

(By the way, when I open my Vault in Visual Studio Code opening files and searching goes remarkably slower. So Obsidian beats the well-funded Visual Studio Code team in terms of performance!)

So from my perspective there’s nothing to worry about regarding the performance of Obsidian. :slightly_smiling_face: (But I of course do save for a new computer!)

2 Likes

Totally agree with that!
I also left Zettlr because of its performance and inability to handle large vaults.
I have a research vault with around 150,000 Markdown files (I don’t know how many words) which works impeccably in Obsidian.
I am a daily user too.
Very happy to have switched to some real software being able to cope with what I need it to do.
The Devs at Obsidian rock!!!

2 Likes

I also want Obsidian to become my primary note-taking app, but it uses way more RAM than Notepad++.

In Notepad++ I have had maybe around 300 files open over the past 6 years, and it only consumes 10-30 MB of RAM with one process open.

Obsidian has 4 processes open with 0 notes open and ~120 MB RAM usage.

Well, at least Obsidian isn’t an Electron app like Joplin and others, which consume 400 MB-1 GB (like a basic Chromium browser).

Obsidian is an Electron app.