Maximum Number of Notes in Vault

Thanks for the tip. Although I have tried Joplin, I haven’t tried importing from Evernote via Joplin. Will definitely try this.

Thanks for the suggestion. As for me, I had all the dummy files in the root. As mentioned, I haven’t tried the sub-folder structure, which I will be testing. Will keep your tip in mind when I do that.

If you plot points on a 2D graph of (x-axis size, y-axis time or space), curve fitting is finding a smooth mathematical function that closely fits those points. See https://en.wikipedia.org/wiki/Curve_fitting

The idea is then you can estimate space/time use for larger numbers without actually testing them.
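A minimal sketch of that idea, assuming the load time follows a power law (time ≈ a · n^b). The measurements below are made-up placeholders, not real Obsidian timings, and `fit_power_law` and `predict` are just illustrative names. The fit is an ordinary least-squares line through the log-log values:

```python
import math

def fit_power_law(sizes, times):
    """Least-squares fit of times ~= a * sizes**b, done on log-log values."""
    xs = [math.log(s) for s in sizes]
    ys = [math.log(t) for t in times]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                       # slope of the log-log line = exponent
    a = math.exp(mean_y - b * mean_x)   # intercept of the log-log line = log(a)
    return a, b

def predict(a, b, size):
    """Estimate the time for a vault of `size` notes from the fitted curve."""
    return a * size ** b

# Hypothetical measurements: (number of notes, seconds to load the vault).
sizes = [1_000, 5_000, 10_000, 20_000]
times = [0.5, 2.5, 5.0, 10.0]

a, b = fit_power_law(sizes, times)
print(f"fitted: time ~= {a:.6f} * n^{b:.2f}")
print(f"estimated load time at 200,000 notes: {predict(a, b, 200_000):.1f} s")
```

Extrapolating far beyond the measured range is only as good as the assumed model, so it is worth spot-checking one larger size if you can.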

One more point about many small files is that (as I think @ksandvik pointed out) the file explorer won’t work well. You’d have to use search criteria to reduce the number of files shown on the graph or in the file explorer (sort of how search works in the macOS Finder). The current search window shows some content of files, but when you have too many files that may not work well either.

Thanks for elaborating and sharing the link. I will check and see how I can use/do it. Regarding the second point, yes, @ksandvik did suggest filtering using search, but I didn’t quite get it. If I am not mistaken, you are both suggesting that I use search as soon as I open the application, to show only a select list/group of files and not the entire list. Please do confirm!

Hi @ksandvik, I just wanted to update you that I tested after creating sub-folders and it performed better than putting all the files in a base folder. Loading up the folders does take a few minutes. Also wanted to mention that the Graph plugin does not work. However, when the number of files is in the range of 200k, it does become slower and almost impossible to operate.

But overall I think having an elaborate folders/sub-folders structure is definitely a better model to follow (till something else is proven better). :clap: :clap: Thank you very much for your tip. :smiley:
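For anyone who wants to repeat this kind of test, here is a rough sketch of generating dummy notes spread across sub-folders instead of one base folder. The folder and file naming (`batch-0000`, `note-000000.md`) and the `notes_per_folder` value are arbitrary assumptions, not what was used in the test above:

```python
import tempfile
from pathlib import Path

def make_dummy_vault(root, total_notes, notes_per_folder=1_000):
    """Create `total_notes` small Markdown files, at most `notes_per_folder` per sub-folder."""
    root = Path(root)
    for i in range(total_notes):
        # Integer division picks the sub-folder, so each one fills up in turn.
        folder = root / f"batch-{i // notes_per_folder:04d}"
        folder.mkdir(parents=True, exist_ok=True)
        note = folder / f"note-{i:06d}.md"
        note.write_text(f"# Note {i}\n\nDummy content.\n", encoding="utf-8")
    return root

# Small demo in a throwaway directory; raise the numbers for a real stress test.
with tempfile.TemporaryDirectory() as tmp:
    vault = make_dummy_vault(tmp, total_notes=25, notes_per_folder=10)
    print(sorted(p.name for p in vault.iterdir() if p.is_dir()))
```

Pointing a test vault at the generated folder and varying `notes_per_folder` is one way to compare flat and nested layouts on the same machine.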

I will now follow this model with real notes and see how it works. With a large number of notes (like 200k), I will have to develop a workaround with something like an index/tag database, as the benefits of using Obsidian far outweigh these limitations.
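One possible shape for such an index/tag workaround, sketched outside Obsidian: scan the vault once and keep a mapping from each `#tag` to the notes that contain it. The tag pattern below is a simplifying assumption and does not claim to match Obsidian's own tag rules:

```python
import re
import tempfile
from collections import defaultdict
from pathlib import Path

# Assumed tag syntax: '#' at the start of a word, followed by a letter.
# Markdown headings ("# Title", "## Title") do not match this pattern.
TAG_PATTERN = re.compile(r"(?<!\S)#([A-Za-z][\w/-]*)")

def build_tag_index(vault_root):
    """Map each lower-cased tag to the set of note paths (relative to the vault) using it."""
    root = Path(vault_root)
    index = defaultdict(set)
    for path in root.rglob("*.md"):
        text = path.read_text(encoding="utf-8", errors="ignore")
        for tag in TAG_PATTERN.findall(text):
            index[tag.lower()].add(path.relative_to(root).as_posix())
    return dict(index)

# Demo on two throwaway notes.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "a.md").write_text("# Title\n#project notes #Idea\n", encoding="utf-8")
    (root / "b.md").write_text("follow-up #project\n", encoding="utf-8")
    for tag, notes in sorted(build_tag_index(root).items()):
        print(tag, sorted(notes))
```

The resulting dictionary could be saved to disk and refreshed periodically, so that looking up a tag does not require opening the whole vault.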

I did a test. When there are 12,000 notes in Obsidian, daily operations can feel sluggish. Most functions except Graph View can work normally, but the user experience is seriously degraded. If I create 20 notes a day, I will reach the limit in less than 2 years.
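The arithmetic behind that estimate: 12,000 notes at 20 notes a day is 600 days, or roughly 1.6 years. As a small sketch (the function name and the `existing_notes` parameter are illustrative additions), assuming a constant creation rate:

```python
def days_until_limit(limit_notes, notes_per_day, existing_notes=0):
    """Days until the vault reaches `limit_notes` at a constant creation rate."""
    remaining = max(limit_notes - existing_notes, 0)
    return remaining / notes_per_day

days = days_until_limit(12_000, 20)
print(f"{days:.0f} days, about {days / 365:.2f} years")
```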

Joplin works better for me than Notion (without the need to transfer either) - better embedded links etc…

The only thing I did was to batch replace "    -" (4 spaces before the hyphen) with "-" in all of the files using Notepad++. Without that, I get a lot of red text where there are bulleted lists.
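The same clean-up can be scripted if you would rather not use Notepad++. This is only a sketch under two assumptions: that the unwanted bullets start with exactly four spaces at the beginning of a line, and that the notes are UTF-8 `.md` files. Back up the folder first, since it rewrites files in place and would also flatten intentionally nested bullets:

```python
import re
import tempfile
from pathlib import Path

# A bullet indented by exactly four spaces at the start of a line.
INDENTED_BULLET = re.compile(r"^ {4}-", flags=re.MULTILINE)

def unindent_bullets(text):
    """Replace '    -' at the start of a line with '-'."""
    return INDENTED_BULLET.sub("-", text)

def fix_vault(vault_root):
    """Rewrite every .md file under `vault_root` that changes; return how many did."""
    changed = 0
    for path in Path(vault_root).rglob("*.md"):
        original = path.read_text(encoding="utf-8")
        fixed = unindent_bullets(original)
        if fixed != original:
            path.write_text(fixed, encoding="utf-8")
            changed += 1
    return changed

# Demo on a throwaway folder.
with tempfile.TemporaryDirectory() as tmp:
    note = Path(tmp) / "imported.md"
    note.write_text("Shopping\n    - milk\n    - eggs\n", encoding="utf-8")
    print(fix_vault(tmp), "file(s) changed")
    print(note.read_text(encoding="utf-8"))
```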

Thank you for sharing! I had replaced Evernote with Joplin.

Is anyone using Obsidian with 10k+ notes within iCloud for sync? I’m working on migrating from Evernote, where I have 11k+ notes, but I am still wondering:

  1. Will Obsidian be performant enough with that many notes (especially with search)? Still unsure about this…
  2. How to handle sync? My preferred solution is Resilio Sync but their iOS app is not strongly integrated into Files.app—Sync only occurs when the app is opened (not in the background). So that’s not ideal for editing on the go. iCloud or even MS OneDrive is better in that regard, but I really love Resilio’s “no cloud” model and have paid to use it.

Curious to hear more about what others are doing…

This is disturbing: performance problems at 12k notes. Obsidian should be able to handle 100k and well beyond without problems, as far as I am concerned. Is there any update on this issue?

Like any other application, more intensive workloads demand more from your computer. What kind of machine are you working with?

windows 10, 24 GB RAM, intel i7 probably gen 7, 256GB SSD.

my concern isn’t necessarily my machine, but the scalability of obsidian.

there are other note/mindmap/md type apps which stress test to a million notes, things like this.

im not talking about some esoteric application like importing the library of congress, just something for sales people or lawyers, or researchers who may wish to collect and reference, say, 100 thousand contacts/documents/reference notes/case histories, etc. So if you’ve got 100k, not hard to get to 250k, so stress testing at a million is a logical place to go.

which is the reason when i read here that some were testing at 8, 10, 12 thousand and having a sluggish response, i immediately knew it wasn’t at a point where it was scalable.

of course, i stand to be corrected.

also, it would be great to know if/when some instant search could be implemented using a vault-wide full-text index (SQLite, etc.?) so that the sluggishness would not be perceived, as search results would be fast.

I don’t believe it is the logical place to go at all.
Obsidian is based on files. There is no index, no database (though someone could always decide to make a plugin). Speed (initial loading in particular) will be limited by hardware, OS and transfer speeds. That’s about files, not Obsidian in particular. There will be scalability limits and they will be lower than for a database. There are ways of organising vaults so that scalability is unlikely to be an issue in everyday use, even when the total number of notes is huge. And I’m not aware of anyone having encountered an underlying speed issue, with the exception of huge databases when using the graph.

However, the big advantage of files is that they are not locked up. It is easy to set up any number of indexed search databases with the same set of files and use those for search etc. The reason that database apps have to examine scalability is that this is not possible there because the data is locked in the database and search etc will rely on the coding of the app and the particular design of its database.
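To illustrate the point about external indexes, here is a rough sketch of a word index over a folder of Markdown files, built with Python's standard `sqlite3` module. It is not how Obsidian or any existing plugin works; the table layout and the function names are assumptions made for the example. SQLite's FTS5 extension is a purpose-built alternative where it is available:

```python
import re
import sqlite3
import tempfile
from pathlib import Path

WORD = re.compile(r"\w+")

def build_index(vault_root, db_path=":memory:"):
    """Index every .md file under `vault_root` as (word, path) rows."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS postings ("
        "word TEXT, path TEXT, PRIMARY KEY (word, path))"
    )
    root = Path(vault_root)
    for path in root.rglob("*.md"):
        text = path.read_text(encoding="utf-8", errors="ignore").lower()
        rel = path.relative_to(root).as_posix()
        rows = [(word, rel) for word in set(WORD.findall(text))]
        conn.executemany("INSERT OR IGNORE INTO postings VALUES (?, ?)", rows)
    conn.commit()
    return conn

def search(conn, query):
    """Return the paths of notes that contain every word in `query`."""
    words = sorted(set(WORD.findall(query.lower())))
    if not words:
        return []
    placeholders = ",".join("?" * len(words))
    sql = (
        f"SELECT path FROM postings WHERE word IN ({placeholders}) "
        "GROUP BY path HAVING COUNT(*) = ? ORDER BY path"
    )
    return [row[0] for row in conn.execute(sql, (*words, len(words)))]

# Demo on two throwaway notes; the files themselves are never modified.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "a.md").write_text("Graph view is slow", encoding="utf-8")
    (root / "b.md").write_text("Search is fast; graph is not", encoding="utf-8")
    conn = build_index(root)
    print(search(conn, "graph"))
    print(search(conn, "graph slow"))
    conn.close()
```

Because the index lives in its own database file (or in memory, as here), it can be deleted and rebuilt at any time without touching the notes.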

Does Obsidian lazy load the vault (e.g. folders are not traversed until they are clicked) or is every single file in the vault checked/read when launched?

I am migrating from Evernote where I currently have over 12,000 notes. I plan to break up a lot of them into smaller zettels as well to make better use of Obsidian’s cross linking. So I could see it easily exceeding 20,000. I’m worried about speed as well. I hope after 1.0 some effort might go toward optimizations or some sort of search index.

There’s extensive optimisation already.
Personally, I don’t want an index in the program unless it is something I can turn off. So I prefer the idea of it being available through plugins.
Because they’re just files I find it easy simply to use whichever search engine I feel is best for the job, and because they’re outside Obsidian I don’t have to worry about it impacting processing within Obsidian.

Everyone’s situation is different. Some people have huge numbers of image or video files embedded in notes, others are concerned about the graphs and want them updated in real time. If you’re talking simple plaintext, notes rather than long documents, then I wouldn’t worry about 20k. But it might also be worth thinking about whether you are best with one single vault, or whether you’d be fine with more than one vault, or whether you’d prefer a system of nested vaults where you spend most time in small sub-vaults but can also work in the big vault with everything. If you think it’s possible you will hit a speed issue, it’s worth spending time considering the optimal organisation of your data. But, personally, I wouldn’t be worrying about 20k.

I didn’t know nested vaults was a thing. That’s definitely an interesting idea, @Dor. When you talk about using “whatever search engine” do you mean something like ripgrep? Or something else?

ripgrep is an example, but I don’t think it is indexed.
DocFetcher is open source and indexed, but stopped being developed a few years ago. It works with most file types, including PDF, DOC, etc.

At the paid end there’s a host, including dtSearch, PowerGREP, FileLocator Pro, etc.
Usually these are only needed to deal with multiple complex file types.
Everything would depend on your purpose and what features you need.
But, as I say, if you only have 20k plaintext files, I wouldn’t expect speed to be an issue in Obsidian, and I’d expect an indexed plugin to appear fairly quickly if any users started to run into problems.

Interesting. I absolutely ~do~ want an index. Why wouldn’t almost everyone?

  1. I was told that metadata, tags, and various other things are already indexed. This was on Discord last week. But full text is not. So why not take it to the logical next step?

  2. Loads of other KM/CRM/NOTE programs do have full indexing for important reasons. Outlook. Thunderbird. Most of the various CRMs. I don’t know about Evernote. EssentialPIM uses FirebirdSQL (or maybe SQLite by now?). These are not expensive, cumbersome, or resource-heavy. TheBrain uses built-in Windows indexing that’s already there and piggybacks on it to deliver almost instantaneous results. Full text.

  3. Why put to some other resource a task which can be/should be inside the app you’re already in? I find it hard to imagine this being anything other than a critical function: finding, somewhere, the important information you’re looking for in your vault, so that you can continue carefully crafting the masterpiece you’re working on.

  4. This multi vault/nested vault thing might be seen as an excuse for bad/limited performance (in this narrow category of function). Why would you subdivide, separate, and otherwise silo your information? I mean, you could, and you should have the ability to, if you wanted, but not because you’re forced to because you’re soon going to hit some performance ceiling on finding your stuff, or navigating your reams of critical information?

  5. To sum it up, isn’t all this about finding? I mean, isn’t Google’s success (not the advertising part) because they realized that Excite and Yahoo (et al.) trying to make neat little indexes and categories wasn’t working, and that soon there would be so much information that we’d be spending most of our time categorizing it all, and why do that when soon, search will be so powerful and instant, that, well…

Yes, logging and writing and thinking and linking, you’re building and assembling and creating. But don’t we need to ~get~ to bits fast, to be able to react to that fleeting idea in the head that there’s an epiphany, and I gotta make the connection now, before I forget? How can we build vast neural link networks if we forgot where stuff is, and can’t get to it quickly? These light flashes are so brief. Wouldn’t reliably fast search and agile scalable performance make Obsidian even more unbeatable than it already is?

Thank you @Dor, @ryanjamurphy, and @luckman212 for your thoughts thus far.

They’re all database programs. And Outlook is pretty slow despite that.
Obsidian has been marketed from the beginning as a program based on separate markdown files. I wouldn’t be using it if it were a database program.

Full text searching wouldn’t just be about one index, especially in a program like Obsidian with a number of important components. They too would have to be indexed and you would end up with a database because all the words would be in it. Optimising that would make separate files, if they were still maintained, an inefficient sideshow.

From the sound of it, you would prefer a program based on a database. That’s OK, they have advantages as well as disadvantages. But it’s not what Obsidian was designed to be and many users came here specifically because it used files rather than a database.

Why should it be in?
“In” is not better; it is simply a restricted choice.
How many programs do you use? Why aren’t all of the functions wrapped into one?