You have full-text search now. An index is a database; you talked about just adding an extra column to the table. It’s grep against an index. There’s no free lunch: whichever approach you take, there are some gains and some losses.
Yes, but it’s not indexed. That’s the whole point.
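To make the grep-versus-index trade-off concrete, here is a minimal sketch in Python. It is not how Obsidian searches; the `NOTES` dictionary and all function names are made up for illustration. The scan reads every note on every query, while the inverted index pays a one-time build cost (plus the memory to hold it) so that each lookup is a single dictionary access.

```python
import re
from collections import defaultdict

# Hypothetical in-memory "vault": note name -> note text (illustration only).
NOTES = {
    "a.md": "Obsidian stores notes as plain text files",
    "b.md": "An index trades disk space for faster lookups",
    "c.md": "ripgrep scans plain text without an index",
}

def tokenize(text):
    """Split text into lowercase alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def scan_search(notes, word):
    """Grep-style search: re-read every note on every query."""
    word = word.lower()
    return sorted(name for name, text in notes.items() if word in tokenize(text))

def build_index(notes):
    """Inverted index: word -> set of note names containing it (built once)."""
    index = defaultdict(set)
    for name, text in notes.items():
        for token in tokenize(text):
            index[token].add(name)
    return index

def index_search(index, word):
    """Indexed search: one dictionary lookup, no note is re-read."""
    return sorted(index.get(word.lower(), set()))

if __name__ == "__main__":
    index = build_index(NOTES)
    print(scan_search(NOTES, "plain"))
    print(index_search(index, "plain"))
```

Both functions return the same notes for the same word. The difference is where the cost lands: the scan grows with vault size on every query, while the index has to be stored and kept up to date whenever a note changes. That is the "no free lunch" part.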
Thanks for the heads up on this, @Dor. I am concerned about scalability and would love to know more about the vault organization strategies to which you are referring.
Can you recommend a good starting point for learning more?
Thanks in advance.
@ryanjamurphy ~ Many thanks, Ryan! Very much appreciate this helpful tip for learning more.
Thank you, @Dor, for taking the time to provide this great overview.
Just to add to the discussion: I have over 9000 densely linked notes with little to no performance issues (late-2018 MBP, 8 GB RAM, 256 GB SSD, 2.3 GHz quad-core Intel Core i5). The only issue is the initial load of the vault graph view. Other than that, no issues.
DEVONthink could certainly handle WAY more than this: hundreds of thousands, based just on files (though it does use a database in some ways).
Thanks for this update. I would like Obsidian to be able to handle the load that you mention DEVONthink handling. 9000 is great, certainly impressive, but in many other industries it is still a very low number. It would be absolutely lovely to see attention and resources temporarily diverted from the nifty-cool-shiny-handy-smart feature updates, for just a little time, toward adding robust, scalable, industrial-strength indexing (and other) under-the-hood technology to let the thing shine at much higher numbers of notes. If we can see future-proofing at 1 million to 5 million notes, then no one will have to even bat an eyelash at 200k… You want loads more buy-in from corporate customers for enterprise licenses and tons of sync subscriptions? Make these changes, I suggest.
You’re definitely not wrong there. I would imagine that would have to be developed in tandem with native applications, because Electron would be too resource-intensive. Different project, but look at TheBrain: 1 million+ notes are feasible.
My two cents on this: The devs consistently spend time optimizing for “large” vaults, which is why the app almost doesn’t feel like an Electron app. Given that it’s in beta, I would argue that optimizing now for millions of notes falls under premature optimization.
There are other features that are important for the sustainability of the app and other features that would make the app even better for the vault sizes we’re seeing today.
I’m not saying that this is not important, but it might be good to adjust your expectations on this. I’m sure it will come in time! If you do have some performance requests with large vaults, I’m sure the devs will appreciate more data points!
Well, not what I wanted to hear, but it is what it is, and so I thank you for your input.
I would venture to think that you’re not seeing vault sizes that big because the product is so new and no one has built up a 5 million note vault yet. I don’t believe it is necessarily because they are not needed or wanted.
In the (hopefully short) interim between now and screaming hot performance at vault-wide full-text searching at 2 million notes, is a workaround to just split the vault? Ten vaults with 25k notes each? But then these cannot be searched all at once, right?
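Since vaults are just folders of Markdown files, a split setup could still be searched in one pass from outside the app. Below is a rough sketch in Python, assuming each vault is a directory of `.md` files; `search_vaults` and the vault paths are hypothetical names for illustration, not an Obsidian feature.

```python
from pathlib import Path

def search_vaults(vault_dirs, query):
    """Case-insensitive substring search across several vault folders.

    Returns (vault name, note file name) pairs for every matching note.
    """
    query = query.lower()
    hits = []
    for vault in vault_dirs:
        vault_path = Path(vault)
        # rglob walks sub-folders too; sorted() keeps the output stable.
        for note in sorted(vault_path.rglob("*.md")):
            text = note.read_text(encoding="utf-8", errors="ignore")
            if query in text.lower():
                hits.append((vault_path.name, note.name))
    return hits

if __name__ == "__main__":
    # Hypothetical vault locations; replace with real folders.
    for vault_name, note_name in search_vaults(["vault-01", "vault-02"], "scalability"):
        print(f"{vault_name}: {note_name}")
```

This only answers the "search all at once" part. Links and the graph view would still stop at each vault's boundary.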
ripgrep is a good example of how a highly optimised, multi-threaded tool can search hundreds of thousands of files and return results within seconds. Visual Studio Code also uses ripgrep as its default search tool. The tool is isolated in the VS Code codebase, so the Obsidian team has the option to integrate it.
Most other indexing tools and systems have to deal with binary data; as Obsidian only uses text files, the search solution does not require indexing.
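As a rough illustration of the "no index, just scan the text files in parallel" idea, here is a small Python sketch using a thread pool. It is a stand-in for the approach, not ripgrep itself and nowhere near as fast; `parallel_search`, `file_matches` and the `workers` default are assumptions made up for this example.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

def file_matches(path, query):
    """Return True if the lowercase query occurs anywhere in the file."""
    text = path.read_text(encoding="utf-8", errors="ignore")
    return query in text.lower()

def parallel_search(root, query, workers=8):
    """Scan every .md file under root with a pool of threads, no index."""
    query = query.lower()
    files = sorted(Path(root).rglob("*.md"))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order, so flags line up with files.
        flags = list(pool.map(lambda p: file_matches(p, query), files))
    return [p.name for p, hit in zip(files, flags) if hit]

if __name__ == "__main__":
    # Hypothetical vault folder; replace with a real path.
    print(parallel_search("my-vault", "ripgrep"))
```

Every query still touches every file, so the time grows with the size of the vault; the threads only overlap the file reads.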
Well, I’ve heard about vaults of varying sizes, anywhere from 200 to 40k notes. I believe there is one Bible vault with around 80k notes too. You might want to ask some of those users directly on the Discord server about search or how they manage their large vaults.
Question of atomicity
My first thought is that you may want to try out DEVONthink, a knowledge management application that may serve this purpose well. Basically, I use DEVONthink to collect everything and Obsidian to organize my own thoughts and notes.
The best part is that I can add the Obsidian vault to DEVONthink, so it can interact with “raw inputs” without any headache.
That definitely sounds like an awesome workflow, but it seems DEVONthink is Mac-only. Do you perhaps know of any alternatives for us Windows / Linux users?
LOL - the answer there is no, I’ve searched for years.
That’s a shame, I’ve also been looking for a while. Really hope one day there’s a good alternative for non-Mac users.
While I don’t know all the secrets DEVONthink offers, maybe some parts of it (namely indexing) can be tackled with fzf and rofi? There’s also zzzfoo, which uses recoll.
Interesting… discussing a theoretical issue that probably nobody in real life has.
Can you imagine all the notes you could have written while discussing this?
Unhelpful. Just because you may not have an issue doesn’t mean others don’t. Many may want to deploy Obsidian in the enterprise, so scalability and performance are actual concerns, interests, and, unfortunately, roadblocks to adoption.