Addressing Large Datasets

i’m off this week, so i’ve been spending most of my spare time digging into my BUDGET vault & seeking a new way to enter, track & report my growing collection of expense notes (69 in all).

As we all know, the marvelous Dataview plugin has SERIOUS limitations on the number of records that can be placed in a single note…

…so i began scouring the web for information.

Then it hit me like a TON OF BRICKS: Obsidian works a lot like a specialized version of JupyterLab/Jupyter Notebook!!! My budget/expense dataset would be handled quite easily in that environment.

That being said, i’m grateful for the Obsidian developers’ efforts with Canvas to expand Obsidian’s limits within the framework of its original purpose…

…yet now i find myself at a crossroads: should i wait for Obsidian to provide the features i require, or pick up another tool (along with its learning curve) & get my project done now?

It seems obvious to me that, as vaults keep growing, Obsidian is moving into Jupyter’s market space and, to stay competitive, is going to HAVE TO give its community more capacity to handle with EASE the ever-larger datasets they are faced with.

Exactly my situation. I’ve only just started looking into Obsidian, which is very slick and appealing in many ways, so I am very much hoping for encouraging replies to this post.

As far as I can see, neither Obsidian nor any other note-taking software caters for large datasets, whether external or imported. I’ve looked at Jupyter notebooks too, and they certainly handle large datasets, since they can work with SQLite databases, R data frames, etc. But filtering and summarising that data would involve more work. I’d much rather use Obsidian, which would be a much nicer front end, especially with its rich range of community plugins. I’d give up some performance to gain the convenience of Obsidian.
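
For a sense of what that extra filtering and summarising work looks like on the Jupyter side, here is a minimal sketch using Python’s built-in sqlite3 module. The table name, column names, sample rows, and the summarise_expenses helper are all made up for illustration; a real vault’s expense notes would first have to be exported into a table like this.

```python
import sqlite3


def summarise_expenses(conn, min_amount=0.0):
    """Return (category, count, total) rows for expenses at or above min_amount."""
    query = """
        SELECT category, COUNT(*), ROUND(SUM(amount), 2)
        FROM expenses
        WHERE amount >= ?
        GROUP BY category
        ORDER BY category
    """
    return conn.execute(query, (min_amount,)).fetchall()


# Hypothetical schema and sample data, held in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE expenses (spent_on TEXT, category TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO expenses VALUES (?, ?, ?)",
    [
        ("2023-01-03", "groceries", 54.20),
        ("2023-01-05", "fuel", 40.00),
        ("2023-01-09", "groceries", 23.80),
        ("2023-01-12", "rent", 900.00),
    ],
)
conn.commit()

# Filter (WHERE) and summarise (GROUP BY) in one query.
summary = summarise_expenses(conn)
for category, n, total in summary:
    print(f"{category}: {n} expense(s), total {total}")
```

The same query shape should scale to tens of thousands of rows without much trouble, but you have to write and maintain it yourself, which is the convenience trade-off I mean.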

If Obsidian can’t support large datasets anytime soon, before you commit to Jupyter, also check out Simon Willison’s Datasette. But I’m hoping the developers and/or community here can give reason to stick with Obsidian.

(By the way, for me, “large datasets” means of the order of 10,000 to 100,000 or more records. Probably a bridge too far for notetaking software which after all was never intended for that purpose. But I’d like to hear that from a developer before giving up.)

