Data Donation request

I think this is an awesome idea. Unfortunately, many people’s notes (mine included) contain private or proprietary information. I also expect that the way people build their vaults (free-form and on every topic imaginable) will make it difficult to train a model that carries over from one person’s vault to another’s.

A few slightly different ideas:

I don’t think we should need to link things manually, but we should be guided to capture information consistently, e.g. in structure, spelling and terminology. For me that’s the main function of links: I can refer to something the same way every time and therefore easily find it and build on it. What would be helpful is a way to maintain that consistency by working out similar terms, ideas, word roots, etc. and:

  1. be able to use this knowledge for search and ranking.

For me one of the weakest aspects of Obsidian, and one of its biggest opportunities, is search and ranking. It would be huge to have a search that takes plain-language understanding into account when ranking results. Doing this well also removes some of the need for manual linking, and ultimately this knowledge could be used to build graphs automatically, on the fly.

  2. help a user during authoring to brainstorm, get novel input from their vault and stay consistent.

Imagine a pane that surfaces snippets of existing notes, common terms or phrases, etc. to help a user with recall, knowledge and consistency. This is how our memories work and I see this as being a huge benefit. ML could build an index for this purpose: essentially a real-time background search using the capabilities from 1. above.
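To make the index idea a bit more concrete, here is a rough Python sketch of a ranked search over note text. It uses plain TF-IDF with cosine similarity, which is only lexical matching; real plain-language understanding would need something like learned embeddings in place of these vectors. `VaultIndex`, `tokenize` and the sample notes are made-up names for illustration, not anything Obsidian provides.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


class VaultIndex:
    """Tiny in-memory TF-IDF index over note bodies (hypothetical helper)."""

    def __init__(self, notes):
        # notes: dict mapping note title -> note text
        counts = {title: Counter(tokenize(body)) for title, body in notes.items()}
        doc_freq = Counter()
        for term_counts in counts.values():
            doc_freq.update(term_counts.keys())
        n = len(notes)
        # Smoothed inverse document frequency: rarer terms weigh more.
        self.idf = {t: math.log((1 + n) / (1 + df)) + 1 for t, df in doc_freq.items()}
        self.vectors = {title: self._weigh(c) for title, c in counts.items()}

    def _weigh(self, term_counts):
        # Build a unit-length TF-IDF vector; terms unknown to the index are dropped.
        vec = {t: tf * self.idf[t] for t, tf in term_counts.items() if t in self.idf}
        norm = math.sqrt(sum(w * w for w in vec.values()))
        return {t: w / norm for t, w in vec.items()} if norm else {}

    def search(self, query, limit=3):
        """Return (title, score) pairs ranked by cosine similarity to the query."""
        q = self._weigh(Counter(tokenize(query)))
        scored = []
        for title, vec in self.vectors.items():
            score = sum(w * vec.get(t, 0.0) for t, w in q.items())
            if score > 0:
                scored.append((title, score))
        return sorted(scored, key=lambda pair: pair[1], reverse=True)[:limit]


# Made-up sample vault for illustration.
notes = {
    "Sourdough": "Feed the starter flour and water, then bake the bread.",
    "Garden": "Water the tomato plants every morning.",
    "Meeting": "Discuss the quarterly budget with finance.",
}
index = VaultIndex(notes)
print(index.search("bread starter"))
```

The same `search` call could drive the background pane: feed it the paragraph currently being typed instead of an explicit query, and show the top few titles as suggestions.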

I would imagine a Wikipedia dataset would be a good place to start (though far from perfect, because users vary widely in what they want to do). Take a subset, remove the links, then train a model to recreate the links and reward it when they match the original dataset.
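As a sketch of the data preparation step, assuming the text uses `[[Target]]` and `[[Target|label]]` style links (as both wikitext and Obsidian do), something like the Python below could strip the links and keep the originals as the answer key. `strip_links` and `score` are hypothetical helpers, and section anchors, templates and other wikitext features are ignored.

```python
import re

# Matches [[Target]] and [[Target|label]].
WIKILINK = re.compile(r"\[\[([^\[\]|]+)(?:\|([^\[\]]+))?\]\]")


def strip_links(text):
    """Return (plain_text, links); links are (start, end, target) spans in plain_text."""
    parts = []
    links = []
    pos = 0      # read position in the linked text
    out_len = 0  # length of plain text emitted so far
    for match in WIKILINK.finditer(text):
        before = text[pos:match.start()]
        parts.append(before)
        out_len += len(before)
        target = match.group(1).strip()
        label = match.group(2) or match.group(1)  # visible text of the link
        parts.append(label)
        links.append((out_len, out_len + len(label), target))
        out_len += len(label)
        pos = match.end()
    parts.append(text[pos:])
    return "".join(parts), links


def score(predicted, gold):
    """Precision and recall of predicted link spans against the original links."""
    predicted, gold = set(predicted), set(gold)
    hits = len(predicted & gold)
    precision = hits / len(predicted) if predicted else 0.0
    recall = hits / len(gold) if gold else 0.0
    return precision, recall


sample = "The [[Eiffel Tower]] is in [[Paris|the French capital]]."
plain, gold_links = strip_links(sample)
print(plain)
print(gold_links)
```

A model would see only `plain` as input, and `score` (or some reward derived from it) would compare its proposed spans with `gold_links`.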