The AI revolution is here. With tools such as ChatGPT, we get a glimpse of what the future may hold and start to better understand what large language models will eventually be able to do for us.
One key aspect of large language models is the data they are trained on. This effectively defines what a model will be able to know and, consequently, do. It is therefore critical to feed the network the right information, as this alone can unlock features and capabilities that would otherwise be impossible. Until now, all large language models available to the public have been trained on mostly non-personal data, such as subsets of the internet. This, however, will not be enough if we strive for powerful and tailored personal assistants.
Tailoring a large language model to an individual is not a mystery: all that is really needed is access to information relevant to the individual in question. Some of this data is (theoretically) easy to retrieve if present. Examples include:
- Emails
- Online calendar/schedule
- Browsing history
- Programming time (a number of solutions already exist which can be integrated into most IDEs)
- Contacts/social network (friends + activity on social media)
- List of todos
- Finances (sensitive, but a number of large banks already provide means of allowing third parties to access such information)
- Health data (smart watches, connected scales, sleep trackers)
- And much more (smart homes, phone GPS, etc…)
These alone already give a solid grasp of an individual and should be enough to achieve a high level of customisation. They do, however, leave out a significant aspect of an individual: their “knowledge”.
Why is Obsidian so special?
While it is possible to partially derive a person’s knowledge from their browsing activity, none of the above sources properly captures its extent, which I believe to be fundamental to properly understanding and assisting someone. Allowing your personal assistant to better grasp your knowledge base could enable the following:
- Understand which areas you are knowledgeable in, and which are weaker points: this could in turn be used to determine the level of detail and complexity of responses to prompts. A good teacher isn’t one who just knows everything, but one who also understands what you don’t know.
- Detect areas of misinformation: access to a knowledge repository makes it possible to detect errors and misunderstandings, which could in turn be reported to the individual.
- etc…
These personal data repositories we have all been creating could be enhanced significantly through the inclusion of effective time tracking. Being able to properly log the amount of time spent on various notes would provide a tremendous amount of insight into the interests, capabilities, or workflow of a person. This user activity could also allow for a new form of metaphorical linking, which I would refer to as “activity links” or “desire links”. Akin to “desire paths” (see the figure below), those links would be created by the natural movement of the user through their PKM environment.
To clarify, I am not necessarily talking about creating actual links, but rather about capturing this natural relationship between notes, to be eventually leveraged by large language models and users (a dashboard of one’s activity in Obsidian would provide valuable insight into how time is split and allocated to various aspects of work/life).
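To make the idea of “activity links” a little more concrete, here is a rough sketch of how they might be derived. It assumes we already have the list of note paths in the order they received focus; the function name `activityLinks` and the input format are made up for illustration and are not part of Obsidian or any existing plugin.

```typescript
// Hypothetical sketch: derive "activity links" from the order in which notes were focused.
// `focusSequence` is an assumed input: note paths listed in the order they received focus.
function activityLinks(focusSequence: string[]): Map<string, number> {
  const counts = new Map<string, number>();
  for (let i = 1; i < focusSequence.length; i++) {
    const from = focusSequence[i - 1];
    const to = focusSequence[i];
    if (from === to) continue; // refocusing the same note is not a transition
    // Undirected key: sort the pair so that A -> B and B -> A strengthen the same link.
    const key = [from, to].sort().join(" <-> ");
    counts.set(key, (counts.get(key) ?? 0) + 1);
  }
  return counts;
}

// Example: moving back and forth between two notes strengthens their link.
const links = activityLinks(["Projects/Thesis.md", "Reading/Paper.md", "Projects/Thesis.md", "Todo.md"]);
for (const [pair, weight] of links) {
  console.log(pair, weight);
}
```

The counts are only one possible measure of how strong a link is. Weighting by time spent, or by how recently the transition happened, would be just as reasonable.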
A plea for help!
I have previously made a post on this topic on this forum, describing how logging the opening, focusing, and closing of notes could be coupled with each note’s path and tags to track this data effectively with no extra effort. The value of such data only increases over time, and every hour spent without it is fundamental information missed. If anyone with the right skillset could take on this challenge, such a plugin would make it possible to start harvesting this invaluable information while remaining true to the Obsidian ethos of a local-first, privacy-centric approach. Better still would be building this directly into Obsidian to ensure a reliable and accurate log.
I am excited to be along for the ride with you all and look forward to hearing your thoughts on the topic. I would like to leave you all with a question to brainstorm further: as we stand at the edge of a new revolution, what else could be done with Obsidian to anticipate and prepare ourselves to truly unlock AI-assisted knowledge tools?
To conclude, as we enter this new era, Obsidian finds itself at an exceptional crossroads, with the potential to radically change the capabilities of the upcoming tools of the AI revolution. Obsidian marketed itself as a second brain, and might end up living up to that more than anyone ever thought possible!
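As a starting point for anyone tempted, the log described above could be turned into time totals roughly as follows. This is only a sketch under my own assumptions: the `NoteEvent` shape, the `summarize` function, and the rule that time counts from a focus event until the next focus or the note’s own close are all hypothetical. It deliberately does not touch the Obsidian plugin API, so how the events get captured is left open.

```typescript
// Hypothetical event shape; none of these names come from the Obsidian API.
type NoteEventKind = "open" | "focus" | "close";

interface NoteEvent {
  kind: NoteEventKind;
  path: string;      // vault-relative path of the note
  tags: string[];    // tags on the note at the time of the event
  timestamp: number; // milliseconds since the Unix epoch
}

interface Totals {
  byPath: Map<string, number>; // focused milliseconds per note path
  byTag: Map<string, number>;  // focused milliseconds per tag
}

function add(map: Map<string, number>, key: string, ms: number): void {
  map.set(key, (map.get(key) ?? 0) + ms);
}

// Assumed rule: a note accrues time from its "focus" event until the next
// "focus" event (on any note) or until its own "close" event.
function summarize(events: NoteEvent[]): Totals {
  const byPath = new Map<string, number>();
  const byTag = new Map<string, number>();
  let active: NoteEvent | null = null; // the focus event currently accruing time

  const sorted = [...events].sort((a, b) => a.timestamp - b.timestamp);
  for (const ev of sorted) {
    if (ev.kind === "open") continue; // opening alone is not treated as attention
    if (active !== null && (ev.kind === "focus" || ev.path === active.path)) {
      const elapsed = ev.timestamp - active.timestamp;
      add(byPath, active.path, elapsed);
      for (const tag of active.tags) add(byTag, tag, elapsed);
      active = null;
    }
    if (ev.kind === "focus") active = ev;
  }
  return { byPath, byTag };
}
```

Because the input is just a list of timestamped events, the log itself could live in a plain local file inside the vault, in keeping with the local-first approach, and totals like these could feed the kind of activity dashboard mentioned earlier.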