I’m looking for guidance or suggestions on how best to organize my notes and other content. What I would like to do is use Obsidian as an investigative system, similar to what the police might use to organize pictures, recordings, videos, notes, etc. for a criminal investigation. I’ll be importing a large number of digital content files and linking to them in various notes, along with tags, so I can use the graph and map views to better understand the relationships between everything. My initial inclination is to use a separate note for each individual piece of content and include the names of the people in the pictures/videos (I use digiKam to add face-recognition tags to each file), the text of any audio content (I use a speech-to-text library to generate that), objects in the pictures (digiKam again), date/time created, location data, etc. I would then create a ‘person’ note that links to every picture/video/audio note that person appears in, a ‘location’ note that links to any content items created within a certain distance of that location, etc. I plan on automating much of the ingestion, content note creation, and linking with a Python script.
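For illustration, here is a minimal sketch of what such an ingestion script might generate for one content item. Everything in it is an assumption rather than anything from digiKam or Obsidian: the `ContentItem` record, the frontmatter keys, the 0.5 km radius, and the idea that known locations are kept as a name-to-coordinates dictionary are all hypothetical. It only shows the general shape: metadata in frontmatter, wikilinks for people, and a distance check to decide which location notes to link.

```python
from dataclasses import dataclass, field
from math import asin, cos, radians, sin, sqrt


@dataclass
class ContentItem:
    """Hypothetical record for one imported file (not a digiKam structure)."""
    filename: str
    kind: str          # "picture", "video" or "recording"
    created: str       # ISO 8601 timestamp
    lat: float
    lon: float
    people: list[str] = field(default_factory=list)
    transcript: str = ""


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(a))


def nearby_locations(item, locations, radius_km=0.5):
    """Names of known locations within radius_km of the item (radius is an assumption)."""
    return [name for name, (lat, lon) in locations.items()
            if haversine_km(item.lat, item.lon, lat, lon) <= radius_km]


def render_note(item, locations):
    """Build the Markdown text of one content note."""
    lines = [
        "---",
        f"type: {item.kind}",
        f"created: {item.created}",
        f"latitude: {item.lat}",
        f"longitude: {item.lon}",
        "---",
        "",
        f"![[{item.filename}]]",
        "",
        "People: " + ", ".join(f"[[{p}]]" for p in item.people),
        "Locations: " + ", ".join(f"[[{n}]]" for n in nearby_locations(item, locations)),
    ]
    if item.transcript:
        lines += ["", "## Transcript", "", item.transcript]
    return "\n".join(lines) + "\n"
```

Writing the returned text to `<vault>/<some folder>/<item name>.md` would be enough for the links to show up in the graph view; the folder layout is left open here.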
Are there any significant penalties for keeping what might be hundreds or potentially thousands of individual content-item notes in a single vault? Would I be better off using a single note per content type (e.g. one note for pictures, one for videos, one for recordings, etc.), creating a header/subheader for each item, and linking to the header/subheader?
from what i’m reading, you have what it takes to organize your vault properly, dear watson
no penalties, really
you can search for stress test on the forum
search for 10000 files, for a limit to do with quick switcher algorithm
these are not really dealbreakers in any case…
many people use obsidian for research and keep 4-5 types of files, so using more, smaller files would help you pinpoint which stage a piece of information is at or where it came from
generally, smaller md files with transcluded content – adding auxiliary detail to a master file at a heading – would be better than really long files
that would mean sticking to obsidian syntax (![[filename#heading]] or ![[filename#^block]]), but don’t worry about future-proofing…you can use obsidian forever, provided the third-party plugins you rely on keep pace with the latest version of the core app
in the event you’d want to migrate your files, the syntax is usually easy to transform with regex replacements
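as a rough sketch of that kind of regex replacement (an illustration only, not an official migration tool), the snippet below rewrites obsidian note embeds into plain markdown links. the function names and the output format are assumptions: the heading or block id is kept in the link label and dropped from the URL, because anchor conventions differ between tools, and every target is assumed to be a `.md` note, so embedded images or other attachments would need separate handling.

```python
import re

# Matches ![[target]], ![[target#heading]] and ![[target#^block]].
# Aliased embeds such as ![[target|alias]] are deliberately left untouched.
EMBED = re.compile(r"!\[\[([^\]#|]+)(?:#([^\]|]+))?\]\]")


def embed_to_link(match):
    """Turn one embed match into a plain Markdown link (assumed target format)."""
    target, anchor = match.group(1), match.group(2)
    # Keep the heading or block id visible in the label; strip the ^ of block ids.
    label = target if anchor is None else f"{target} > {anchor.lstrip('^')}"
    url = target.replace(" ", "%20") + ".md"
    return f"[{label}]({url})"


def convert_embeds(text):
    """Replace every note embed in text with a plain Markdown link."""
    return EMBED.sub(embed_to_link, text)
```

regular `[[wikilinks]]` without the leading `!` are not matched by this pattern and would need a similar, separate rule.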
but the team keeps improving how live preview handles long sheets of rendered markdown, so the app can cope with long files on mobile as well
they also recently re-added support for markdown files larger than 2 MB
If you plan to put that much content into your vault, you will certainly need a good structure/information architecture to rely on when retrieving it.
You already have a good high-level model (you have identified what could be called “entities”, like people) and you are clear that you need to implement “relationships” between entities.
Maybe it’s too early to foresee all the implementation details of your model. These will become clear once you start filling your system with content and see what you really need and don’t have yet.
If you end up with a good set of metadata and links (for relationships) in your notes, the system will scale even to hundreds or thousands of notes. But before you arrive at an effective method, you’ll need to experiment with the first few notes.
Try sketching an initial design of your notes on paper.
Take the person note: what are its metadata (attributes), and are they enough for the first searches that come to mind? Next, design the relationships between entities.
One thing to consider: Obsidian has a very basic database feature set (metadata, search, even Dataview), but it is not a real database like Oracle or MySQL. If your query requirements are too sophisticated, the system may become too complex to implement.
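To give a feel for how far plain note metadata can take you outside the app, here is a minimal sketch in Python that filters notes by their frontmatter. It assumes flat `key: value` frontmatter only (no lists or nested values), and the function names and keys such as `type` are hypothetical. Anything beyond this kind of exact-match filter is where a real database starts to look attractive.

```python
from pathlib import Path


def read_frontmatter(text):
    """Parse flat 'key: value' frontmatter; returns {} if there is none."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    meta = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return meta
        key, sep, value = line.partition(":")  # split at the first colon only
        if sep:
            meta[key.strip()] = value.strip()
    return {}  # no closing delimiter: treat as no frontmatter


def find_notes(vault, **wanted):
    """Yield paths of notes whose frontmatter matches every wanted key/value."""
    for path in sorted(Path(vault).rglob("*.md")):
        meta = read_frontmatter(path.read_text(encoding="utf-8"))
        if all(meta.get(k) == v for k, v in wanted.items()):
            yield path
```

For example, `list(find_notes("my-vault", type="recording"))` would list every note marked as a recording, where `my-vault` is a placeholder path.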
As for modeling everything in a single note versus one item per note, I don’t think there is a general method. Database modeling is a loooong story.
You can try the different alternatives in a test vault and see what works better.
Some ideas.
You will surely have one note per person.
If you need to find a specific recording using its own metadata, it is probably better to have a “recording item note” than a single note that lists all the recordings. An entity note lets you implement a richer model, using note metadata, than a block or bullet inside a note does.
When you need a complex model, it’s better to separate entities into their own notes.
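As a sketch of the one-entity-per-note idea, the snippet below builds a person note for everyone who appears in a set of content notes. The input shape (a dictionary from content-note name to the people in it), the `type: person` key, and the “Appears in” heading are all assumptions for illustration. Obsidian’s backlinks already show which notes link to a person note, so an explicit list like this is optional.

```python
from collections import defaultdict


def build_person_notes(content):
    """Map each person to the Markdown body of their note.

    `content` maps a content-note name to the list of people in it
    (a hypothetical structure produced by an ingestion script).
    """
    appearances = defaultdict(list)
    for note_name, people in content.items():
        for person in people:
            appearances[person].append(note_name)

    notes = {}
    for person, note_names in appearances.items():
        # One wikilink per content note, sorted for stable output.
        links = "\n".join(f"- [[{name}]]" for name in sorted(note_names))
        notes[person] = f"---\ntype: person\n---\n\n## Appears in\n\n{links}\n"
    return notes
```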
Something like ExcaliBrain could be useful too, if you want a graph explorer that represents relationships according to their “semantics”.
Thank you both for the feedback and thoughts. I’m still early in the design process and you’ve both provided excellent perspectives that will be useful. I’ve started playing with different setups and organizations to see what works best, and now I have some additional ideas to try out.