I spend a lot of time on Facebook. I post a lot, I'm active in numerous groups, and I create a lot of content on the fly this way. I have resources I've come across and shared (sometimes with my commentary), posts I've composed, and other ideas I've formulated as part of conversations that came up in response to questions. It makes sense for all of this to become notes, but there's a lot of it.
So what I've been doing is downloading a backup of my FB content every month. Then I open the posts file, the comments file, and the group posts & comments files (which are HTML), copy and paste the text into Obsidian (all in a single monthly note, to begin with), and then go through and delete the posts I don't need to keep.
To give you an idea of the volume, the Reading Time plugin estimates each of these monthly notes at about 1-4 hours' worth of reading, after I've narrowed it down to what I'm keeping.
And then those have to be broken apart into separate notes.
Also, if I share information from my database to Facebook, I'm then re-downloading it later and unwittingly creating duplicates of my own notes.
Any suggestions for streamlining any part of this process?
I don't use Facebook much, but just as a suggestion: you could save the posts/content you want to keep as you go throughout the month. There is an option to save a post to read later (Save Post, Save Link, etc.). Then import only the saved posts into a specific directory in Obsidian, or create a new vault for each month, depending on the amount of content.
At the end of each month, download the data and look through the saved items (sorted by date). I think there is a directory of saved items in the downloaded file.
You could write a QuickAdd script to parse the data dump. The script could assign the HTML content of the files to named variables, which can then be used in a template (default or Templater) to create the notes automatically.
I wonder if you could use Python to do a lot of the cleaning and splitting into notes? You could have a Python script that splits up your content dump, removes a bunch of stuff, then saves out multiple .md files.
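A minimal sketch of that idea, assuming BeautifulSoup is installed and that each post in the export HTML sits in its own wrapper element. The file name, output folder, and the `div` class used in the selector are placeholders you'd adjust after inspecting the actual export:

```python
# Sketch: split a Facebook HTML export into one Markdown note per post.
# Assumptions (adjust to the real export): the export file is
# "your_posts_1.html" and each post is wrapped in <div class="post">.
# Requires: pip install beautifulsoup4
from pathlib import Path
import re

from bs4 import BeautifulSoup

EXPORT_FILE = Path("your_posts_1.html")   # placeholder file name
OUTPUT_DIR = Path("FB Import/Monthly")    # placeholder output folder


def slugify(text: str, max_len: int = 60) -> str:
    """Turn the first words of a post into a safe file name."""
    slug = re.sub(r"[^\w\s-]", "", text).strip()
    slug = re.sub(r"\s+", " ", slug)
    return slug[:max_len] or "untitled"


def main() -> None:
    OUTPUT_DIR.mkdir(parents=True, exist_ok=True)
    soup = BeautifulSoup(EXPORT_FILE.read_text(encoding="utf-8"), "html.parser")

    # Placeholder selector -- inspect the export and replace with whatever
    # element/class actually wraps each post.
    posts = soup.find_all("div", class_="post")

    for i, post in enumerate(posts, start=1):
        text = post.get_text("\n", strip=True)
        if not text:
            continue
        note_path = OUTPUT_DIR / f"{i:03d} {slugify(text)}.md"
        note_path.write_text(text + "\n", encoding="utf-8")


if __name__ == "__main__":
    main()
```

From there you could also skip posts that are nothing but a link, or flag anything whose text already appears in the vault, which might help with the duplication problem mentioned above.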
Alternatively, you could consider the manual step of copying as a way of forcing yourself to filter what goes into your vault to a reasonable level. If you're generating 4 hours of reading every month, that's a shit ton of notes! Is that all really useful? How much of it are you going to revisit? You could figure out a way to tag or save the posts you want to copy into Obsidian, then work through them manually in batches later, in a mindful way.
It’s definitely a lot! Makes for a lot to work with – but also a lot of value.
Some of it I definitely filter out, but yes, I absolutely revisit much of it. Although part of my challenge is determining what’s entirely new content for my database, and what’s just a paraphrase of something I already have in it (and, therefore, doesn’t add much).
I'm starting with closer to 600 minutes/month, so there's a lot that's getting filtered out. (And a little bit of the reading time is probably misleading, because it includes all the links back to the original source posts. That's not exactly "reading material.")