I’m guessing there’s an existing plugin that does this (probably with AI…) but for the life of me I can’t find anything that specifically addresses this use case. Please correct me if I’m wrong!
I have a mountain of messy text that I brought over to Obsidian. Mostly sloppy web clips from other tools/platforms that include SEO garbage, header/footer text, nav text, garbage image links and tags, etc. I’ve been trying to work through the problem manually for a long time, but at this point I would need to take a leave of absence from work, send my kid to live with a relative, and take out a loan to survive long enough to get through it all.
SO: I’m looking for a plugin (AI or not) that can reformat these inconsistent and sloppy clips into a consistent format, strip out the obvious garbage (in the same way a browser does with Reader Mode), and ideally extract and standardize some of the metadata (origin, link, author, date) that exists in most of these files (albeit inconsistently placed and rendered). Bonus points for AI tools that can suggest tags based on content etc.
I’m less concerned than many Obsidian forum users about AI privacy considerations, since this is primarily content clipped from the open web, but I’m not so oblivious that I want to let 100 different AI plugins loose on my vault while I look for something that can do this.
Has anyone found a method for automating the cleanup of messy web clips?