Convert HTML img tags to Markdown

What I’m trying to do

I have a lot of documents that I imported from Joplin that have embedded HTML for images that were downloaded via web clipper with the document. For example:

<img width="314" height="263" src="../_resources/11.01B-Finder_chaos_v2--1_vfx-da_49e7dbbcaf8046338.png" class="jop-noMdConv">

I would like to clean these up so that they are just standard Obsidian Flavored Markdown:

![[11.01B-Finder_chaos_v2--1_vfx-da_49e7dbbcaf8046338.png|314x263]]

Some of the documents also have internal header references like:

[Press & news updates](#news)

And

[](#)[](#)**KLAUS SCHULZE forum topics**

That I’d like to get cleaned up.

Things I have tried

I’ve searched / read through the Obsidian documentation. I’ve done normal web searches, and I’ve searched the forums…and haven’t come up with a solution that really fits my case.

Is there a command that I can call that will allow me to convert and/or cleanup these kinds of issues as I find them? Or is there a community plugin that can accomplish this?

I have over 1100 documents and 1500 images from my import. I am trying to get all of this cleaned up so I can determine which images are really not linked to anything, and so I can clean up / reformat documents as I go through them.

– George

Hello. With over a thousand files, don’t even try to do this by hand. Your best bet is to use the Linter plugin in Obsidian (using “Custom Regex Replacement”) or, if you’re comfortable with it, open your vault folder in VS Code and do a global search-and-replace using regular expressions. A pattern like <img.*src="../_resources/([^"]+)".*> can grab those filenames and swap them into ![[$1]] in seconds. Once they’re in that standard format, you can use a plugin like “Clear Unused Images” to safely nuke whichever of the 1,500 images aren’t actually being used.

Thanks for this suggestion – I was hoping to be a bit more programmatic about this without having to write my own sed/awk script (yes, I am comfortable enough with regexes) to do this… The part that I wanted to avoid was scripting a parser for the tags that would handle the width and height parameters – which aren’t always present.
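
For what it's worth, the optional width/height case may not need a full parser. Below is a minimal Python sketch, not a tested migration tool. It assumes every attribute value is double-quoted as in the example tag, that local images always live under ../_resources/, and that a width-only tag should become ![[name|W]]. The names convert and img_to_embed are made up for illustration.

```python
import posixpath
import re

# Any <img ...> tag, and any name="value" attribute pair inside it.
IMG_TAG = re.compile(r"<img\b[^>]*>", re.IGNORECASE)
ATTR = re.compile(r'([\w-]+)\s*=\s*"([^"]*)"')

def img_to_embed(match):
    """Turn one <img> tag into an Obsidian embed; attribute order does not matter."""
    attrs = {key.lower(): value for key, value in ATTR.findall(match.group(0))}
    src = attrs.get("src", "")
    # Assumption: only images under ../_resources/ are converted;
    # remote or otherwise unexpected images are left untouched.
    if not src.startswith("../_resources/"):
        return match.group(0)
    name = posixpath.basename(src)
    width, height = attrs.get("width"), attrs.get("height")
    if width and height:
        return f"![[{name}|{width}x{height}]]"
    if width:
        return f"![[{name}|{width}]]"
    # Height alone (or no size at all) falls back to a plain embed.
    return f"![[{name}]]"

def convert(text):
    """Replace every convertible <img> tag in a note's text."""
    return IMG_TAG.sub(img_to_embed, text)
```

Running convert over the contents of each note (after backing up the vault) would rewrite the tags in place; percent-encoded filenames in src, if there are any, are not handled here.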

But, I think the idea of starting with a grep over the files to find all the tags is a good starting point. Then I can determine how many have width/height parameters, and how many don’t.
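
As a rough sketch of that survey step, something like the following Python could stand in for the grep. It assumes the notes are UTF-8 .md files somewhere under one vault folder. The function names are hypothetical, and the attribute check is deliberately crude (it would also count something like data-width as a width).

```python
import re
from collections import Counter
from pathlib import Path

IMG_TAG = re.compile(r"<img\b[^>]*>", re.IGNORECASE)

def classify(tag):
    """Label one <img> tag by which size attributes it carries."""
    has_width = re.search(r"\bwidth\s*=", tag, re.IGNORECASE) is not None
    has_height = re.search(r"\bheight\s*=", tag, re.IGNORECASE) is not None
    if has_width and has_height:
        return "both"
    if has_width:
        return "width only"
    if has_height:
        return "height only"
    return "neither"

def count_tags(text):
    """Count the <img> tags in one note's text, grouped by label."""
    return Counter(classify(tag) for tag in IMG_TAG.findall(text))

def scan_vault(root):
    """Sum the counts over every .md file under root (read-only)."""
    totals = Counter()
    for path in Path(root).rglob("*.md"):
        totals += count_tags(path.read_text(encoding="utf-8", errors="replace"))
    return totals

# Example call, with a placeholder path:
# print(scan_vault("/path/to/vault"))
```

Nothing is modified by this; it only reports how many tags fall into each bucket, which should show whether the no-size and width-only cases are worth handling at all.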

And thanks for mentioning “Clear Unused Images” - that will definitely help with cleaning.