Losing the whitespace inside markdown URLs

What I’m trying to do

I’m moving my notes out of another note-taking app, and some of the resource links have whitespace in them, which breaks them.

I’ve had a little play around in VS Code trying to get a regex to replace all the spaces, but it’s proving quite the headache (Find in Files doesn’t support variable-length lookbehinds).

Anybody found a quick solution for this before I decide to write a Python script to run through my files?

So this is where I got to today. I had some weird moments (the odd file being emptied), but more or less this did the job:

import os
import re

# Matches [link text](url). The non-greedy ".+?" keeps two links on the
# same line separate; the URL part stops at the first ")".
md_url_pattern = r'(\[(.+?)\])\(([^\)]+)\)'

def remove_spacing(match_obj):
    # Replace each run of whitespace in the URL (group 3) with "%20".
    fixed = match_obj.group(1) + "(" + re.sub(r"\s+", "%20", match_obj.group(3)) + ")"
    print("Match Object: " + fixed)
    return fixed


this_folder = '<your notebook root>'
note_path = '<your directory>'
full_path = os.path.join(this_folder, note_path)
directory = os.listdir(full_path)
os.chdir(full_path)

for file in directory:
    # os.listdir also returns sub-directories, which can't be opened as files.
    if not os.path.isfile(file):
        continue
    with open(file, 'r') as open_file:
        read_file = open_file.read()
    read_file = re.sub(md_url_pattern, remove_spacing, read_file)
    if not read_file:
        print("Empty file!")
    else:
        # Only write back when there is content to write.
        with open(file, 'w') as write_file:
            write_file.write(read_file)

This didn’t pick up brackets inside the brackets, and double spacing was collapsed into a single %20, so I had to go and rename those files.
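For anyone hitting the same two limits, here is a rough sketch of one way around them. It assumes "brackets inside the brackets" means parentheses inside the URL (one level deep, like `my file (1).pdf`), and it encodes each whitespace character separately so a double space stays a double `%20`. The names `MD_LINK`, `encode_spaces` and `fix_links` are made up for this example, and I haven't run it against a real vault.

```python
import re

# Hypothetical pattern: link text without "]", then a URL that may contain
# one level of balanced parentheses, e.g. "my file (1) v2.pdf".
MD_LINK = re.compile(r'(\[[^\]]*\])\(((?:[^()]|\([^()]*\))+)\)')

def encode_spaces(match):
    # Encode every whitespace character on its own, so two spaces
    # become "%20%20" instead of collapsing into a single "%20".
    url = re.sub(r"\s", "%20", match.group(2))
    return match.group(1) + "(" + url + ")"

def fix_links(text):
    # Rewrite every Markdown link found in the text.
    return MD_LINK.sub(encode_spaces, text)

print(fix_links("[notes](my  file (1) v2.pdf)"))
# prints: [notes](my%20%20file%20(1)%20v2.pdf)
```

Unlike the script above, this pattern does not allow "]" inside the link text, so it is a trade-off rather than a strict improvement.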

If I could be bothered, I’d get the script to loop through the subdirectories of a directory at some point; for now I’m just getting the job done.
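In case it saves someone the effort, a minimal sketch of the subdirectory loop using `os.walk` could look like this. The helper name `iter_markdown_files` is hypothetical, and filtering on a `.md` extension is an assumption about how the notes are stored.

```python
import os

def iter_markdown_files(root):
    # os.walk visits root and every sub-directory beneath it,
    # yielding (dirpath, dirnames, filenames) for each one.
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # Assumption: notes are stored with a ".md" extension.
            if name.endswith(".md"):
                yield os.path.join(dirpath, name)

# Usage idea: replace "for file in directory:" with
#     for file in iter_markdown_files(full_path):
# The yielded paths already include the folder, so os.chdir is not needed.
```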

Hope this helps somebody at some point. Looping through your files and replacing the text can be pretty destructive, so make sure you have a backup!
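On the backup point, one cheap safety net is to copy each file before overwriting it. This is only a sketch: `write_with_backup`, the `.bak` suffix and the UTF-8 encoding are all assumptions of mine, and it is no substitute for a proper backup of the whole notebook.

```python
import shutil

def write_with_backup(path, new_text):
    # Copy the original next to itself before touching it.
    backup_path = path + ".bak"  # hypothetical naming scheme
    shutil.copy2(path, backup_path)
    # Assumption: the notes are UTF-8 text files.
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(new_text)
    return backup_path
```

If a file does end up emptied, the `.bak` copy sitting beside it still holds the original text.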

There’s a bit more chat here about various regex replace options. If you’re on a PC, Notepad++ would be a decent option for the find/replace.
