Regex help

sweven · February 26, 2023, 3:16pm

Hi All,

I’m importing a massive doc of vocab words & definitions and I’d like to structure it for better readability in Obsidian. I have almost no regex experience.

The old setup

Every vocab word starts on a new line followed by a space, a hyphen, and another space. (But not every new line starts a new word.)

Here’s an example of 3 word entries:

Before

How I want it to look after regex

Thank you!

willasm · February 26, 2023, 4:14pm

For search use ^(.+) - (.+)
and for replace use #### $1\n$2
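
For anyone running this outside a find-and-replace dialog, here is a minimal Python sketch of the same search and replace. The sample text is made up, since the original example entries are not shown. Python's re module writes group references as \1 and \2 rather than $1 and $2, and it needs re.MULTILINE so that ^ matches at the start of every line.

```python
import re

# Hypothetical sample text; the original post's entries are not shown.
text = (
    "apple - a round fruit\n"
    "that grows on trees\n"
    "banana - a long yellow fruit"
)

# Same search as above: anything, then " - ", then the rest of the line.
pattern = re.compile(r"^(.+) - (.+)", re.MULTILINE)

# Same replace as above, in Python syntax: heading line, then definition.
result = pattern.sub(r"#### \1\n\2", text)
print(result)
```

The continuation line "that grows on trees" has no " - " in it, so it is left untouched.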

sweven · February 26, 2023, 4:20pm

Thank you so much. You’ve saved me many hours of manual labor!

holroy · February 26, 2023, 4:36pm

Do note that that regex might gobble up a lot more than you want.

Like what would be the result after doing that regex on:

word - another - Something?
This is another test - of that regex. Should this be picked up?
And what about $ 5 - 3 = 2$, is an example of elementary math.

You might want to reconsider, and choose a slightly more stringent regex.
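
To see the problem concretely, here is a small Python sketch that runs the search ^(.+) - (.+) over the three test lines above and collects what would become the heading (group 1). It uses Python's re module as a stand-in for the editor's regex engine, which is an assumption; the variable names are only illustrative.

```python
import re

sample = """word - another - Something?
This is another test - of that regex. Should this be picked up?
And what about $ 5 - 3 = 2$, is an example of elementary math."""

# The loose pattern: (.+) is greedy, so it runs up to the LAST " - " on a line.
loose = re.compile(r"^(.+) - (.+)", re.MULTILINE)

# Group 1 is what the replace would turn into a "#### ..." heading.
heads = [m.group(1) for m in loose.finditer(sample)]
for head in heads:
    print(repr(head))
```

All three lines match, so each of them would be split into a heading and a body, including the ordinary sentence and the math example.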

holroy · February 26, 2023, 4:43pm

Something like ^(\S+) - (.+) is a whole lot safer; see regex101 (build, test, and debug regex). The link shows the regex in action and displays what it does and does not match.

This matches against non-whitespace characters at the start of the line. In other words, a word not consisting of any whitespace characters (like space, tab, newline, … ).

This would still match whimsical “words” like =")#( - Some combination of non-whitespace characters, but that’s a lot less likely to occur than a random dash in the middle of a sentence or paragraph.
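
A quick way to check the stricter pattern is to run it over the earlier test lines plus the whimsical one. This is a sketch in Python's re module, assumed here to behave like the editor's regex engine for these simple patterns; the sample text and names are illustrative.

```python
import re

sample = """word - another - Something?
This is another test - of that regex. Should this be picked up?
And what about $ 5 - 3 = 2$, is an example of elementary math.
=")#( - Some combination of non-whitespace characters"""

# \S+ only accepts a single run of non-whitespace characters at line start,
# so a line beginning with several words cannot match.
strict = re.compile(r"^(\S+) - (.+)", re.MULTILINE)

matches = [(m.group(1), m.group(2)) for m in strict.finditer(sample)]
for word, definition in matches:
    print(repr(word), "->", repr(definition))
```

Only the first and last lines match: the sentence and the math example are skipped, while the odd symbol "word" is still picked up. Multi-word entries such as "ice cream - ..." would also be skipped by this pattern, which may or may not be what you want.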

willasm · February 26, 2023, 4:45pm

I agree with holroy. The example I gave you will work fine for the example you provided, but it most likely would produce unexpected results if this were applied to your entire vault.
