Help wanted with refactoring links

jonr · November 16, 2022, 9:22am

Hello,
I wasted a fair bit of time during the year creating & maintaining a glossary for my first year of study in my degree. Towards the later part of the year I came to learn that I could link directly to the macOS dictionary app via [term](dict://term). I also managed to successfully install a medical terms dictionary into the app.

My study vault now contains a mixture of links to my own glossary & those to the dictionary app. I’d like to standardise them all to the dictionary app. This will require refactoring links from [[_terms/Term|term]] to [term](dict://term).

Without the pipe & alias I could probably achieve this with search & replace that changes [[_terms/Term]] to [term](dict://term) but the pipe & alias complicates things. If I could replace |term]] with nothing using regex I could do that but I would need to restrict it to only links to the _terms subfolder, otherwise it’ll ruin other normal inter-note links.

Am I screwed?

holroy · November 16, 2022, 9:46am

Could you elaborate on what platform and which editors you’ve got? It shouldn’t be that hard to write a regex to do this for you, given the proper tools.

If you also give some actual examples of terms to be replaced and their replacements, I reckon someone will be able to assist you.

I-d-as · November 16, 2022, 9:51am

Yea. It definitely seems likely it would be possible. I used a regex search and replace to do the reverse here: Note Composer: links to blocks and headers should be updated when extracting - #10 by I-d-as

jonr · November 16, 2022, 9:53am

Hi, I’m just reading up on capture groups in VS Code. I’m on macOS and usually use VS Code for search & replace.

Some examples are (they’re all words, no special characters or symbols or numbers):
[[_terms/Pyrexia]]
[[_terms/Thrombophilias|thrombophilias]]
[[_terms/Miasma|miasmas]]

holroy · November 16, 2022, 9:56am

Does the new link type accept both uppercase and lowercase terms, or do they need to be preserved?

What does your regex look like, so far?

(Btw, I’m also on mac using VS code for search and replace)

jonr · November 16, 2022, 10:01am

I’m very much a beginner at regex, just reading through @I-d-as linked post above.
Dictionary links are case-insensitive. All entries in the dictionary for non-proper-nouns are lower case. Both of the below work:

[eukaryotic](dict://eukaryotic)
[eukaryotic](dict://Eukaryotic)

jonr · November 16, 2022, 10:05am

My search pattern seems to be picking up all links

I-d-as · November 16, 2022, 10:07am

I’m thinking this could be accomplished in 2 steps. First, to go through and replace the |term part of all links that start with [[_terms/ with nothing. But it is important that all your links are properly closed with ]], otherwise it could be destructive. Then the second step would be doing the regex search and replace using capture groups. I want to double check that it works before I post what I’m thinking.

holroy · November 16, 2022, 10:12am

It shouldn’t be a problem to do this in just one step. If the regex is specific enough, and you don’t necessarily do replace all, it should be safe to do.

jonr · November 16, 2022, 10:15am

I have a backup
But I don’t understand, if you do it in two steps as you describe, how to distinguish |term in my glossary links from |alias in my normal links. The path _terms/ is the differentiator.

anon63144152 · November 16, 2022, 10:17am

Is there a way to get [term](dict://term) links to work on iOS?

I-d-as · November 16, 2022, 10:17am

Maybe search \[\[terms\/(.*)\|.*\]\] and replace with [$1](dict://$1) the first time around thus taking care of those with pipes. Then the second round search \[\[terms\/(.*)\]\] and replace [$1](dict://$1) but this is just a guess. On my phone and can’t really test.
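For anyone who would rather script it than use an editor, the two-pass idea can be sketched in Python roughly like this. It is only a sketch under assumptions: the folder is written as _terms (as in the example links earlier in the thread), there is at most one link per line (the greedy .* would otherwise run across several links), and the name two_pass is made up for illustration.

```python
import re

# Assumption: the glossary folder is "_terms", as in the example links.
# Pass 1 targets links with a pipe and alias, pass 2 the remaining plain links.
PIPED = re.compile(r"\[\[_terms/(.*)\|.*\]\]")
PLAIN = re.compile(r"\[\[_terms/(.*)\]\]")

def two_pass(line: str) -> str:
    """Convert one line; assumes at most one glossary link on the line."""
    # Pass 1: [[_terms/Term|alias]] -> [Term](dict://Term)
    line = PIPED.sub(r"[\1](dict://\1)", line)
    # Pass 2: [[_terms/Term]] -> [Term](dict://Term)
    return PLAIN.sub(r"[\1](dict://\1)", line)

if __name__ == "__main__":
    for example in ("[[_terms/Pyrexia]]", "[[_terms/Miasma|miasmas]]"):
        print(two_pass(example))
```

Links outside _terms/ are left alone because both patterns require the folder prefix.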

jonr · November 16, 2022, 10:19am

This doesn’t fuss me. I have access to my vault on my phone but don’t really use it for anything strenuous. I always have my laptop.

anon63144152 · November 16, 2022, 10:21am

Was being entirely selfish and asking for my own needs.

jonr · November 16, 2022, 10:26am

OMG!

Regex is \[\[_terms/(.*)\|(.*)\]\]
Replace is [$1](dict://$1)

jonr · November 16, 2022, 10:27am

I just tested it on my phone and no bueno, link doesn’t respond

I-d-as · November 16, 2022, 10:30am

So are you saying it kind of works but perhaps the capitalization is causing issues?

holroy · November 16, 2022, 10:37am

So here goes nothing:

search for: \[\[_terms/([^\|\]]*)(?:\|[^\]]*)?\]\]
replace with: [\l$1](dict://\L$1)

What this aims to do is search for something which in order is:

\[\[_terms/ – Always starts with this text
([^\|\]]*) – a capture group of anything up until either a | or a ] character, properly escaped (which makes the regex a lot harder to read… )
(?:\|[^\]]*)? – an optional non-capturing group with the lead-in character of | followed by anything until the next ] character
\]\] – always ending with a double bracket

When it comes to the replacement part it’s simpler:

[\l$1] – Place the first capture group within the brackets as the name of the link, with the first character lowercase, \l
(dict://\L$1) – And then the actual link also with the first capture group, with the entire group lowercased, \L

Feel free to interchange or use whatever to lowercase, either both or none of them. Just wanted to place them there to show some options.

Preserving the case of your original term would be a little harder, as one would need to change the capture group to be the latter part of the regex. It is possible, but harder. The proposed solution might cause a few issues when your search term is at the start of a sentence and you do want it to start with an uppercase character.
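One possible reading of that idea, sketched in Python: capture the alias as a second group and use it as the link text whenever it exists, so whatever case was typed in the note is kept. This is an assumption about what "preserving" should mean here, and keep_alias and convert_keep_case are hypothetical names.

```python
import re

# Variant of the pattern where the alias after the pipe is captured as group 2.
LINK = re.compile(r"\[\[_terms/([^\|\]]*)(?:\|([^\]]*))?\]\]")

def keep_alias(match: re.Match) -> str:
    term, alias = match.group(1), match.group(2)
    # Use the alias as written in the note; fall back to the term itself.
    label = alias if alias else term
    # The dictionary target still comes from the term, fully lowercased.
    return f"[{label}](dict://{term.lower()})"

def convert_keep_case(text: str) -> str:
    return LINK.sub(keep_alias, text)

if __name__ == "__main__":
    print(convert_keep_case("[[_terms/Miasma|Miasmas]] were blamed."))
```

A side effect is that a plural alias such as miasmas still points at the singular entry, since the target is built from the note name rather than the alias.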

A final note on the usage of character classes, i.e. [^\|\]]* and [^\]]*, instead of the ordinary wild card, .: I tend to use this more specific approach to keep my regexes from getting overly greedy and expanding into other links that could occur on the same line. Just a matter of precaution, which I’ve gotten accustomed to.
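A small Python sketch of the greediness issue, using a made-up line that holds one glossary link and one ordinary link. The sample line and variable names are only for illustration.

```python
import re

# Wild-card version (as posted earlier in the thread) versus the character-class version.
GREEDY = re.compile(r"\[\[_terms/(.*)\|(.*)\]\]")
SPECIFIC = re.compile(r"\[\[_terms/([^\|\]]*)(?:\|[^\]]*)?\]\]")

# Hypothetical line with a glossary link followed by a normal inter-note link.
line = "See [[_terms/Miasma|miasmas]] and [[Germ theory|germs]]."

# Collect what each pattern captures as the "term".
greedy_terms = [m.group(1) for m in GREEDY.finditer(line)]
specific_terms = [m.group(1) for m in SPECIFIC.finditer(line)]

if __name__ == "__main__":
    print(greedy_terms)    # the capture runs on into the second link
    print(specific_terms)  # the capture stops at the first | or ]
```

With the wild card, the first group swallows everything up to the last | on the line, so a replace would eat the normal link as well. The character classes stop at the first | or ].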

Hope this helps,
Holroy

anon63144152 · November 16, 2022, 10:38am

Indeed.

Saw your first post: thrilled.

Tried it on my MBA: even more thrilled.

Tried it on my iPhone: %$£#

Searched Google: nada.

Was hoping you or someone in the thread might know how to get this to work across OS platforms.

But still fairly thrilled. Grateful for the tip / share. Useful as the MBA is my main device.

Very neat regex sorcery. Squirrelled that away for a rainy day.

I-d-as · November 16, 2022, 10:40am

Since it is somewhat relevant here, you might be interested in this document I made: Regular Expressions in Obsidian (copied from mdn web docs)