Regex as an Obsidian tool

Hey all,

(This is my first time posting on the obsidian forum, so if this isn’t the right location for this, apologies in advance.)

I’m p34ch. I’m really passionate about knowledge management and think the Obsidian community is really special (imo, it’s a large portion of Obsidian’s value as a tool). So I’m excited to start interacting with some of you all! :smiley:

Anyways, I wanted to talk about a use case I had for regex today. Maybe it will be helpful to some people here. I think regex works really well when combined with Obsidian. As we all know, one of the defining characteristics of Obsidian is that everything is text, which has a lot of benefits (data interoperability, lack of vendor lock-in, etc.). One of those text-based benefits is that you can use all the powerful text editing tooling that has existed since way back when (think early Unix days).

One of those tools is regex (for those who aren’t familiar, regex is a way to define a pattern of text that you want to search for. It’s extremely powerful, especially when used to make large-scale refactors to notes). Well, today, I was experimenting with a new workflow for content analysis of blog articles, and I wanted a way to extract all the highlights I made from my annotated markdown file. I searched the community plugin marketplace, and the main plugin for this doesn’t appear to support highlights made with HTML mark elements.

Luckily, this was extremely easy to do with regex. Basically, I just used regex to find all my mark elements (i.e. <mark .*>.*</mark>), then selected and copied them all. Super easy. Here’s a gif (note: I opened my vault in IntelliJ, which is where I do a lot of my large-scale text refactoring. I’m sure VSCode has the same feature.)

[GIF: 2023-02-22 14.36.07]
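
For anyone who would rather script this than use an IDE's find dialog, here is a minimal Python sketch of the same idea. It is not the exact pattern from the post: it swaps in a non-greedy variant so that two highlights on the same line come out as two separate matches, and it captures only the text inside the tags. The function name and the sample line (including the style attribute) are made up for illustration.

```python
import re

# Non-greedy variant of the <mark ...>...</mark> search described above.
# [^>]* keeps the opening tag from running past its own ">", and (.*?)
# stops at the first closing </mark> instead of the last one on the line.
MARK_RE = re.compile(r"<mark\b[^>]*>(.*?)</mark>", re.DOTALL)


def extract_marks(text: str) -> list[str]:
    """Return the text inside every <mark> element, in document order."""
    return [inner.strip() for inner in MARK_RE.findall(text)]


# Hypothetical sample line; real attribute values will differ per vault.
sample = (
    'Intro <mark style="background: #FFF3A3;">first highlight</mark> '
    'and <mark class="x">second one</mark> end.'
)

print(extract_marks(sample))
```

With the original greedy pattern, a line like the sample would come back as one long match spanning both highlights, which is fine when each highlight sits on its own line but worth knowing about otherwise.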

In the future, I’d like to post about some use cases I’ve had for more advanced regex features (like numbered capture).

I read part 1. This is a very cool idea! I love the concept of using shell commands to format content, then executing the shell command from inside Obsidian (using Templater). I saw someone use GPT for this (basically passed in the unstructured text and told GPT to format it as a table). That’d be cool too.

I’ve recently been writing a lot of custom JavaScript to automate parts of my text processing pipeline (namely, computing note analytics, which I find useful for weekly reviews and discovering emergent patterns). I’m hoping to put up a blog post soon :smiley:

I am a heavy regex user from time to time. However, I do not need or want regex inside Obsidian.

When I am doing the kinds of rewrites regex helps most with, I drop into VSCode. It has all the raw text power I’d want or need.

My time inside Obsidian itself is qualitatively different.

Yea, the title might be slightly misleading. “Using regex to do text manipulation on text files you primarily interact with in Obsidian” is more accurate. I actually mention doing the actual regex in an IDE here (and the GIF is IntelliJ):

I also don’t care to have regex in Obsidian, though I don’t think I feel as strongly as you do. What do you mean by “time inside Obsidian itself is qualitatively different”?

When I’m inside Obsidian I’m all about information, concepts, ideas, etc.

When I’m in my text editor I’m about transformation and text management.

In Obsidian, I imagine most people will be using ==text== for highlighting, rather than the HTML method.

Yea, the highlighter extraction plugin uses that syntax. But I’m using the Highlightr plugin (gives access to more colors).
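
For comparison, the ==text== syntax can be pulled out with the same kind of regex. This is only a rough sketch with a made-up function name and sample note, and it assumes each highlight sits on a single line. It does not try to skip code blocks or handle other edge cases.

```python
import re

# Matches ==highlighted text==; the non-greedy group stops at the first
# closing "==" so adjacent highlights are returned separately.
HIGHLIGHT_RE = re.compile(r"==(.+?)==")


def extract_highlights(text: str) -> list[str]:
    """Return the text inside every ==...== pair, in document order."""
    return HIGHLIGHT_RE.findall(text)


# Hypothetical note content for illustration.
note = "Plain text, ==one highlight==, more text, ==another==."

print(extract_highlights(note))
```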

My attitude is a little different…

IDEs provide specialized text editing capabilities. Some of them would be very out-of-place to include in Obsidian. But others are general-purpose and greatly enhance productivity. These abilities feel on par with ‘being able to type quickly’.

For example, multi-line editing comes with Obsidian (hold ALT or OPTION, then click). And previously, I shared about leveraging command line integration.

Obsidian for me is a place to think and to operate on information. Information is transformed and not in a state of permanent rest.

When I have a block of text that needs “dicing up”, I choose the fewest steps to extract out the key information. Without these capabilities it wouldn’t be worth the time to do by hand. But with them, it’s quick and worthwhile.

Just my 2 cents :smiley: