Hashtags in frontmatter

Yurcee · May 21, 2024, 2:04pm

hi

i used the script on one of my vaults and was working…

possibly, you have a space or two after tags:??

then use:

import os
import platform
import re

# Set root directory based on platform
if platform.system() == "Windows":
    root_directory = r'C:\Users\<username>\Documents\Obsidian\TEST'
elif platform.system() == "Linux":
    root_directory = '/home/<username>/Documents/Obsidian/TEST'
else:
    # Placeholder for macOS
    root_directory = '/path/to/Macintosh/Obsidian/TEST'

# Regex to find the tags section and ensure section ends when a line does not start with '  - '
tags_section_regex = re.compile(r'(tags:\s{0,8}\n(?:\s{2}- .+\n)*)(?=\S)')

# Regex to add # and quotes unless they have been dealt with before
tag_value_regex = re.compile(r'(\s{2}- )(?!"#)(.+)')

# Function to process a single file
def process_file(filepath):
    with open(filepath, 'r', encoding='utf-8') as file:
        content = file.read()
    
    # Find and process each tags section
    def replace_tags_section(match):
        tags_section = match.group(1)
        modified_tags_section = tag_value_regex.sub(r'\1"#\2"', tags_section)
        return modified_tags_section

    new_content = tags_section_regex.sub(lambda match: replace_tags_section(match), content)

    with open(filepath, 'w', encoding='utf-8') as file:
        file.write(new_content)

# Walk through all files in the root directory
for root, dirs, files in os.walk(root_directory):
    for file in files:
        if file.endswith('.md'):  # Assuming markdown files
            process_file(os.path.join(root, file))

print("Processing complete.")

changed tags_section_regex to allow some spaces after tags:

try now if you will!!
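A quick way to check the two regexes without touching any vault files is to run them on a sample string. This is only a minimal sketch: `sample` and `add_hashes` are illustrative names that are not part of the script, and it assumes list items indented with exactly two spaces, as the regexes expect.

```python
import re

# Same patterns as in the script above
tags_section_regex = re.compile(r'(tags:\s{0,8}\n(?:\s{2}- .+\n)*)(?=\S)')
tag_value_regex = re.compile(r'(\s{2}- )(?!"#)(.+)')

# Hypothetical frontmatter with two trailing spaces after "tags:"
sample = "---\ntags:  \n  - inner_life\n  - outer_world\nlinks: something\n---\n"

def add_hashes(text):
    # Quote and prefix every list item inside each matched tags section
    return tags_section_regex.sub(
        lambda m: tag_value_regex.sub(r'\1"#\2"', m.group(1)), text
    )

result = add_hashes(sample)
print(result)
```

Running `add_hashes` a second time on its own output should leave it unchanged, because the `(?!"#)` lookahead skips items that are already quoted.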

thehawk777 · May 21, 2024, 10:51pm

Thanks but still not working. I’m using Windows. I copied your code, changed “C:\Users\<username>\Documents\Obsidian\TEST” to my path and double checked it, saved it as tags.py and ran the command in the right folder in PowerShell, “python tags.py”. I got the message “Processing complete.” but nothing happened. Here’s an example of my frontmatter after I ran the script:
"tags:

inner_life
outer_world
links: Erving Goffman's Front Stage and Back Stage Behavior"

Yurcee · May 21, 2024, 11:59pm

hmmm, the code currently doesn’t have any error handling
so it’s difficult (for you) to see what’s going on

i recommend taking the advice of the ai you are using
ask it to add error handling and fix what’s wrong with the code (feed what you are using now to it)
you can be good to go in about 3 mins i think
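One way to see why "nothing happened" is to check what the regex actually finds in a note before writing anything back. The sketch below is an assumption about what might be useful, not part of the script: `diagnose` and `list_item_regex` are hypothetical names, and only the section pattern is copied from the thread.

```python
import re

# Same section pattern as the script in this thread
tags_section_regex = re.compile(r'(tags:\s{0,8}\n(?:\s{2}- .+\n)*)(?=\S)')
# Hypothetical helper pattern: one two-space-indented list item per line
list_item_regex = re.compile(r'^\s{2}- .+$', re.MULTILINE)

def diagnose(content):
    """Describe what the script would see in one note's text."""
    match = tags_section_regex.search(content)
    if match is None:
        return "no multi-line 'tags:' section matched"
    items = list_item_regex.findall(match.group(1))
    if not items:
        return "'tags:' matched, but no '  - ' list items under it"
    return f"{len(items)} list item(s) found"
```

Calling `diagnose` on the text of a note that was left untouched should show whether the `tags:` line itself failed to match (for example, tags written on the same line) or whether the list items are not indented the way the regex expects.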

thehawk777 · May 22, 2024, 12:18am

OK. Thanks for your help.

thehawk777 · May 22, 2024, 1:59am

Eureka! I did what you suggested and took the code to ChatGPT-4. It made some minor tweaks and it worked!
One more question. Some of the tags do not have " -" before them but are on the same line as “tags:”, e.g. tags: tag1 tag2 tag3. They were left untouched. Could you possibly give me the regex for that?
Here is the ChatGPT code:

import os
import platform
import re

# Function to get the user's home directory
def get_home_directory():
    return os.path.expanduser("~")

# Set root directory based on platform
if platform.system() == "Windows":
    root_directory = r'C:\Users\<username>\Documents\Obsidian\TEST'
elif platform.system() == "Linux":
    root_directory = os.path.join(get_home_directory(), 'Documents', 'Obsidian', 'TEST')
else:
    # Placeholder for macOS
    root_directory = os.path.join(get_home_directory(), 'Documents', 'Obsidian', 'TEST')

# Regex to find the tags section and ensure section ends when a line does not start with '  - '
tags_section_regex = re.compile(r'(tags:\s*\n(?:\s{2}- .+\n)*)(?=\S|$)', re.MULTILINE)

# Regex to add # and quotes unless they have been dealt with before
tag_value_regex = re.compile(r'(\s{2}- )(?!"#)(.+)')

# Function to process a single file
def process_file(filepath):
    try:
        with open(filepath, 'r', encoding='utf-8') as file:
            content = file.read()
        
        # Find and process each tags section
        def replace_tags_section(match):
            tags_section = match.group(1)
            modified_tags_section = tag_value_regex.sub(r'\1"#\2"', tags_section)
            return modified_tags_section

        new_content = tags_section_regex.sub(lambda match: replace_tags_section(match), content)

        with open(filepath, 'w', encoding='utf-8') as file:
            file.write(new_content)
        print(f"Processed: {filepath}")

    except Exception as e:
        print(f"Failed to process {filepath}: {e}")

# Walk through all files in the root directory
for root, dirs, files in os.walk(root_directory):
    for file in files:
        if file.endswith('.md'):  # Assuming markdown files
            process_file(os.path.join(root, file))

print("Processing complete.")

Yurcee · May 22, 2024, 10:54am

i think it’s better to make the formatting of your tags values standard throughout your vault

now you can use linter plugin to convert your tags arrays separated by a space to the list format that the python script can currently handle

so install linter and in its settings do the following:
in first tab in the settings:

in the second tab find this setting as well and turn it on:

i think this should do it - if not, try looking around in the first two tabs and try switching some more stuff on…

now you can run lint on folder or even full vault from the command palette and the other types of tags will have been converted to list type with dashes and you can run your python script

thehawk777 · May 22, 2024, 12:53pm

OK. That sounds good but the standard formatting I’m looking for throughout my vault is just #tag . Is that not good enough? Your solution will yield that so seems like a good way to go but is there something I’m missing?

Yurcee · May 22, 2024, 12:56pm

you may have more than one tag for a note
it’s better to go by Obsidian standards with multi-line formatting if you have more than one
looks better also than dividing tags by a space

but before you do full-vault linting, keep in mind that it is a potentially destructive action
so depending on the default linter settings, it may make changes you do not want (although the developer is unlikely to ship default settings that do something users won’t like, right)
so before doing a full-scale lint, just try linting the current file or experiment with linter settings in a test vault
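Since both a vault-wide lint and the Python script rewrite files in place, making a copy of the vault first is a cheap safeguard. A minimal sketch, assuming plain folders on disk; `backup_vault` and both path arguments are hypothetical names, not something from Linter or the script.

```python
import shutil
from pathlib import Path

def backup_vault(vault_dir, backup_dir):
    """Copy the whole vault folder to backup_dir.

    shutil.copytree raises FileExistsError if backup_dir already exists,
    so an earlier backup is never silently overwritten.
    """
    return Path(shutil.copytree(vault_dir, backup_dir))
```

The copy can then double as the test vault for trying out Linter settings.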

thehawk777 · May 22, 2024, 1:21pm

Many of my tags are not in the frontmatter and I don’t plan to put any more there. This whole thing is just so the ones that are already there are searchable in other programs.

Yurcee · May 22, 2024, 1:23pm

those that are not in the frontmatter are already prepended by a hashtag, so no need to change those

how are you getting along with linter?
make the setting changes shown in the screenshots and lint one file where tags follow one another with a space
they will be put below one another nicely and you can run the python script again

thehawk777 · May 22, 2024, 2:17pm

I have used Linter very sparingly in the past. I don’t even have it enabled right now. But I’ll give your plan a shot on a test vault to get more experience with Linter if nothing else.
Since I’m not putting any new tags in the frontmatter and because I don’t care about how the existing ones look, would it not be simpler to just change the python script and avoid potential problems with Linter? It should be pretty easy to edit the script with AI.

Yurcee · May 22, 2024, 2:35pm

my recommendation is to make the tags value format standard throughout the vault
also, regexes that deal with an unknown number of elements in arrays are a pain
also I’m not sure what the correct formatting would be?
maybe:

tags: ["#glory", "#wonder", "#someothertag"]

personally, i wouldn’t want to go down this road

factory linter should not make unwanted changes in your vault but i felt it necessary to remind you

so my second recommendation is to start with factory-settings linter, make the settings changes shown in the screenshots to do with tags

then go to a note where you have these types of arrays

tags: glory wonder someothertag

Press ctrl+p for command palette, lint current file
you see the changes in the source mode

tags:
  - glory
  - wonder
  - someothertag

if you are satisfied with how linter is working, you can run full-scale linter on all the vault

by the changes you’ve made in the settings, only the tags will be changed the way i’ve just shown

now you can quit obsidian and run the python script as you did before

you should see in source mode now:

tags:
  - "#glory"
  - "#wonder"
  - "#someothertag"

consequently, in the file properties pane you will not see any errors as this is also correct formatting by obsidian standards and achieves what you’ve set out to do
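For anyone who would rather skip the Linter step, the same end result could probably be reached in one Python pass. This is only a sketch under stated assumptions: tags sit on one line after `tags:`, are separated by spaces, and contain no commas; `inline_tags_regex` and `inline_to_quoted_list` are hypothetical names, and bracketed arrays are deliberately left alone. Like the script in this thread, it matches any line starting with `tags:`, not only lines inside the frontmatter.

```python
import re

# Hypothetical pattern: 'tags:' followed by values on the same line
inline_tags_regex = re.compile(r'^tags:[ \t]+(\S.*)$', re.MULTILINE)

def inline_to_quoted_list(content):
    """Turn 'tags: a b c' into a multi-line list of quoted '#' tags."""
    def expand(match):
        value = match.group(1)
        if value.startswith('['):
            return match.group(0)  # leave bracketed arrays untouched
        lines = ['tags:']
        for tag in value.split():
            bare = tag.lstrip('#')  # avoid doubling an existing '#'
            lines.append('  - "#' + bare + '"')
        return '\n'.join(lines)
    return inline_tags_regex.sub(expand, content)
```

Notes whose tags are already in list form have nothing after `tags:` on that line, so the pattern does not match them and they pass through unchanged.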

thehawk777 · May 22, 2024, 2:55pm

OK, great. Thanks for patiently explaining all this. I’ll give it a whirl.

Yurcee · May 22, 2024, 3:00pm

no worries, i’m at my computer now, fiddling with my own mess

anything to do with tags and linter is bound to be educational, as this is a public thread

the only downside is you all got me to help out :))

thehawk777 · May 22, 2024, 3:12pm

And thankful I did!

thehawk777 · May 22, 2024, 3:13pm

“you all”? You must be in the southern States?

Yurcee · May 22, 2024, 3:18pm

you all, you guys, whatever

english has no standard plural you, so…

so you did it? managed to make it work?

thehawk777 · May 22, 2024, 3:48pm

Haven’t got to it yet but hopefully will later today

thehawk777 · May 23, 2024, 1:25pm

Happy to report that it worked perfectly. I had to change one more Linter setting i.e. toggle on Format Yaml Array on second tab. I hadn’t given up hope that I could do it so really appreciate your perseverance and knowledge. All the best.

Yurcee · May 23, 2024, 1:41pm

glad it worked!!

any other interesting use case you have, do not hesitate to ask

it’ll help other people too

best regards