Python script to use OpenAI model to identify similar tags for

Hi all!

Have you ever found tags in your vault accidentally mean the same thing ? Like #rice_dish, #rice_dishes, #rice etc? I wrote a python script that can automatically discover that for you!

I wrote a script that takes all your tags and cleans them up, so an OpenAI embeddings model can understand them. Then it uses the embeddings model to create embeddings for them. Then it works out which pairs of tags have embeddings close to each other. By seeing which embeddings are close together, you can identify duplicate tags, or tags that accidentally mean the same thing.

full explainer on blog here: How to use AI to clean up your tags in obsidian (with python code) – Evolving Impact

Code here: GitHub - Thomas-Richardson/obsidian_tag_similarity

1 Like