CLI script to make batch updates to Obsidian vault metadata

grefft · January 23, 2023, 2:14pm

I just released a CLI I wrote named obsidian-metadata. This script makes it easy to make batch changes to an Obsidian Vault’s metadata (frontmatter, inline metadata, and tags). I was inspired by the work of u/srbd7 on Reddit who posted a similar library a few months ago but I found it didn’t fully meet my needs.

What does it do?

I find myself refactoring the way my Obsidian vault is structured more often than I’d like. I make heavy use of frontmatter and inline metadata to organize my notes with Dataview. Every time I refactored, I was spending too much time doing search/replace operations to change metadata keys and values throughout the vault.

I wrote this script to automate this refactoring. After using it extensively myself, I thought I’d release it to the community. I have many plans to add functionality to this script. For the moment, it provides the following capabilities:

in-text tags: Delete every occurrence
in-text tags: Rename a tag (#tag1 → #tag2)
frontmatter: Delete a key matching a regex pattern and all associated values
frontmatter: Rename a key
frontmatter: Delete a value matching a regex pattern from a specified key
frontmatter: Rename a value from a specified key
inline metadata: Delete a key matching a regex pattern and all associated values
inline metadata: Rename a key
inline metadata: Delete a value matching a regex pattern from a specified key
inline metadata: Rename a value from a specified key
vault: Create a backup of the Obsidian vault
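
To make the tag rename in the list above concrete, here is a minimal sketch of how an in-text tag could be renamed with a regular expression. It is not taken from obsidian-metadata's source; the function name rename_tag and the matching rules are assumptions for illustration, and it does not skip code blocks or frontmatter.

```python
import re


def rename_tag(text: str, old: str, new: str) -> str:
    """Rename #old to #new (hypothetical helper, not the tool's real code)."""
    # (?<![\w/#]) : the '#' must not be glued to a preceding word or slash,
    #               so URL fragments like page#tag1 are left alone.
    # (?![\w/-])  : the tag must end here, so #tag1 does not match inside
    #               #tag10 or the nested tag #tag1/sub.
    pattern = re.compile(rf"(?<![\w/#])#{re.escape(old)}(?![\w/-])")
    # A function replacement avoids backslash escapes in `new` being interpreted.
    return pattern.sub(lambda m: f"#{new}", text)


note = "Ideas for #tag1 and #tag10, see also #tag1/sub."
print(rename_tag(note, "tag1", "tag2"))
# prints: Ideas for #tag2 and #tag10, see also #tag1/sub.
```

The lookarounds are the important design choice: a plain search/replace of "#tag1" would also rewrite "#tag10", which is exactly the kind of accident that makes manual vault-wide edits risky.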

To install, you must have Python v3.10 or above. Then run:

pip install obsidian-metadata

Beware, this is early release software. I am new to Python programming and I can’t guarantee that it will work as expected. I’ve tested it thoroughly on my own vault but that does not guarantee it will work as intended on yours. I strongly recommend you use it on a copy of your vault to avoid making any inadvertent changes to your data.
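
If you want a quick way to make that copy, here is a minimal sketch using only the Python standard library. The function name copy_vault and the timestamped folder naming are assumptions for illustration; this is separate from the script's own backup feature.

```python
import shutil
from datetime import datetime
from pathlib import Path


def copy_vault(vault: Path, dest_root: Path) -> Path:
    """Copy a whole vault folder into dest_root and return the new path.

    Hypothetical helper: the name and the timestamp suffix are illustrative.
    """
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    target = dest_root / f"{vault.name}-copy-{stamp}"
    # copytree copies everything, including the hidden .obsidian settings
    # folder, and raises an error if the target already exists.
    shutil.copytree(vault, target)
    return target
```

Calling copy_vault(Path("/path/to/vault"), Path("/path/to/scratch")) with your own paths gives you a throwaway folder to experiment on while the original notes stay untouched.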

I’d love your thoughts on improving this. You can learn more at GitHub - natelandau/obsidian-metadata: Batch updates to metadata in an Obsidian vault

gapmiss · January 23, 2023, 4:37pm

I have tested your script against a small demo vault and it worked as directed. I’m curious how it would handle parsing the metadata of a larger vault (1000+ notes) at load time.

Testing a larger vault would require editing the config file and changing the path to the vault, which led to the question: how difficult would it be to point the app at a different path at runtime, or to add the ability to manage multiple vault paths? That in turn led me to look at the source code, where I saw that you can do just that through the script’s options.

If you run obsidian-metadata --help you can see all the options available.

--vault-path        PATH     Path to Obsidian vault
--config-file       PATH     Specify a custom path to a configuration file
--dry-run      -n            Dry run - don't actually change anything
--log-file          FILE     Path to log file [default: /path/to/logs/obsidian_metadata.log]
--log-to-file                Log to file
--verbose      -v   INTEGER  Set verbosity level (0=WARN, 1=INFO, 2=DEBUG, 3=TRACE)
--version                    Print version and exit
--help                       Show this message and exit.

For myself, the multiple vaults capability is the icing on the cake. Your script is now integrated into my workflow. Looking forward to slowly testing it out on “real” data and will report any new findings.

Thank you

grefft · January 23, 2023, 5:25pm

@gapmiss Thanks for testing the script and posting your thoughts.

Adding support for multiple vaults should be relatively easy. I only have one vault so it didn’t occur to me to add that functionality. Thanks for the suggestion. To your point, you can specify a different vault at runtime using the --vault-path option. That said, I will look into adding support for multiple vaults in the configuration file, which will make for a more extensible solution.

As to your question about working with larger vaults, I’ll need to do testing and get back to you. My own personal vault has ~500 notes in it. The script works fine at that level but I’ll spend some time and create a test vault with significantly more data and do whatever work needs to happen to ensure the script works as expected.

grefft · January 26, 2023, 2:25pm

Quick follow-up: I just updated the script to v0.2.0, which allows multiple vaults to be specified in the config file.

gapmiss · January 28, 2023, 3:03pm

I did some testing w/ multiple vaults config. The config setup is easy and your UI changes work.

I used a vault w/ ~13K notes in 10 folders. Your script indexed the vault in less than a minute. It felt speedy for such a large vault.

An initial test to rename a tag led to additional questions regarding metadata.

initial metadata
tags: [11ty]

replace metadata
key: tags value: 11ty
action: replace w/ eleventy
expected: [eleventy]
results: eleventy

attempt to reverse back to array
action: replace metadata
key: tags value: eleventy
action: replace w/ [11ty]
expected: [eleventy]
results: '[eleventy]'

How do you handle yaml types? Specifically arrays. What type of markup does your script expect for an array? How does it handle strings, integers?

I know YAML has multiple formats for arrays and doesn’t require double quotes for strings. Is there a YAML markup standard that is used by your code that would answer these questions?

I will continue testing as use-cases arise.

Thank you.

grefft · January 31, 2023, 1:59pm

Thanks for the continued testing and questions.

The YAML library I’m using to (re)write the frontmatter is ruamel.yaml. Currently I’m not too concerned about how it writes its arrays or quotes its strings, which may result in it changing the formatting used in a vault. In its current configuration:

This

---
key: [1, 2]
---

will become

---
key:
  - 1
  - 2
---

There are a number of Obsidian plugins, such as Obsidian Linter, which will reformat a user’s frontmatter. Consequently, I am not too focused on this.

Just released an update that allows adding frontmatter. There’s a roadmap issue I’m using to track progress towards v1.0. Many more features to come.