I just released a CLI I wrote named obsidian-metadata. This script makes it easy to make batch changes to an Obsidian vault’s metadata (frontmatter, inline metadata, and tags). I was inspired by the work of u/srbd7 on Reddit, who posted a similar library a few months ago, but I found it didn’t fully meet my needs.
What does it do?
I find myself refactoring the way my Obsidian vault is structured more often than I’d like. I make heavy use of frontmatter and inline metadata to organize my notes with Dataview. Every time I refactored, I was spending too much time doing search/replace operations to change metadata keys and values throughout the vault.
I wrote this script to automate this refactoring. After using it extensively myself, I thought I’d release it to the community. I have many plans to add functionality to this script; for the moment, it provides the following capabilities:
in-text tags: Delete every occurrence of a tag
in-text tags: Rename a tag (#tag1 → #tag2)
frontmatter: Delete a key matching a regex pattern and all associated values
frontmatter: Rename a key
frontmatter: Delete a value matching a regex pattern from a specified key
frontmatter: Rename a value from a specified key
inline metadata: Delete a key matching a regex pattern and all associated values
inline metadata: Rename a key
inline metadata: Delete a value matching a regex pattern from a specified key
inline metadata: Rename a value from a specified key
vault: Create a backup of the Obsidian vault
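For readers curious what a tag rename amounts to, here is a minimal Python sketch. It is not taken from obsidian-metadata's source: `rename_tag` is a hypothetical helper, and the set of characters treated as part of a tag (letters, digits, `_`, `-`, `/`) is my assumption about Obsidian's tag syntax. It also makes no attempt to skip code blocks or frontmatter.

```python
import re


def rename_tag(text: str, old: str, new: str) -> str:
    """Rename the in-text tag #old to #new without touching longer tags.

    Hypothetical helper for illustration; not the tool's implementation.
    """
    pattern = re.compile(
        rf"(?<![\w/#-])"       # not preceded by a tag character or another '#'
        rf"#{re.escape(old)}"  # the tag itself, matched literally
        rf"(?![\w/-])"         # not followed by a character that extends the tag
    )
    # A function replacement keeps backslashes in `new` from being read as escapes.
    return pattern.sub(lambda _match: f"#{new}", text)


note = "Ideas for #tag1, but not #tag10 or #tag1/sub."
print(rename_tag(note, "tag1", "tag2"))
# prints: Ideas for #tag2, but not #tag10 or #tag1/sub.
```

The lookarounds are the important part: a plain search/replace of `#tag1` would also rewrite `#tag10` and the nested tag `#tag1/sub`.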
To install, you must have Python v3.10 or above. Then run:
pip install obsidian-metadata
Beware, this is early release software. I am new to Python programming and I can’t guarantee that it will work as expected. I’ve tested it thoroughly on my own vault but that does not guarantee it will work as intended on yours. I strongly recommend you use it on a copy of your vault to avoid making any inadvertent changes to your data.
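If you want a quick way to make that copy, the following is a minimal sketch using only the Python standard library. The helper name `copy_vault` and the vault location are hypothetical; this is independent of the script's own backup feature and is not how obsidian-metadata creates its backups.

```python
import shutil
from datetime import datetime
from pathlib import Path


def copy_vault(vault: Path) -> Path:
    """Copy the vault to a timestamped sibling directory and return its path."""
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    target = vault.with_name(f"{vault.name}-copy-{stamp}")
    shutil.copytree(vault, target)  # raises FileExistsError if target already exists
    return target


if __name__ == "__main__":
    # Hypothetical location; replace with the path to your own vault.
    vault = Path.home() / "Documents" / "MyVault"
    if vault.is_dir():
        print(f"Work on this copy: {copy_vault(vault)}")
```

You can then run the tool against the printed path and compare the result with the original before touching your real notes.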
I have tested your script against a small demo vault and it worked as directed. I’m curious how it would handle a larger vault (1000+ notes) when parsing the metadata at load time.
Testing a larger vault would require editing the config file and changing the path to the vault. That led to the question: how difficult would it be to point the app at a different path at runtime, or to add the ability to manage multiple vault paths? That in turn led me to look at the source code, where I saw that you can do just that through the script’s options.
If you run obsidian-metadata --help you can see all the options available.
--vault-path PATH Path to Obsidian vault
--config-file PATH Specify a custom path to a configuration file
--dry-run -n Dry run - don't actually change anything
--log-file FILE Path to log file [default: /path/to/logs/obsidian_metadata.log]
--log-to-file Log to file
--verbose -v INTEGER Set verbosity level (0=WARN, 1=INFO, 2=DEBUG, 3=TRACE)
--version Print version and exit
--help Show this message and exit.
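Since `--vault-path` can be given at runtime, running the tool against several vaults can be scripted. Below is a minimal Python sketch that uses only flags from the help output above. It assumes `obsidian-metadata` is on your PATH; `build_command` and the vault locations are hypothetical, and I have not verified how the tool behaves when launched from another process.

```python
import subprocess
from pathlib import Path


def build_command(vault: Path, dry_run: bool = True) -> list[str]:
    """Build an obsidian-metadata invocation for one vault.

    Uses only flags listed in the --help output above.
    """
    cmd = ["obsidian-metadata", "--vault-path", str(vault)]
    if dry_run:
        cmd.append("--dry-run")
    return cmd


if __name__ == "__main__":
    # Hypothetical vault locations; replace with your own.
    vaults = [Path.home() / "vaults" / "demo", Path.home() / "vaults" / "work"]
    for vault in vaults:
        if vault.is_dir():
            # The child process inherits the terminal, so any prompts show as usual.
            subprocess.run(build_command(vault), check=True)
```

Defaulting to `--dry-run` is a deliberate choice here, in line with the advice to avoid inadvertent changes; pass `dry_run=False` once you trust the result.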
For me, the multiple-vaults capability is the icing on the cake. Your script is now integrated into my workflow. I look forward to slowly testing it on “real” data and will report any new findings.
@gapmiss Thanks for testing the script and posting your thoughts.
Adding support for multiple vaults should be relatively easy. I only have one vault, so it didn’t occur to me to add that functionality. Thanks for the suggestion. To your point, you can specify a different vault at runtime using the --vault-path option. That said, I will look into adding support for multiple vaults in the configuration file, which will make for a more extensible solution.
As to your question about working with larger vaults, I’ll need to do testing and get back to you. My own personal vault has ~500 notes in it. The script works fine at that level but I’ll spend some time and create a test vault with significantly more data and do whatever work needs to happen to ensure the script works as expected.
attempt to reverse back to array
action: replace metadata
key: tags value: eleventy
action: replace w/ [11ty]
expected: [eleventy]
results: '[eleventy]'
How do you handle yaml types? Specifically arrays. What type of markup does your script expect for an array? How does it handle strings, integers?
I know YAML has multiple formats for arrays and doesn’t require double quotes for strings. Is there a YAML markup standard that is used by your code that would answer these questions?
The YAML library I’m using to (re)write the frontmatter is ruamel.yaml. Currently I’m not too concerned about how it writes its arrays or quotes its strings, which may result in it changing the formatting used in a vault.
In its current configuration, this
---
key: [1, 2]
---
will become
---
key:
- 1
- 2
---
There are a number of Obsidian plugins, such as Obsidian Linter, which will reformat a user’s frontmatter. Consequently, I am not too focused on this.
Just released an update that allows adding frontmatter. There’s a roadmap issue I’m using to track progress towards v1.0. Many more features to come.