Scripts for auto-creating backlinked files and removing unlinked placeholders

I have a couple of scripts-in-progress that I figured I’d post here, since I’ve seen requests for this functionality.

The first creates placeholder files for any backlinks without one.

The second moves placeholder files without backlinks into an orphans folder.

In theory, the two together should allow Roam-ish “page appears when you link it, goes away if no real content and not linked” type functionality.

First, caveats.

Right now I know these work on Mac Catalina with homebrew bash, using Mac grep (which is different than Linux grep). I tried to avoid bash 4isms you won’t have available in Apple’s crippled 3.x bash, but haven’t looked for incompatibilities yet. The script probably works under zsh too if you change the shebang. Because this is Mac, everything is treated case-insensitive since the filenames are.
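One bash 4 feature the second script below does use is the `${var,,}` lowercase expansion, which bash 3.x doesn't have. A minimal sketch of a stand-in that should run on both, assuming a hypothetical helper named `to_lower` (it is not part of the scripts) built on plain `tr`:

```shell
# Hypothetical helper (not in the original scripts): lowercase a string
# with tr instead of the bash 4 ${var,,} expansion.
to_lower() {
    printf '%s\n' "$1" | tr '[:upper:]' '[:lower:]'
}

title="My Note"
echo "'$(to_lower "$title")'"   # lowercase and requote, as the script does
```

With something like that in place, `title="'${title,,}'"` could become `title="'$(to_lower "$title")'"`. It forks a `tr` per call, so it is slower than the built-in expansion.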

Nothing is parameterized yet, so you have to edit the scripts for top-of-file variables (this reflects that they’re actually intended for KM and Hazel triggered scripts where you can configure ad hoc in-edit-box). The code isn’t exactly polished and nothing is linted. I tried to put in bumpers around things that would ruin our notes, but can’t stress enough, these are work in progress and still rough.

I won’t explain much further how to use them because if you can’t work it out pretty quickly they’re not ready for you yet. Look at the top of file for options and the end-of-line comments for what it’s doing.

TL;DR: you’re brave if you use these at this early stage, but I did try to make sure they won’t break your stuff and they’ll give up rather than do so. No guarantees though.


Here’s make-obsidian-backlinks.bash

This creates a placeholder file for any links that don’t seem to have one.

The big thing to know about it is that it is not really compatible with relative links, at least not yet, and possibly never.

It will most definitely work best in the one-big-directory model, though it’ll basically work if you have a two-level top/subdir set and only link down.

If you have a link called [[bar]] it’ll look for bar.md anywhere in the Archive, and create it at Archive root if it’s not there.

If you have [[foo/bar]] it’ll look for a bar.md file under a foo directory anywhere in the Archive. If it doesn’t find it, it’ll create foo/bar.md from the Archive root. It currently will not create the foo directory, so don’t link to folders you don’t already have.
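If you'd rather have missing folders created for you, one possible approach (not what the script does today) is to run `mkdir -p` on the parent path before writing the file. A rough sketch, with `create_placeholder` as a hypothetical helper name and `ext` and `placeholder` falling back to the script's defaults:

```shell
# Hypothetical helper (not in the original script): create a placeholder
# note relative to the current directory, making parent folders as needed.
create_placeholder() {
    note="$1.${ext:-md}"
    mkdir -p "$(dirname "$note")" || return 1
    # H1 with the note's own name, then the placeholder tag
    printf '# %s\n%s\n' "${1##*/}" "${placeholder:-#placeholder}" > "$note"
}
```

Calling `create_placeholder "foo/bar"` from the Archive root would write `foo/bar.md`. The catch is that a typo in a folder name quietly creates a new folder, which is a fair reason to leave this out.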

If you have [[…/bar]] or [[…/bar/foo]] or anything with “…/” in it (note this is two dots, the forum shows it as three), the link is ignored by the script. Once you have parents in the path, the link could mean almost anything in the tree. In the next version I’ll do a liberal check for whether the note file exists anywhere in the tree, as if you [[bare]] linked it, but for now it’s skipped.

I really don’t recommend using relative paths or subfolders with it at all. However, I think the only risk is it creates a placeholder file at Archive root, and you have to move it into a subfolder yourself.
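To make the three cases above concrete, here is a small sketch of the decision made for each link. `classify_link` is a hypothetical name used only for illustration; the script does this inline in its loop.

```shell
# Hypothetical helper: report which of the three cases a link falls into.
classify_link() {
    case "$1" in
        *../*) echo "parent" ;;     # has ../ in it: skipped entirely
        */*)   echo "subfolder" ;;  # searched as dir/name, created under the Archive root
        *)     echo "bare" ;;       # searched anywhere, created at the Archive root
    esac
}

classify_link "bar"
classify_link "foo/bar"
classify_link "../bar"
```

The order of the patterns matters: the `../` check has to come first, because a parent link also contains a slash.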

#!/usr/bin/env bash
# Create empty notes for anything linked

archive=${OBSIDIAN_ARCHIVE:-$HOME/Dropbox/Sync/Obsidian/Muninn}
ext=md
placeholder="#placeholder"

cd "$archive" || exit 1 # stop rather than create files in the wrong place

# note this is one long pipeline
grep --include '*.'$ext --exclude-dir '.obsidian' -orh '\[\[[^]]*]]' . |  # get all links with brackets
grep -oh '[^][]\{1,\}' | # remove the brackets
sed 's/|.*//' | # remove the aliases (everything from the first | on)
sed 's/^ *//;s/ *$//' | # trim spaces
sort | uniq | # make unique
while read -r file; # for each expected file
do
    if [[ "$file" == *../* ]]  # if it contains a parent ../ link in it anywhere we can't process it
    then
        echo "'[[$file]]' has a parent directory in its path and will not be processed"
    elif ! find . -ipath "*/$file.$ext*" | grep . > /dev/null # if it's not in the notes tree
    then
        echo "$file.$ext does not exist and is being created"
        echo "# ${file##*/}" > "$file.$ext" # create with an H1 placeholder of its own name
        echo "$placeholder" >> "$file.$ext" # add any custom placeholder text, plus an eol
    fi
done

Here’s remove-obsidian-unlinked.bash:

This moves files it identifies as not linked anymore into an orphans folder in the archive root.

By default this will only move files that have #placeholder in them as a tag (per the other script), or which only have headers and blank lines (how I created placeholders manually). See the criteria near the top.
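The "only headers and blank lines" test boils down to asking grep whether any line starts with something other than `#`. A sketch of both criteria for a single file, assuming a hypothetical `is_placeholder` helper (the script applies the same two patterns across the whole tree with `grep -rl` and `grep -rL`):

```shell
# Hypothetical helper: succeed if the file looks like a placeholder note.
is_placeholder() {
    [ -f "$1" ] || return 1
    grep -q '#placeholder' "$1" && return 0   # tagged, as the first script does
    ! grep -q '^[^#]' "$1"                    # no line begins with anything but '#'
}
```

Note that a line with leading spaces, or one containing only spaces, counts as content under this pattern, so such a file is left alone.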

Like the other script, it’s pretty liberal about guessing whether something is linked, because path lookup is a bit magic and I wanted to err on the side of caution.

If the file is called bar.md, any link to a “bar.md” will cause the file to be left alone, even if it would have resolved to a different directory in the notes tree. For this script, the parent directories are completely ignored and it pretends all the titles are in a big bucket.

Again, this will work best in a one-directory or shallow setup, though this should actually be compatible with any tree. Like the other script, it’ll just skip more candidates if you have name collisions.
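As a rough illustration of the "big bucket" matching, the sketch below reduces a note path and each link to their final path component and compares them case-insensitively. `is_linked` and the sample names are hypothetical; the script does the equivalent with a temp file of lowercased titles.

```shell
# Hypothetical helper: succeed if any link (one per line on stdin) has the
# same final path component as the note in $1, ignoring case and folders.
is_linked() {
    title=${1##*/}        # drop parent folders
    title=${title%.md}    # drop the extension
    sed 's|.*/||' | grep -qixF -- "$title"
}

# A link to foo/Bar protects every bar.md, wherever it lives
if printf '%s\n' 'foo/Bar' 'baz' | is_linked './projects/bar.md'
then
    echo "left alone"
fi
```

This is why name collisions only ever cause fewer files to be moved, never more.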

#!/usr/bin/env bash
# Remove any note not linked

archive=${OBSIDIAN_ARCHIVE:-$HOME/Dropbox/Sync/Obsidian/Muninn}
orphaned=orphaned
ext=md
include_empty=true # set to empty for false
noisy= # set to any value for progress messages; leave empty for quiet

# Include all notes not in journal that are either empty or have the #placeholder tag
candidate_criteria="--exclude-dir 'journal' -e '#placeholder'"

# Include all notes not in journal in which every non-blank line starts with #
inverted_criteria="--exclude-dir 'journal' -e '^[^#]'"

status=0
linktmp=$(mktemp)

cd "$archive" || { rm -f "$linktmp"; exit 1; } # stop rather than scan the wrong tree

# this is one long pipeline
grep --exclude-dir '.obsidian' -orh '\[\[[^]]*]]' . |  # get all links with brackets
grep -oh '[^][]\{1,\}' | # remove the brackets
while read -r link
do
    link=${link%%|*} # remove the alias first, in case it contains a slash
    link=${link##*/} # remove any parent directories
    link=$(printf '%s\n' "$link" | sed 's/^ *//;s/ *$//') # trim
    echo "'${link,,}'" # make lowercase and requote
done |
sort | uniq > "$linktmp" # make unique

get_files() {
    # all note files, along with criteria above
    [ "$candidate_criteria" ] && eval "grep $candidate_criteria --exclude-dir '$orphaned' --exclude-dir '.obsidian' --include '*.$ext' -rl ."

    # all note files not meeting inverted criteria above
    [ "$inverted_criteria" ] && eval "grep $inverted_criteria --exclude-dir '$orphaned' --exclude-dir '.obsidian' --include '*.$ext' -rL ."

    # all empty-length files from clickthroughs with no content
    [ "$include_empty" ] && find . -iname '*.'$ext'*' -not -ipath './.obsidian/*' -not -ipath "./$orphaned/*" -size 0
}

# now, for each note fitting the candidate criteria; the list is fed in with
# process substitution so that changes to $status survive the loop
while read -r filename
do
    title="${filename%.$ext}" # remove extension
    title="${title##*/}" # remove any parent directories
    title="'${title,,}'" # make lowercase and requote
    [ "$noisy" ] && echo "Processing '$filename', reducing to title $title..."

    # if title is in the list of links (exact, literal match)
    if grep -qxF -- "$title" "$linktmp"
    then
        [ "$noisy" ] && echo "'$filename' is probably linked"
    else
        # can we move it?
        target="$archive/$orphaned/$(basename "$filename")"
        if [ ! -e "$target" ]
        then
            [ "$noisy" ] && echo
            echo "'$filename' is not linked. Moving into '$orphaned'..."
            mkdir -p "$archive/$orphaned"
            mv "$filename" "$target"
            [ "$noisy" ] && echo
        else
            status=1
            echo "'$filename' is not linked, but a file by that name is already in '$orphaned'"
        fi
    fi
done < <(get_files | sort | uniq)

# clean up
rm "$linktmp"

exit $status

Some early changes will be moving these away from being pure pipelines and doing something better than grepping a tmp file once per note (though caching should make that faster than one would expect). There’s some pretty hilarious code structure in there that came down to my trying to stick with vanilla bash and vanilla grep.
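One way to avoid a grep per note would be to load the link titles once and stream the candidates past them. This is only a sketch of that idea using awk, not the planned rewrite; `unlinked_titles` and both file arguments are hypothetical.

```shell
# Hypothetical helper: print the candidate titles that match no link title.
# $1 = file of link titles, $2 = file of candidate titles, one per line.
unlinked_titles() {
    awk '
        FILENAME == ARGV[1] { seen[tolower($0)] = 1; next }  # remember every link
        !(tolower($0) in seen)                                # print unmatched candidates
    ' "$1" "$2"
}
```

Comparing `FILENAME` against `ARGV[1]` rather than using the common `NR == FNR` idiom keeps this correct when the link file is empty. It still stays within vanilla tools, since awk ships with macOS.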

I’ll post updated versions as I polish them and test them under other launchers like KM, and will edit the post to note that they’ve changed. If you run into something horrible, post here so nobody else gets tripped up, but no promises of fixes from me, at least not on any particular schedule.

Please copy your notes tree before trying them.
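If you want a quick way to do that from the shell, something like the following would work. It is a sketch: `backup_archive` and the `.backup-<timestamp>` naming are placeholder choices, not anything the scripts provide, and it assumes you have room for a full copy next to the original.

```shell
# Hypothetical helper: copy the notes tree to a timestamped sibling folder
# and print where the copy went.
backup_archive() {
    src=${1%/}                                    # drop a trailing slash
    dest="$src.backup-$(date +%Y%m%d-%H%M%S)"
    cp -R "$src" "$dest" && printf '%s\n' "$dest"
}
```

For example, `backup_archive "$HOME/Dropbox/Sync/Obsidian/Muninn"` would leave a copy beside the archive. Keep in mind that a copy made inside a synced folder gets synced too.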

Hope these help someone out!

UPDATE [8/11 9:55P] added support for [[link|aliases]] and better handling for multi-word titles

UPDATE [8/12 11:58A] corrected instructions for first script–forgot I had added whole-tree path search. :roll_eyes:


One other caveat/bug I just realized:

These currently won’t work with link aliases. It’ll try to treat |alias as part of the backlink. I’ll make that correction tonight.

Edit: I’ve updated the scripts to include alias support and to trim [[ extra space ]] format jank that I’m not sure Obsidian even allows. I also fixed some bugs around multiword titles (and I’m sure have more to go, bash quoting is fun).
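For anyone curious what the alias and spacing cleanup amounts to, here is a minimal sketch. `normalize_link` is a hypothetical name; the scripts do these steps inline in their pipelines.

```shell
# Hypothetical helper: reduce the inside of a [[link]] to its target.
normalize_link() {
    link=${1%%|*}                                    # drop |alias and anything after it
    printf '%s\n' "$link" | sed 's/^ *//;s/ *$//'    # trim surrounding spaces
}

normalize_link ' multi word title | shown text '
```

Quoting `"$link"` is what keeps multi-word titles intact here; an unquoted expansion would be split on spaces and glob-expanded by the shell.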

But how do I apply this?


I’d love to see more of this, and a smoother way to apply it.
But I like the start.
