This command-line script searches the current folder and its subfolders for all Markdown files that contain a certain substring (a tag, in this example). It then concatenates the matching files into a single file outside the vault (in the user’s home directory, in this example).
It precedes each file’s content with the filename, without the extension and converted to title case, as a Markdown H1.
#!/bin/bash
tag="#book/ThinkingFastAndSlow"
# -F matches the tag as a literal string; -exec avoids running grep with no files
find . -type f -name "*.md" -exec grep -lF -- "$tag" {} + | while IFS= read -r file; do
  if [ "$file" != "$0" ]; then
    title=$(basename "$file" ".md" | awk '{for(i=1;i<=NF;i++) $i=toupper(substr($i,1,1))tolower(substr($i,2))}1')
    echo "# $title"
    cat "$file"
    printf "\n\n"
  fi
done > ~/concatenated_files.md
EDIT: I’ve converted the script into Ruby. It now generates a simple table of contents at the top of the file.
require 'pathname'
def title_case(title)
  title.split.map(&:capitalize).join(' ')
end
# Set the tag to search for
TAG = "#book/ThinkingFastAndSlow"
# Find all Markdown files in the current folder (not subfolders) that contain the tag
files = Dir.glob("./*.md").select do |file|
  File.read(file).include?(TAG)
end
# Sort files alphabetically
files.sort!
# Create the table of contents
toc = "# Table of contents\n\n"
files.each do |file|
  # Strip the .md extension (and the tag, if present) and turn underscores into spaces
  title = Pathname.new(file).basename(".md").to_s.sub(TAG, '').gsub('_', ' ')
  # Add the title to the table of contents
  toc += "- #{title_case(title)}\n"
end
# Concatenate the files into a single document
content = ''
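files.each do |file|
  text = File.read(file)
  # Remove frontmatter only when it opens the file (\A), not any later pair of --- rules
  text.sub!(/\A---\n.*?^---\n/m, '')
  # Remove the tag, matched as a plain string
  text.gsub!(TAG, '')
  # Replace aliased links [[Target|alias]] with the target text; the pattern cannot span two links
  text.gsub!(/\[\[([^\]|]+)\|[^\]]+\]\]/, '\1')
  # Strip the .md extension (and the tag, if present) and turn underscores into spaces
  title = Pathname.new(file).basename(".md").to_s.sub(TAG, '').gsub('_', ' ')
  # Add title as heading and file content to final document.
  # The blank line before --- keeps it a horizontal rule rather than a setext heading.
  content += "# #{title_case(title)}\n\n#{text.strip}\n\n---\n\n"
end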
# Combine the table of contents and document
output = "#{toc}\n---\n\n#{content}"
# Write the output to a file
File.write(File.expand_path('~/concatenated_files.md'), output)
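
To check the clean-up step without touching real notes, it can be pulled out into a small helper and run on a sample string. This is only a sketch: `clean_note` and the sample note are hypothetical names made up for illustration, and the patterns are a slightly stricter variant of the ones in the script (frontmatter is only removed at the very start of the text, and the alias pattern cannot span two links).

```ruby
# Hypothetical helper (not part of the script above): applies the three
# clean-up steps to one note's text and returns the cleaned copy.
def clean_note(text, tag)
  text = text.sub(/\A---\n.*?^---\n/m, '')      # drop leading frontmatter only
  text = text.gsub(tag, '')                      # drop the tag, matched literally
  text.gsub(/\[\[([^\]|]+)\|[^\]]+\]\]/, '\1')   # [[Target|alias]] -> Target
end

# Made-up sample note for illustration
sample = <<~NOTE
  ---
  aliases: [TFAS]
  ---
  #book/ThinkingFastAndSlow
  See [[System 1|fast thinking]] and [[System 2]].
NOTE

cleaned = clean_note(sample, "#book/ThinkingFastAndSlow")
puts cleaned
```

In this sample the aliased link is reduced to `System 1`, while the plain link `[[System 2]]` is left untouched. A note that has no frontmatter but contains `---` rules further down passes through unchanged, because the frontmatter pattern is anchored with `\A`.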