Counting & Grouping by Child Tags

Hello! I am new to Obsidian / coding and I’ve been trying to scour for a solution. I think I’m close, but I can’t seem to figure out this last step.

What I’m trying to do

I am using Obsidian as a dream journal and am trying to use dataview and tags to see patterns. I want to create a table that counts the number of Child Tags that are utilized throughout the journal. In this instance, I am trying to count the number of times specific characters appear. The issue I am having is that it is counting every combination of Child Tags as “1.”

“Character/Coworker” = 1
“Character/Coworker” and “Character/Nephew” = 1

When it SHOULD be:
“Character/Coworker” = 2
“Character/Nephew” = 1

Things I have tried

I have tried looking through the user guide and I found similar topics on grouping by regular tags, which works for tags without children, but am having difficulties with the child tags. I do use Tag Wrangler but would like to eventually incorporate this into data I can collect by the month in their respective categories that only pulls from the journal section.

The below is what I have as of now. Any assistance or insight would be appreciated!

    length( as Count 
FROM #Character AND "03. Dream Journal"
FLATTEN array(
    filter(file.etags, (t) => startswith(t, "#Character/")),
) as Character
GROUP BY Character

I’m a little unclear on what you really want, that is if you want to count every nested tag in your vault, or just the second level of the #Character/... tags, so let us present options for both! :slight_smile:

First a note on file.tags vs file.etags

If in a file you’ve defined: tags: simple, not/so/simple, it’ll return different stuff into file.tags and file.etags, and it’ll be as shown in this list:

  • file.etags : #simple, #not/so/simple
  • file.etags : #simple, #not/so/simple, #not/so, #not

The file.tags contains all the separate levels of a nested tags in addition to the full tag. However, if you look at the tags of an items that resembles the file.etags and only contains the full tags.

Also note that file.frontmatter.tags contains the raw value of the tags properties.

Example on differences in tag metadata

As an example I searched for the #f53315 tag in my test vault and got the following output:

Look especially on the third row for “None of the above” where there are defined three tags in the actual properties of that file, and there file.etags has those three entries, but file.tags is expanded into five entries.

Regarding the first file, none of the tags was defined in the properties, just in the body of the text.

Counting nested tags in your vault

Using the following query we’re singling out the nested tags of your vault, and counting any that appear more than once. And in case of a really large vault, it limits to the first 100 of those sorted by frequency.

LIST count
WHERE contains(file.etags, "/")
FLATTEN file.etags as tag
FLATTEN length(rows) as count
WHERE count > 1
SORT count desc, upper(key) asc

In this version I’m using file.etags as I wanted to count the actual tags used, and not those extra generated by file.tags, but feel free to try that option too.

Hopefully the query is more or less self-explanatory. Ask if there is something unclear about it.

Example run of this query

The top part of this query when run in my test vault:

Note especially how the sorting on upper(key) means that the #nested/tag and #NewYork is more logically placed (at least according to my logic)

Modifying your script to focus on the second level

When you do FLATTEN array( filter( ... )) as A you keep the filtered list as a list (or array). This means that if your filtering returns a, b, and another row returns just a, and you then group, these will get two different entries. The original purpose (I reckon) was to split a list into its separate entries, and the syntax above can be very useful in a lot of contexts, but in this context I believe you do want it split into the separate variants.

So given that you don’t have anything like #Character/Coworker/Fred, the following might be what you want:

    length(rows) as Count 
FROM #Character AND "03. Dream Journal"
  regexreplace( filter(file.etags, (t) => startswith(t, "#Character/")),
    "#Character/(\\w+)", "$1" )  as Character
GROUP BY Character

I’ve just removed the array( ... ) and done a slight reformatting. :slight_smile:


Your last solution is exactly what I wanted, ugh it was right there! Thanks so much for breaking it down!!

1 Like

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.