Dataview tags report

What I’m trying to do

I’m trying to create a report of sorts that ideally looks through all the folders and subfolders (or a specific folder / sub-folders) and returns a list of all the tags that were found, the count, and the files they were found in. End goal here is for me to see what tags are showing up the most in all my notes. If a tag is mentioned more, it’s more important.
In a perfect world I’d also be able to drill down on the note to that tag reference.

Things I have tried

I found this snippet online and it’s close but I’ve not been able to figure out how to modify it so it shows the tag, the number of times that tag is mentioned, and the files it’s mentioned in.

TABLE
	length(rows.file.name) AS numfiles,
	join(rows.file.link, ", ") AS files
FROM
	#docker OR #kubernetes
FLATTEN
	file.tags AS tag
GROUP BY
	tag

I’ve been trying to read up on Dataview syntax, but I’m still not clear on everything in even this little snippet. I’d appreciate any help on this!

This query should work:

```dataview
TABLE WITHOUT ID tag, length(rows) as "\#", slice(rows.file.link, 0, 5) as "Appearances"
FROM "<INSERT FOLDER PATH>"
FLATTEN file.etags as tag
GROUP BY tag
SORT length(rows) DESC
```

Key points to understand:

  1. In Dataview, the basic unit is a note. By default, a query produces one record per note. The FLATTEN clause expands the record set so that there’s one record for each tag-note pair. (E.g., suppose we have two notes. Note 1 has 3 tags: “A”, “B”, “C”. Note 2 has 4 tags: “Red”, “Orange”, “Yellow”, “Green”. After the flatten, the record set has 7 records: 3 from Note 1, “1-A”, “1-B”, “1-C”; and 4 from Note 2, “2-Red”, “2-Orange”, “2-Yellow”, “2-Green”.)
  2. This query uses file.etags rather than file.tags. file.etags returns only the explicit tags, i.e. the tags exactly as written in the note. In contrast, file.tags also includes every parent level of a nested tag. (E.g., if a note contains the tag “First/Middle/Last”, then file.tags will include 3 entries: “First”, “First/Middle”, and “First/Middle/Last”.)
  3. We group by tag. In Dataview, you can access the records belonging to each group through the rows context variable. length(rows) gives the number of notes in each group (tag).
  4. The rows variable also supports “swizzling”, a shorthand for reading the same field from every item in a record array at once. We use this to get all the file links in the group (rows.file.link), then use slice() to keep only the first 5 items (indexing starts at 0, and the end index is exclusive).
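
If it helps to see the flatten and group steps outside of Dataview, here is a rough plain-JavaScript sketch of the same idea. It is only an illustration and not how Dataview is implemented internally. The function names (flattenByTag, groupByTag) and the extra “Note 3” are made up for the example; Note 1 and Note 2 are the two notes from point 1.

```javascript
// The two notes from the example in point 1.
const exampleNotes = [
  { name: "Note 1", tags: ["A", "B", "C"] },
  { name: "Note 2", tags: ["Red", "Orange", "Yellow", "Green"] },
];

// Hypothetical third note, added only so that some groups hold more than one row.
const extraNote = { name: "Note 3", tags: ["A", "Red"] };

// Roughly what FLATTEN does: one record per (note, tag) pair.
function flattenByTag(notes) {
  return notes.flatMap((note) =>
    note.tags.map((tag) => ({ name: note.name, tag }))
  );
}

// Roughly what GROUP BY tag does: records sharing a tag end up in one `rows` array.
function groupByTag(records) {
  const groups = new Map();
  for (const record of records) {
    if (!groups.has(record.tag)) groups.set(record.tag, []);
    groups.get(record.tag).push(record);
  }
  return [...groups].map(([tag, rows]) => ({ tag, rows }));
}

console.log(flattenByTag(exampleNotes).length); // 7, as in point 1

const report = groupByTag(flattenByTag([...exampleNotes, extraNote]))
  .sort((a, b) => b.rows.length - a.rows.length) // like SORT length(rows) DESC
  .map(({ tag, rows }) => ({
    tag,
    count: rows.length, // like length(rows)
    // Reading one field from every row is what swizzling (rows.file.link) gives you;
    // slice(0, 5) keeps at most the first 5 entries.
    appearances: rows.map((row) => row.name).slice(0, 5),
  }));

console.log(report[0]); // { tag: 'A', count: 2, appearances: [ 'Note 1', 'Note 3' ] }
```

Note that each note contributes at most one row per tag here, which mirrors how the query counts notes rather than individual tag occurrences.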

Hope this helps!

Thanks for the query and the detailed explanation!
For some reason, it doesn’t seem to be counting the tags within a note. It does list all the tags used in the note, but the count for each of them is 1 even though several tags show up multiple times. I’m not sure if it’s only finding the first instance. Any ideas? The main goal is to sort the most-used tags to the top so they get more attention.

Yes, file.etags (and file.tags) only return the unique tags mentioned in a note. To my knowledge, there’s no way to count every occurrence of a tag within a note using basic DQL in Dataview. However, this should be possible with dataviewjs: using the Obsidian API, you can get all tag occurrences within a note with app.metadataCache.getFileCache(<FILE>).tags.
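
Something along these lines might work as a starting point. Treat it as an untested sketch rather than a finished solution: countTagOccurrences and the shape of its input are my own invention, the folder string is the same placeholder as in the query above, and it assumes each entry in .tags has a tag property holding the tag text. As far as I know, that list only covers tags written in the note body, so tags declared in frontmatter would not be counted.

```javascript
// Hypothetical helper: totals every tag occurrence across files.
// Input shape (an assumption for this sketch): [{ path, tags: ["#a", "#a", "#b"] }, ...]
function countTagOccurrences(fileTagLists) {
  const totals = new Map(); // tag -> { count, paths }
  for (const { path, tags } of fileTagLists) {
    for (const tag of tags) {
      const entry = totals.get(tag) ?? { count: 0, paths: new Set() };
      entry.count += 1; // counts repeats within the same note
      entry.paths.add(path); // each file is listed once per tag
      totals.set(tag, entry);
    }
  }
  return [...totals]
    .map(([tag, { count, paths }]) => ({ tag, count, paths: [...paths] }))
    .sort((a, b) => b.count - a.count); // most-used tags first
}

// Wiring for a dataviewjs block; skipped when the code runs outside Obsidian.
if (typeof dv !== "undefined" && typeof app !== "undefined") {
  const folder = "<INSERT FOLDER PATH>"; // placeholder, same as in the DQL query
  const fileTagLists = app.vault
    .getMarkdownFiles()
    .filter((file) => file.path.startsWith(folder))
    .map((file) => ({
      path: file.path,
      // One entry per tag occurrence; the list is missing when a note has no tags.
      tags: (app.metadataCache.getFileCache(file)?.tags ?? []).map((t) => t.tag),
    }));

  dv.table(
    ["Tag", "Occurrences", "Files"],
    countTagOccurrences(fileTagLists).map((row) => [
      row.tag,
      row.count,
      row.paths.map((path) => dv.fileLink(path)),
    ])
  );
}
```

Pasted into a dataviewjs code block, this should render a table sorted by total occurrences, with links to the files where each tag appears.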