Sorting a table in DataView

Alethe · October 8, 2023, 8:17pm

Hi.

I’m a complete newbie to Obsidian, although I’ve been working on a Zettelkasten for a few years now. In comparison, Obsidian is opening up a whole universe of possibilities: simple text files analyzed by frightfully powerful features are an ideal combination. No more will my thinking be a prisoner of inscrutable data structures, as it was with other software.

For now, I’d like to exploit the metadata analysis features of DataView, starting with a very simple query: I have lots of article quotes, in their own subfolder, all tagged with “#Article”. Another metadata field, “Source”, identifies the newspaper or news outlet the article is from.

YAML sample entry, with irrelevant data omitted:

```yaml
---
Created: <date>
Modified: <date>
title: <title>
tags: [<many other tags>, "#Article"]
Source: Washington Post
---
```

I’d like to create a table of news sources, sorted by decreasing number of occurrences, just to have an idea of my news diet. Sounds despairingly simple, but I’m still stumbling.

Here’s my DataView code so far:

```dataview
TABLE length(rows.Source) AS "Count"
FROM #Article
SORT Count DESC
GROUP BY Source
```

The table output does list the sources and their number of occurrences correctly, but in alphabetical order by source, not in descending order of count as I expected.

What am I doing wrong? I feel like an utter imbecile…

All plugins are up to date.

Thank you in advance for your help!

Alethe

ush · October 8, 2023, 9:36pm

Try:

```dataview
TABLE length(rows) AS "Count"
FROM #Article
GROUP BY Source
SORT length(rows) DESC
```

Here’s some additional explanation based on my current understanding (correct me if I’m saying something wrong!):

- According to the documentation, rows contains all of the pages that belong to each group. So you don’t need to append .Source to get the size of the group.
- With length(rows) AS "Count", you re-label the length(rows) column as “Count”. But the scope of this new label “Count” is limited to the first line of the query (if I understand correctly), so you can’t reuse the “Count” label on another line. This is why the query uses SORT length(rows) DESC rather than SORT Count DESC.
- A DQL query is executed from top to bottom, line by line. So if you want to sort the groups that you made in the GROUP BY ... line, the SORT command must appear after the GROUP BY line.
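To make the ordering point concrete, here is a rough Python model of the two queries. It is only a sketch, not how Dataview is actually implemented: the sample sources are made up, and it assumes that groups come out ordered by key (which matches the alphabetical output described in the question) and that sorting on a field the pages don't have leaves their order unchanged.

```python
from collections import defaultdict

# Hypothetical sample data standing in for notes tagged #Article.
pages = [
    {"title": "a", "Source": "Washington Post"},
    {"title": "b", "Source": "Le Monde"},
    {"title": "c", "Source": "Washington Post"},
    {"title": "d", "Source": "The Guardian"},
    {"title": "e", "Source": "Washington Post"},
    {"title": "f", "Source": "Le Monde"},
]

def group_by_source(pages):
    """Model of GROUP BY Source: one entry per key, each carrying its rows.

    Assumption: groups come out ordered by key.
    """
    groups = defaultdict(list)
    for page in pages:
        groups[page["Source"]].append(page)
    return [{"key": key, "rows": groups[key]} for key in sorted(groups)]

# Original query: SORT Count DESC runs first, on individual pages.
# Pages have no Count field at that point, so (in this model) nothing moves.
sorted_pages = sorted(pages, key=lambda p: p.get("Count", 0), reverse=True)
sort_then_group = [(g["key"], len(g["rows"])) for g in group_by_source(sorted_pages)]

# Fixed query: GROUP BY first, then SORT length(rows) DESC on the groups.
groups = group_by_source(pages)
groups.sort(key=lambda g: len(g["rows"]), reverse=True)
group_then_sort = [(g["key"], len(g["rows"])) for g in groups]

print(sort_then_group)   # alphabetical by source
print(group_then_sort)   # largest group first
```

In this model, sorting before grouping is simply lost once the groups are rebuilt, while sorting after grouping acts on the groups themselves.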

Alethe · October 8, 2023, 9:55pm

That works like a charm.

Many, many thanks, ush, for your help, and especially your explanations. I understand DataView better now!

holroy · October 27, 2023, 6:21pm

The point about the scope of the “Count” label is correct when you define the label on the first line, that is, the column definition line. However, you can often define it in a FLATTEN statement instead, and then you are allowed to use it in other places in your query. Rewriting your query using this knowledge, we arrive at:

```dataview
TABLE Count
FROM #Article
GROUP BY Source
FLATTEN length(rows) as Count
SORT Count DESC
```

A final comment on “executed from top to bottom”: it’s mostly correct, but the first line with TABLE ..., LIST ..., and so on, knows what happens throughout the query, so in my mind I think of it as the last line to be executed. This makes it legal to do what I did above: by the time you come to the end of the query, it knows about Count thanks to the FLATTEN command slightly above…
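As a mental model only, the FLATTEN version of the query can be pictured as a small pipeline in Python. The function names and sample groups below are hypothetical and are not Dataview's internals; the sketch just shows a field being attached in one step, used by a later step, and displayed by the TABLE line applied last.

```python
# Hypothetical grouped result: roughly what GROUP BY Source hands to the next step.
groups = [
    {"key": "Le Monde", "rows": ["b", "f"]},
    {"key": "The Guardian", "rows": ["d"]},
    {"key": "Washington Post", "rows": ["a", "c", "e"]},
]

def flatten_count(groups):
    """Model of FLATTEN length(rows) AS Count: attach a new field to each row."""
    return [{**g, "Count": len(g["rows"])} for g in groups]

def sort_desc(groups, field):
    """Model of SORT <field> DESC; the field comes from an earlier step."""
    return sorted(groups, key=lambda g: g[field], reverse=True)

def table(groups, *columns):
    """Model of the TABLE line, applied last: pick the columns to display."""
    return [(g["key"], *(g[c] for c in columns)) for g in groups]

# GROUP BY -> FLATTEN -> SORT -> TABLE
result = table(sort_desc(flatten_count(groups), "Count"), "Count")
print(result)
```

Because Count is attached to each group before the sort and the final projection, both later steps can refer to it by name.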
