"Group by" Woes

There’s a lot going on here, and mastering it will require you to learn some of the ways in which Dataview works differently than a traditional database. In particular, it’s worth studying how GROUP BY and FLATTEN work. The docs provide some help, and there are some good posts on the forum that go into detail.

For the following examples I used the same input data:

We can visualize this data with a simple table:

```dataview
TABLE Name, Occupation, Born
```

One complication is that we have a field, Occupation, that can contain an array of values. It would be nice to be able to query the rows with the occupations split out so they’re easier to group. For this we can use FLATTEN.

FLATTEN is a way to create new columns based on existing ones. A key detail is that if the column being flattened is an array, then Dataview will create a new row for each value of the array. In this example, since a person can have more than one occupation, this creates a new row for each person’s occupation.

```dataview
TABLE Name, Occupation, Born
FLATTEN Occupation
```

Now it’s easier to group the rows by occupation, using the GROUP BY command.

However, unlike in regular relational SQL, GROUP BY does not change the layout of the table, but instead replaces the results with a new set of rows, one for each value of the group, each of which contains a single field, “rows”, which is an array of all the original rows in that group.

```dataview
TABLE rows.Name, rows.Occupation, rows.Born
FLATTEN Occupation
GROUP BY Occupation
```

This works, but may not be quite what we’re looking for, and comes with some other complications with regards to sorting each column.

One approach is to not use GROUP BY, but just to put the field in the first column and sort by it:

```dataview
TABLE WITHOUT ID
	Occupation,
	file.link as "Person",
	Born
FLATTEN Occupation
SORT Occupation, Name
```

The approach I usually use is to use a LIST instead of a TABLE, which organizes the data as an outline:

```dataview
LIST rows.Summary
FLATTEN Occupation
FLATTEN file.link + " (born " + Born + ")" as Summary
GROUP BY Occupation
```

I hope this is helpful as a starting point. Good luck!

18 Likes