Dataview query for all tasks in folder with custom status

I’ve been reading the Dataview docs and reading the forum, but am now stuck and would appreciate some help/sample code for the next step.

What I’m trying to do

Ultimately, I want to be able to create a table with 1 line per folder, and progress bars representing the percentage of tasks completed in all notes in each folder. I’ve broken this down into being able to query tasks across notes in a folder, and querying for custom statuses (since I use a custom status for complete with the Tasks plugin which is DONE but not represented with [x] ).

Things I have tried

Looking at some posts on the forum, I came across this post which was very useful. However, in trying to adapt this to my situation, I can’t seem how to resolve the two issues mentioned above. In particular, the docs for task metadata in dataview seems to not allow for telling whether a task is complete with a custom status or not.

Also, in trying to adapt the sample code from the linked forum post, the output table has an entry for each different file in the folder following the WHERE keyword. Is there a way to condense this down to just one entry per folder to get a folder-level progress bar?

Thanks!

Update: I think I figured out how to handle the custom status problem. As a proof of concept, I modified the linked code from the original forum post, checking for status = “A” since that’s my custom status for success.

I thought then it might be as simple as adding a “GROUP BY” at the end to get folder-level progress bars, but that doesn’t work. Here’s the modified code attempt:

TABLE file.tasks.text, length(file.tasks.text) as Total, file.tasks.completed, filter(file.tasks.status, (t) => t = "A") as C, length(filter(file.tasks.status, (t) => t = "A")) AS Completed, (length(filter(file.tasks.status, (t) => t = "A")) / length(file.tasks.text)) * 100 AS BB, "<progress value='" + (length(filter(file.tasks.status, (t) => t = "A")) / length(file.tasks.text)) * 100 + "' max='100'></progress>" AS Progress
FROM "Test"
WHERE file.tasks
GROUP BY file.folder

Any ideas on getting the grouping to work properly? The query returns successfully, but now the progress bars are empty and no tasks are being counted as completed.

1 Like

Whenever you do a “GROUP BY” all the stuff you previous referred to as someField are now lists within rows.someField. And when someField is a list, this gets complicated in combination with stuff like filter and similar functions. Some of this issues can be countered by doing sum(rows.someField), but it’s a lot of trial and error to verify the results in either case.

I’ve compiled two variants for your request, one which FLATTEN all the tasks before grouping and calculating the various results, and one which is a little more similar to your original case without FLATTEN on the tasks.

Further more the two variants as slight differences in how to calculate the completed and uncompleted count, based on how you want to group queries.

Using FLATTEN on tasks

The FLATTEN query
```dataview
TABLE Count, Open as "` `", InProgress as "/", Complete as "x", Canceled as "`-`", UncompleteCount as UnC, CompleteCount as C, 
  join(rows.item.status, "|") as Statuses, 
  "<progress value='" + (UncompleteCount * 100 ) / Count + "' max='100' />" as "Uncompleted Progress",
  "<progress value='" + (CompleteCount * 100)    / Count + "' max='100' />" as "Completed Progress" 
FROM "ForumStuff/f4x"
FLATTEN file.lists as item
WHERE item.task and contains(list(" ", "/", "-", "x"), item.status)
GROUP BY file.folder
FLATTEN length(rows.item.text) as Count
FLATTEN length(filter(rows.item.status, (s) => s = " ")) as Open
FLATTEN length(filter(rows.item.status, (s) => s = "/")) as InProgress
FLATTEN length(filter(rows.item.status, (s) => s = "x")) as Complete
FLATTEN length(filter(rows.item.status, (s) => s = "-")) as Canceled
FLATTEN Open + InProgress + Canceled as UncompleteCount
FLATTEN Complete as CompleteCount
```

Output using FROM "ForumStuff/f4x" in my test vault:

The columns signify the count of the various statuses, named in the column header. InC and C indicate the Incompleted Count and Complete Count. Also note, that in this query I limited to only the “normal” task statuses: space, /, - and x.

In this query I count the various statuses up into their own variable, and then I sum the various statuses into another variable which is used to produce the progress bar. This is very easy to use and change into various groupings depending on whether you want to include canceled tasks into completed or uncompleted task count.

Notice that in both cases I’ve joined up the various statuses into a string separated by |, so if there are many spaces, there are also many |'s.

Not using FLATTEN on tasks

The non FLATTEN query

It’s kind of incorrect to say non-flatten query, as I’m still using FLATTEN to gather up stuff into variables, but it doesn’t flatten the file.tasks before grouping, similar to your original query.

```dataview
TABLE 
  Total, Incomplete, Complete,  
  join(sum(rows.file.tasks.status), "|") as Statuses, 
  "<progress value='" + ( Incomplete * 100) / Total  + "' max='100'></progress>" AS "Incomplete Progress",
  "<progress value='" + ( Complete * 100)/ Total + "' max='100'></progress>" AS "Complete Progress"
FROM "ForumStuff/f4x"
WHERE file.tasks 
GROUP BY file.folder
FLATTEN length(sum(rows.file.tasks.status)) as Total
FLATTEN length(filter(sum(rows.file.tasks.status), (t) => t = " " or t = "/")) as Incomplete
FLATTEN length(filter(sum(rows.file.tasks.status), (t) => t = "A" or t = "x")) as Complete
```

Which for the same folder in my test vaults now outputs:

In this query I selected just a few statuses to count, and used or to choose multiple of them. Could/should probably used contains(list(...), t) similar to how I selected which to focus on in the first query. Leaving that as an exercise for the reader… :smiley:

So in this query if the status is space or / it’s counted as incomplete, and if it’s x or A it’s counted as complete. One can also see that the status column is more diverse than in the first query since I opened up the query somewhat. (And also that the total doesn’t really match, as some of the tasks in my test set isn’t just the normal status variants)


In summary, the trick to getting this to work is to use FLATTEN ... as someVariable to collate calculated values, and that when using GROUP BY all the previously fields are now gathered under the rows.SOMETHING lists. And finally, that since we are grouping on files, we also need to do sum(rows...) in some cases to loose a level in the lists introduces by the GROUP BY.

I’ve not added a ton of explanation to these queries, but feel free to ask if there is something in particular being especially unclear.

Wow, this is awesome, thanks! I tested it out (with some small edits) and it seems to work perfectly.

I just had a quick question: I can see the columns in each of the two examples are a little different, but what makes one the “FLATTEN example” and the other the “non-FLATTEN example” given they both use FLATTEN commands? From what I can tell in the outputs, they seem to be computing similar things.

Also, one bit of syntax that’s a little confusing: in the second example, you have

FLATTEN length(filter(sum(rows.file.tasks.status), (t) => t = " " or t = "/")) as Incomplete

Why is the filter being applied after summing over the rows? In my programming experience, this doesn’t seem to make sense. It seems like sum is actually returning a list of all statuses across all relevant files?

The naming of my queries, which was somewhat bad, refers to the fact that the “FLATTEN example” does a FLATTEN file.tasks, whilst the second doesn’t. And they should function almost identical, but they work in slightly different matters when it comes to how handle a single task, or counting stuff.

One of these differences comes related to having to resort some trickery like the sum(...) in the second part. I’ll try to explain it with the following example of having three files with two tasks in each of the file.

Using the default theme, and these queries:

```dataview
TABLE file.tasks.text
WHERE file.folder = this.file.folder
  AND startswith(file.name, "File")
```

```dataview
TABLE rows.file.name, rows.file.tasks.text
WHERE file.folder = this.file.folder
  AND startswith(file.name, "File")
GROUP BY true
```

```dataview
TABLE rows.file.name, sum(rows.file.tasks.text)
WHERE file.folder = this.file.folder
  AND startswith(file.name, "File")
GROUP BY true
```

We now get this output:

The first query, lists the two tasks related to each file within the same row (since we’ve not flattened the file.tasks). When we then use GROUP BY true (or similar) it’ll add these lists from each file into another list, which we can see by the double dots in the second query output.

The sum function is used to reduce this list of lists, back into a single list of tasks, as shown in the third query. It’s the same happening in your query with the exception that there is so much other also going on. But the sum() is used to reduce the list of lists into a single lists, which is then easier to do the filter operation on, as we don’t need to reach into another list status (which is just painful to do).

Lastly, as can be seen in the previous query some trickery was needed to achieve that single lists of tasks when not flattening the file.tasks. If we do use the FLATTEN file.tasks on the other hand, we get the following queries:

```dataview
TABLE file.name, task.text
FLATTEN file.tasks as task
WHERE file.folder = this.file.folder
  AND startswith(file.name, "File")
```

```dataview
TABLE rows.file.name, rows.task.text
FLATTEN file.tasks as task
WHERE file.folder = this.file.folder
  AND startswith(file.name, "File")
GROUP BY true
```

Which outputs as

And here we see that in the first query, without grouping the result, we get just one task per row, but the file.name is doubled up in the two rows, since we’ve got two tasks per file.

When grouping this start list, we get the rows.task.text at the wanted one-level list, easily accessible for filter and company.


In summary, in most cases I’ll use the FLATTEN on stuff like file.lists or file.tasks to make it easier to work with later on. If there are some reasoning demanding that I can’t FLATTEN the list in the first place, then I’ll use sum(...) to remove one level of the list before continuing to work with that list.

I presented both version to you, to give you the choice of either alternative, but in most cases I would use the FLATTEN version, until it’s proven I can’t use it due to other demands of the surrounding queries.

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.