Group by not working with a calculated column in Dataview

SSalinasC · February 21, 2023, 11:59pm

Hi guys… I have a note with some data in frontmatter that I want to query. My yaml look like this:

item: Test
registros:

tarea:
fecha: 2023-02-10
horas: 3
tarea:
fecha: 2023-02-11
horas: 5
tarea:
fecha: 2023-02-12
horas: 2

I want to summarize the number of hours spent in the task, grouped by week or month with a very simple query.

TABLE WITHOUT ID 
dateformat(registros.tarea.fecha, "MM") AS "mes",
sum(registros.tarea.horas) AS "total" 
FROM "2023-02-20"

If I query my data without grouping, the results are as expected.

But if I try to group or sort I can’t see the results.

TABLE WITHOUT ID 
dateformat(registros.tarea.fecha, "MM") AS "mes",
sum(registros.tarea.horas) AS "total" 
FROM "2023-02-20"
GROUP BY "mes"

Please, any advice what I’m doing wrong ? Also, why my calculated column name always display “(1)” after name ?

Thanks in advance for your help.
Regards.

Dovos · February 22, 2023, 12:13am

you need to add rows. before the field inside the header after you group.
the small (1) is showing how many rows your table has. it is quite useful to have it with bigger tables (especially if you lag it out by trying to render out 1k+ rows and are wondering why it doesn’t load. (the number already shows before the content inside the table is rendered))

TABLE
sum(rows.registros.tarea.horas) AS "total" 
FROM "2023-02-20"
GROUP BY dateformat(rows.registros.tarea.fecha, "MM") AS "mes"

this code might work. if not then you have to go with a query with flatten of the registros field before you can group it.

holroy · February 22, 2023, 2:45am

As Dovos said, the “(1)” indicates that you’ve only got 1 row in your table. You only got one row since the “mes” column is a list of three values within that one row. To separate that into multiple rows you need to use FLATTEN in some variant. Most likely on the “registreros” field which, if I understood it correctly is the list of tasks.

And FLATTEN is also the solution to why you can’t group or sort. The thing is that in your query the "mes" is nothing more than an alias for the computed column. It’s not an actual column, or known to the rest of the query, if that make sense.

If on the other you had used FLATTEN dateformat(registros.tarea.fecha, "MM") AS Mes at the end of you query, it would be known by the other parts of your query like grouping, sorting and the where clause.

Since your original post is a little misformatted, it’s hard to present a good alternative query, but I tried building a similar field structure, and some queries which kind of evolve into something useful (hopefully).

---
Tags: f54919
registros:
- tarea: nombre de la tarea
  fecha: 2023-02-10
  horas: 3
- tarea: 10th-2
  fecha: 2023-02-10
  horas: 5
- tarea: 10th-3
  fecha: 2023-02-10
  horas: 2
- tarea: 11th-1
  fecha: 2023-02-11
  horas: 1
- tarea: 11th-2
  fecha: 2023-02-11
  horas: 3
- tarea: Enero
  fecha: 2023-01-01
  horas: 17
---
questionUrl:: http://forum.obsidian.md/t//54919

## FLATTEN registros as R

```dataview
TABLE WITHOUT ID R.tarea, R.fecha, R.horas
FLATTEN registros as R
WHERE file.name = this.file.name
```

## Group by R.fecha  - (date)
```dataview
TABLE length(rows.R.fecha) as "Recuento de tareas", sum(rows.R.horas) as Sum
FLATTEN registros as R
WHERE file.name = this.file.name
GROUP BY R.fecha as Fecha
```

## Group by Mes  - (month)
```dataview
TABLE length(rows.R.fecha) as "Recuento de tareas", sum(rows.R.horas) as Sum
FLATTEN registros as R
FLATTEN dateformat(R.fecha, "MM") AS Mes
WHERE file.name = this.file.name AND Mes = "02"
GROUP BY Mes
```

Put all of that into a file of its own, see if you can make sense of the various queries, and evolution of the queries from just listing the values into doing some calculation on the queries.

Two things to remember when looking at these queries

When you’re using just TABLE, it either picks the first column from the file name/link, or from the GROUP BY statement
When you do a GROUP BY, all the matching rows which collated, is collated into the rows object, so you need to use that instead of just the simple variable

And finally, I’ve used the WHERE file.name = this.file.name to limit the query to only this particular note. When you’re ready, you should open it up to test with other files/folders/…

Bonus tip: How to present code properly in a forum post

If you want to showcase either markdown, or code blocks, or dataview queries properly in a forum post, be sure to add one line before and one life after what you want to present with four backticks, ````. This will ensure that any other backticks (like for code blocks) is properly shown.

Disclaimer: I don’t speak spanish, so I’m hoping I’ve not butchered the language too much, when using google translate to suggest some column headings and so on

SSalinasC · February 22, 2023, 8:39pm

Thank you guys… I really appreciate your help, it was very helpful in clearing my doubts. Also my apologies, this was my first post to the forum so I wasn’t aware of the correct format to enter it.

Thanks again and regards from Chile.
Sergio

SSalinasC · February 23, 2023, 3:58am

Hi guys… I want to create a folder with notes, each one for a task where I record the time I spend on it each day, in the format of the previous example.

I tried to move my query to a parent note to show a summary of the tasks in the folder, grouped and filtered by day, week or month. For example:

TABLE dateformat(rows.R.fecha, "MM") AS Mes, sum(rows.R.horas) as Sum
FLATTEN registros as R
FLATTEN dateformat(rows.R.fecha, "MM") AS Mes
FROM "Notas"
WHERE Mes = "02"
GROUP BY Mes

When I run the query, I get the following error: "Dataview: Unrecognized query operation ‘folder’ ". Any idea how to correct my error ?

Thanks again.

anon63144152 · February 23, 2023, 6:42am

Does this work?

```dataview
TABLE 
	dateformat(rows.R.fecha, "MM") AS Mes, 
	sum(rows.R.horas) as Sum
FROM 
	"Notas"
FLATTEN 
	registros as R
FLATTEN 
	dateformat(rows.R.fecha, "MM") AS Mes
WHERE 
	Mes = "02"
GROUP BY 
	Mes
```

SSalinasC · February 23, 2023, 12:34pm

No… When I execute the query I recceive the error I posted before.

Regards.

anon63144152 · February 23, 2023, 12:46pm

For me, your original query creates an error, but the revised version I suggested does work locally in my test vault, even though the query doesn’t have any data to populate with. Hope someone can sort this out for you.

Saludos.

holroy · February 23, 2023, 1:49pm

The solution provided by @anon63144152 should work, since that message is related to the FROM line being in the wrong place. It should always be right after the line with TABLE & co.

Maybe you need to restart Obsidian, close and reopen the file or something like that to trigger the script to be rerun. The suggestion provided should work.

SSalinasC · February 23, 2023, 6:24pm

Thank you guys… you are right, the solution provided by @anon63144152 works like a charm. The error was due to a problem in the yaml with the data.

---
descripcion: Task 1
registros:
  - fecha: 2023-02-10
    horas: 3
  - fecha: 2023-02-10
    horas: 5
---

One last question, assuming a yaml like previous (per note), if I want to display the name of the task and then total the hours per day for that task, what is the correct way to group it to get an output like this:

Task 1    2023-02-10    10
          2023-02-12    11
Task 2    2023-02-10    05

Again, thank you very much for your help.
Regards.

anon63144152 · February 23, 2023, 6:49pm

I wish I knew the answer to that. Hope that @holroy or someone else can help.

holroy · February 23, 2023, 7:04pm

If you look at your previous query asking for the total for each month, and you now want the total for each description (or file), what do you think you need to change? And what result do you then get?

SSalinasC · February 23, 2023, 8:07pm

hi @holroy… Those were examples to understand how a query of this type is built.

As I mentioned, my idea is to create a note per day with a table that query all the tasks worked on that day, the number of hours spent on each of them and with a link to the task note if I need to go to the activity detail.

To be honest, I’m looking for the best alternative to implement that task and time log. Perhaps the way I just described is not the best, so other ideas are welcome, of course.

Regards.

holroy · February 23, 2023, 8:48pm

Well, I’ve described how to get to the next step, so please try that and see what happens.

SSalinasC · February 24, 2023, 8:39pm

Hi @holroy… first of all thank you for your help, I really apreciate it.
I think I’m a little bit closer to what I want to achieve.

TABLE 
    rows.file.link AS Doc, 
    sum(rows.R.horas) AS Sum
FROM "Notas/Test"
FLATTEN registros AS R
FLATTEN R.fecha AS Fecha
GROUP BY Fecha

With this query I get the result I’m showing above. For the “2023-02-10” day, sum is correct but I’m not be able to separate the results per task. Any advice please ?

Thanks in advance.

holroy · February 24, 2023, 9:01pm

You might need to either change into grouping by descripcion, or possibly to flatten something differently.

What’s the content of the QTest and QTest1 file?

The way to work with stuff like this is to try first to get the full table with the correct values (but ungrouped and in separate rows), so maybe something like:

```dataview
TABLE WITHOUT ID
  file.link, R.Fecha, R.descripcion, R.Horas
FROM "Notas/Test"
FLATTEN registros as R
```

Then when looking at that, and if it seems correct, with one line pr item, you can go ahead and start grouping on the various criterias, and sum up the variants.

And do remember that each time you add a GROUP BY you’ll get another level of rows to work with. And this is where stuff is getting hairy, so whilst working with it, keep the variant above as one query. Add another query where you add the GROUP BY and add columns like rows.<previous column name> (or just rows and see a somewhat larger table).

And then finally, when the second table is almost correct, you can add the final layer of GROUP BY and repeat the process of finding the correct names to use. The goal is for each query to extend the syntax, and get the correct information in that query, step by step.

SSalinasC · February 24, 2023, 10:04pm

QTest and QTest1 are two sample notes that I used to test my query. Both have the same format:

## For QTest ##
---
tags: actividad
descripcion: HEALTHCHECK
registros:
  - fecha: 2023-02-10
    horas: 3
  - fecha: 2023-02-11
    horas: 3
  - fecha: 2023-01-01
    horas: 17
---

## For QTest1 ##
---
tags: actividad
descripcion: HEALTHCHECK 1
registros:
  - fecha: 2023-02-10
    horas: 20
  - fecha: 2023-01-15
    horas: 5
---

holroy · March 3, 2023, 5:58pm

I still think you’ve been given enough information so that you should be able to sort this out yourself, but I’m going to try to explain is just once more.

With your files QTest and QTest2 try the following queries in a note of their own:

## Base data

```dataview
TABLE WITHOUT ID descripcion, R.fecha, R.horas
FROM #actividad 
FLATTEN registros as R
```

## GROUP BY descripcion
```dataview
TABLE WITHOUT ID descripcion, R.fecha, R.horas, rows.R.fecha, rows.R.horas
FROM #actividad 
FLATTEN registros as R
GROUP BY descripcion
```

### Limited to a given date
```dataview
TABLE WITHOUT ID descripcion, rows.file.link, sum(rows.R.horas)
FROM #actividad 
FLATTEN registros as R
WHERE R.fecha = date("2023-02-10")
GROUP BY descripcion
```

For me this produces this output:

Notice how I start with a query to ensure I’m seeing the data I would like to see, with one line per registration. Achieved by the FLATTEN registros as R line.

Next I group on what I want to group on, e.g. descripcion, and notice that my R.fecha has moved into a list of rows.R.fecha and so on. But you also said you wanted to check for a given date, and that needs to be done before the sum (and grouping), so the third version does that, ands back in the link of the file where the description was.

Do however note the following. Making the query like this kind of limits you to only showing one date, and you’re also limiting each file to only have one description.

So I still think the format I suggested earlier on, using something like:

---
registros:
- tarea: Healthcheck
  fecha: 2022-02-10
  horas: 3
- tarea: Healtcheck 1
  fecha: 2022-02-11
  horas: 10
---

Is a superior format, as it’ll give the option to have different tasks in the same file, whilst keeping a connection to the description, and it’ll make the queries slightly easier, as you’ll not need to group/flatten/re-group/re-flatten to do some of the queries which you indicate you want to do.

Given a format like the last example, you could get a query and result like the following:

```dataview
TABLE length(rows.R.fecha) as "Recuento de tareas", sum(rows.R.horas) as Sum
FLATTEN registros as R
WHERE registros
GROUP BY R.tarea
```

With the result:

Do however, note that since this is grouping on descriptions potentially across multiple files, there is no easy way to link to the origin file of any given registration. But that is most often the case when you opt to group results together potentially from multiple files.

You can still add something like rows.file.link to this query, and get the list of the files. However, that might again list multiple registration from any given file multiple times, so I would have most likely switched to a dataviewjs query to loose those duplicate listings.

system · March 13, 2023, 3:30pm

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.