Calculate time worked from note

I keep a journal of blocks of time I am actively working during the day. I use a template that looks like this:

<div class="pomo">[START:: {{date}}T{{time}}]
[END:: {{date}}T{{time}}]

My date format looks like YYYY-MM-DD and my time format looks like HH:mm:ss, note that the pomo template includes T. So at the end it looks like

<div class="pomo">[START:: 2024-01-26T15:18:19]
[END:: 2024-01-26T15:18:19]

I would like to get a calculation of how much time I actually spent working during the day. Since there are multiple START and END in a file I get a list of each when I query them.

I have tried

where =

But this gives the error “No implementation found for ‘array - array’” which makes sense.
I have also tried

where =

which will give me the correct value, but only for row 0 of course.

Is there a way to either convert the lists into columns in a table so I can do sum(END-START) or to iterate through them to access each index?
Note: They end up in the correct order in the lists when they are collected, so I shouldn’t need to try to match them on any key for the converting to columns option.

What you’re really asking about is how to zip together the lists of START and END, and sadly that’s not possible currently using dataview. However, we can get a little creative since in this case the lists you want to combine are related to each other.

For the query below to work there are two requirements which needs to be met:

  1. None of the work periods can’t overlap. That is, the next job can’t start before the previous has ended
  2. We can’t verify you’ve got a corresponding set of start and end times, so you’ll be in charge of checking that manually

Given these two requirements we can do a FLATTEN flat(list(START)) as startTime which does two things. Firstly it ensures, using flat(list(...)), that even though you only had one work period in any given day, it’s considered a list of start times. Secondly it splits each element, using FLATTEN ... as startTime, of that list into a new row in our table.

Digression on not doubling the FLATTEN operation

Intuitively one would want to FLATTEN both the START and END list, but consider a file with the following content:

  • start: [1, 3, 5] — end: [2, 4, 6]

After we do a split on the start we end up with this list:

  • (start: [1, 3, 5]) — end: [2, 4, 6] — startTime: 1
  • (start: [1, 3, 5]) — end: [2, 4, 6] — startTime: 3
  • (start: [1, 3, 5]) — end: [2, 4, 6] — startTime: 5

If we now continue doing a secondary FLATTEN ... as endTime on the END list we end up with this list:

  • (start: [1, 3, 5]) — (end: [2, 4, 6]) — startTime: 1 — endTime: 2
  • (start: [1, 3, 5]) — (end: [2, 4, 6]) — startTime: 1 — endTime: 4
  • (start: [1, 3, 5]) — (end: [2, 4, 6]) — startTime: 1 — endTime: 6
  • (start: [1, 3, 5]) — (end: [2, 4, 6]) — startTime: 3 — endTime: 2
  • (start: [1, 3, 5]) — (end: [2, 4, 6]) — startTime: 3 — endTime: 4
  • (start: [1, 3, 5]) — (end: [2, 4, 6]) — startTime: 3 — endTime: 6
  • (start: [1, 3, 5]) — (end: [2, 4, 6]) — startTime: 5 — endTime: 2
  • (start: [1, 3, 5]) — (end: [2, 4, 6]) — startTime: 5 — endTime: 4
  • (start: [1, 3, 5]) — (end: [2, 4, 6]) — startTime: 5 — endTime: 6

And we don’t really know which startTime belongs to which endTime.

Since we didn’t double the flattening, and we now have a startTime and a list of END times, we can do a little trick to get the corresponding endTime. Due to our requirements it corresponds to the minimum end time which is after our current start time. Given the example from the digression, given startTime: 3 and end: [2, 4, 6], the corresponding endTime would need to be 4, since that is the first (or minimum) larger than our start time.

In a query this can be achieved by doing a filtering on times larger than then start time, and then getting the minimum of that filtered result:

FLATTEN min(filter(flat(list(END)), (f) => f > startTime)) as endTime

Final query with comments

So using this trickery, we can calculate the difference in time, using just endTime - startTime, and to get the sums we need to group on something. This could be whatever you want, but either GROUP as Day or GROUP true as Total, and we end up with this query:

TABLE sum(rows.workTime) as "Total workhours"
WHERE file.folder = this.file.folder
FLATTEN flat(list(START)) as startTime
FLATTEN min(filter(flat(list(END)), (f) => f > startTime)) as endTime
FLATTEN endTime - startTime as workTime

Which with my test setup displays as:

(And a small language lesson at the end; “time” means “hour” in Norwegian, and you can most likely guess that “minutt” equals “minute”, and the plural is adding of either “r” or “er” :smiley: )

Also note that if you want to do GROUP BY true, you might also want to consider doing a TABLE WITHOUT ID to loose that first column.

Thank you for this very comprehensive explanation. I didn’t expect there would be so much more to this, I thought I was on the brink of finishing it myself but I never would have figured out the rest of this on my own.

1 Like

How would I modify this to get the daily average for all files in the folder, to get a week report?

To clarify, this currently gets the sum of the time entries. I can then group by day. If I change the first line to average(rows.workTime) it will average the individual entries. I want the total for the whole day averaged over the week.