Dataview Sums and Average

Things I have tried

I’m trying to calculate the total amount and average amount spent during the month on certain activities.

I’ve been following the instructions in this thread and this but without any success.

I’ve tried 2 different syntaxes:

TABLE sum(rows.minutes) as “:black_circle: Tot”,
length(rows.minutes) as count,
sum(rows.minutes) / length(rows.minutes) as “:black_circle: Avg”,
min(rows.minutes) as “:black_circle: Min”,
max(rows.minutes) as “:black_circle: Max”,
FROM “Archive :card_file_box:/Journal :date:/Days”
WHERE MONTH = [[2023-Jan]]

and

TABLE without ID
sum(rows[“minutes”]) as “:black_circle: Tot”,
length(rows[“minutes”]) as count,
sum(rows[“minutes”]) / length(rows[“minutes”]) as “:black_circle: Avg”,
FROM “Archive :card_file_box:/Journal :date:/Days”
WHERE MONTH = [[2023-Jan]]

Which both return the same type of error:

image

which doesn’t make sense to me since the syntax seems correct.

Anyone willing to help?
Thank you

What I’m trying to do

@dugdun, @mnvwvnm I believe the fact that I’m working with inline fields (minutes::slight_smile: might be involved, but you surely know better

Hi @paoloap, thanks for posting.

There is a syntax error in your query – commas aren’t allowed after the last item before FROM.

But maybe we could start with how your data is defined. It would be helpful if you could post some examples of your source data. I can see that your query is expecting a “minutes” field, but I’m not sure how you’re using it your journal pages.

For this post, I’m going to make an assumption that each of your journal pages you have a single minutes field in the YAML:

minutes: 10

or perhaps in the body text:

minutes:: 10

If that’s the case, then a query like this would give you the count, total, average, min, and max for these fields:

```dataview
TABLE WITHOUT ID Count, Total, Average, Minimum, Maximum
FROM "Inbox"
WHERE minutes
GROUP BY ""
FLATTEN length(rows.minutes) as Count
FLATTEN sum(rows.minutes) as Total
FLATTEN Total / Count as Average
FLATTEN min(rows.minutes) as Minimum
FLATTEN max(rows.minutes) as Maximum
```

(Of course, you should replace the FROM "Inbox" and WHERE minutes with clauses that make sense for your vault.)

Here’s what that looks like in my test folder:

2 Likes

Hey Craig,

thanks for looking into this and proposing a detailed solution. To confirm your assumption, yes I do collect this data as in-body text, in the form of minutes:: 10. However, the count, total, avg. etc. field don’t exist in the daily note where I capture “minutes”.

This might seem trivial, but I believe it’s worth clarifying since copy-pasting the query provided didn’t yield the expected result:

TABLE WITHOUT ID Count, Tot, Avg, Min, Max
FROM "Archive 🗃️/Journal 📅/Days"
WHERE MONTH = [[2023-Jan]]
GROUP BY ""
FLATTEN length(rows.minutes) as Count 
FLATTEN sum(rows.minutes) as Tot
FLATTEN Tot / Count as Avg
FLATTEN min(rows.minutes) as Min
FLATTEN max(rows.minutes) as Max

Returns this error message:

I’m also not familiar with the flatten operator

The error you’re seeing is due to the fact that your FROM clause is including pages that don’t have a “minutes” field in them. I recommend adding a “WHERE minutes” to narrow the query to just pages with data.

Try this:

```dataview
TABLE WITHOUT ID Count, Tot, Avg, Min, Max
FROM "Archive 🗃️/Journal 📅/Days"
WHERE MONTH = [[2023-Jan]]
WHERE minutes
GROUP BY ""
FLATTEN length(rows.minutes) as Count 
FLATTEN sum(rows.minutes) as Tot
FLATTEN Tot / Count as Avg
FLATTEN min(rows.minutes) as Min
FLATTEN max(rows.minutes) as Max
```
3 Likes

Here is a post that goes into detail about how the GROUP BY and FLATTEN operators work. I hope you find it helpful.

2 Likes

I tried this before posting my previous reply but got the same error message

Could you post your exact query that’s causing the error?

1 Like

sure, here it is

TABLE WITHOUT ID Count, Tot, Avg, Min, Max
FROM "Archive 🗃️/Journal 📅/Days"
WHERE MONTH = [[2023-Jan]]
WHERE minutes
GROUP BY ""
FLATTEN length(rows.minutes) as Count 
FLATTEN sum(rows.minutes) as Tot
FLATTEN Tot / Count as Avg
FLATTEN min(rows.minutes) as Min
FLATTEN max(rows.minutes) as Max

My best guess from the error message is that the sum() operation in the following line is encountering a null value:

FLATTEN sum(rows.minutes) as Tot

This would be caused if one or more of the source pages had a null or missing value for minutes. But I don’t see how that’s possible, since the WHERE minutes clause should have filtered out any pages with a null or missing minutes field.

I’m feeling a bit stumped here – that same query works perfectly in my vault. The only changes I made were to substitute a different folder for the FROM clause and remove the WHERE MONTH... clause.

I suggest looking at the source data pages. Do any of them have unusual values for the minutes field?

EDIT: Also, are you running the latest version of the Dataview plugin? I’ve seen a few problems clear up on upgrade.

1 Like

I did used used a different naming for “minutes” in the past.

I therefore made a new folder to test the query on notes that only contain “minutes.”

Good news, we’ve got a different error message!

image

That does feel like improvement. :slight_smile: At least the query isn’t erroring out now.

Would you be willing to post an example of what a page of your source data looks like? That might help us track down what’s going on here.

1 Like

Sure, this is the template of my daily note
image

The highlight is temporary, meaning that I erase it as I input the right data at the end of the day.

Hmm, I thought perhaps the highlight might be the issue, but that gives a different error. I’m not sure what we’re doing differently that it works for me but not for you. :frowning:

1 Like

Sorry to interfere, but there is most likely one or more of the minutes which hold some strange value, so maybe try this query and look for anomalies.

```dataview
TABLE WITHOUT ID min(rows.minutes)
FROM "Archive 🗃️/Journal 📅/Days"
WHERE MONTH = [[2023-Jan]]
WHERE minutes
Group by minutes
```
1 Like

no interference here :grinning:

actually I had some entries like “-” changed them to 0 and actually the query provides a different error message

image

To provide more context, everything work smooth when I query the individual dates without making any sum or average

So now it seems like the sum() function to generate the Average generates the error.

I got another trick up my sleeves to detect the anomaly in your data set. Could you please try the following:

```dataviewjs
const result = await dv.query(`
TABLE minutes
FROM "Archive 🗃️/Journal 📅/Days"
WHERE MONTH = [[2023-Jan]]
WHERE minutes
`)

if ( result.successful ) {
   const noNumbers = dv.array(result.value.values)
    .where( v => typeof(v[1]) != "number")
    .map( v => [ v[0], v[1], typeof(v[1]) ] )

   dv.table(["File", "field", "type"], noNumbers)
} 
else
  dv.paragraph("~~~~\n" + result.error + "\n~~~~")
```

With a contrived test set of mine, I got the following output:

Where all my noNumbers were:

  • the first entry was a minutes:: "660", aka string
  • the second entry was multiple minutes:: definitions
  • the third had the value ==660==, aka highlighted
  • the fourth was a random link?!

What do you get running that query on your data set?

1 Like

Please excuse me if I make any stupid mistakes because I have no knowledge of the dataviewjs language.

I tried the query on both folders, (A) containing missing and “-” values for “minutes” and (B) containing only actual values ie. numbers or 0

This is the result I’ve got for A

image

and for B
image

all this dates contain 0 as a value, ie:

image

Is this helpful?

It showcases whether you’ve got any faulty data. So you should be able to run the main script on “A” since that didn’t find any non numbers. What was the result whennrunning the main script on “A”?

On “B” however, you need to go through every file in the list and change those entries having the “*” as value for minutes, as thoseb files will trip up the main script. After cleaning up (and getting a “no result” there as well), then what do you get when doing the main script for “B”?

1 Like

Good catch! I agree with @holroy , it looks like you have some pages where the minutes value isn’t a number, and that could be what’s causing the script to fail.

1 Like