Dataview query for invalid dates

What I’m trying to do

Search for invalid dates in YAML

I have YAML of the form

---
doctype: 
description: 
startdate:
tags: 
---

Any of these fields might be blank
I need to search for pages which have a startdate which has been entered but where that date is invalid. e.g
startdate: 2023-11-36
or
startdate: cat

It is easy enough to pull up pages with no date at all. However

TABLE 
	dateformat(startdate, "yyyy-MM-dd") AS Start, 
	description
FROM ""
Where (startdate = null)

pulls up only blank dates but not entered but invalid data. How to do this.

Things I have tried

Spent a long time searching dataview docs and various fora but am still scratching my head

One thread you didn’t find in your search is this reply:

In there I list some related information on why this is hard to do in a DQL query, and how you need to use dataviewjs to actually locate the ones without values. The DQL version (which will not list empty values) for your specific use case:

```dataview
TABLE date, typeof(date)
WHERE date AND typeof(date) != "date"
```

The proper way to use this, using dataviewjs, is a little more complex, and looks as follows:

```dataviewjs
const pages = dv.pages()
  .where( p => p.hasOwnProperty("date") )
  .mutate( p => {
   if ( p.date ) {
     p.typeOf = (
       typeof( p.date ) == "object" ? 
       p.date.constructor.name :
       typeof( p.date ) 
     )
   } else {
     p.typeOf = "null"
    }    
  })
  .where(p => p.typeOf != "DateTime" )
  
dv.table(["Note", "date", "typeOf(date)"],
   pages.map( p => [ p.file.link, p.date, p.typeOf ] ) )
```

The trickery related to .mutate looks a bit deeper into the type of objects in an attempt to discover the pure DateTime objects, and it adds its findings into a new field, p.typeOf. This in the used in the next line to eliminate those which are true dates.

In a simple test setup of mine this results in the following output:

If you find it more natural for you, feel free to change the p.typeOf = "null" into p.typeOf = "empty" or some text which make more sense to you. And of course, replace any occurence of p.date with p.startdate or whatever date field you want to test against.

1 Like

Thanks hugely for the explanations.

In fact the DQL code you use does pick up the following

startdate: 1999 (detected as a false date)
startdate: cat (detected as a false date)
startdate: 1999-01 (not detected as a false date)
startdate: 1999-01-55 (not detected as a false date which it clearly is!!!)

There is well established RegEx which can detect whether a date is a valid date - the above suggests to me that dataview is determining data type using a fairly crude pattern – most likely just nnnn-nn-nn. Less importantly it also seems inconsistent to accept a year pattern only as a invalid date, but yyyy-mm as valid (depending on the definition of date I’d want neither)

There is some debate related to stuff like this, but dataview does accept the first as a date, and under the hood changes it to 1999-01-01. This allows for month queries to be executed a lot easier.

Regarding the second variant, it’s also accepted as a date, although it displays as 1999-02-24. I reckon the reasoning behind that is to allow for stuff like “1999-01-31 + 24 days” to be easily done by doing “1999-01-55” (aka just adding the days to the day part of the date).

Is this good or bad? It’s very convenient in some situations, but it does also lead to some strangeness like accepting that suspicious looking date of 1999-01-55.

Dataview very clearly states what it accepts as a date here, and that is that the year and month are required, but anything after that is optional, but it must follow this pattern:
YYYY-MM-DDTHH:mm:ss.nnn+ZZ (where the T is a literal “T”) (aka the ISO 8601 pattern)

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.