Dataview return unique list in table column

What I’m trying to do

I have a dataview of my notes with a column returning linkt to outgoing links in the “person” folder. Sometime the same person is mentioned multiple times and hence the dataview contains duplicated entries for that note. I want to have unique items in the list after the filter.

TABLE type, summary, filter(file.outlinks, (o) => contains(o.file.folder, "Person")) as person
FROM [[]] and -#TaskExclude and -"templates" and -"cases"
sort file.name desc

Currently there is no way within a DQL query to just get the unique values of a list. There exists some trickery related to a combination of FLATTEN and GROUP BY, which can be seen in this two responses:

I’m not sure whether it can be translated into your use cases, though as it most likely would garble up if that same person is listed in other rows, and that should of course be kept as is, I reckon.

Experimental solution using DQL

I’m not sure if this would work, but it seems like what you would want to be unique is the combination of type vs person, so here is some untested query stuff:

```dataview
TABLE aType, aSummary, uniqPersons
FROM [[]] and -#TaskExclude and -"templates" and -"cases"
FLATTEN  array(filter(file.outlinks, (o) => contains(o.file.folder, "Person"))) as person
FLATTEN map(person, (p) => [file, p]) as uniqCombo
GROUP BY uniqCombo
FLATTEN key[0] as aFile
FLATTEN key[1] as aPerson
GROUP BY aFile
FLATTEN rows[0].type as bType
FLATTEN rows[0].summary as bSummary
FLATTEN array(rows.aPerson) as uniqPersons
SORT aFile.name desc
```

If this query actually works, I should go a buy myself a ticket. :smiley:

The gist of it is to create a uniq combination of the file and each related person, flatten on that combination, and then immediately group on that combination to only be left with one of each file-person combination.

The rest of the query is an attempt to regain control over the fields for a proper output, and this is where I’ve most likely have screwed up somewhere.

Using dataviewjs

I think I would prefer to do this type of thing using dataviewjs, where I then would have started of with the same DQL query, but modified the value of the person list to only contain a unique value within that list.

The reason behind me not being able to test run the previous query, and this one is that it’s hard to build up a matching test-set to replicate your data. This time however I might do a little testing, as I can emulate the query, so I’m feeling slightly confident that the following could work:

```javascript
const result = await dv.query(`
  TABLE type, summary,
     filter(file.outlinks, 
        (o) => contains(o.file.folder, "Person")) as person
  FROM [[]] and -#TaskExclude and -"templates" and -"cases"
  sort file.name desc
`)

if ( result.successful ) {
  const values = result.value.values

  dv.table(
    result.value.headers, 
    values.map(r => [
      r[0], r[1], r[2], // Keep these as is
      r[3].filter( (value, index, array) => array.indexOf(value) === index)
    ])
  )
} else
  dv.paragraph("~~~~\n" + result.error + "\n~~~~")
```

What happens here is that we execute the query within the javascript environment, and if it’s succesful, we display the result using dv.table() with one vital change, we make the fourth column (aka r[3]) be unique by filter out only elements where the current index matches the first time you find the value within the array.

Try it out, and see whether this works.

Wait for unique() to be implemented in Dataview

There has been some discussion related to this on the github of Dataview:

And it highlights a PR to implement something like this. It possibly also shows how (or why) my code potentially doesn’t work, since we’re working with links and not simpler structures, so it might be we need to revisit how to make the person list unique…

Thank you for the very detailed answer.
The examples you provided did not return the desired results.

I realize its very difficult to answer such a question without access to the vault where the query runs.

The vault in which this query runs has the following structure with minimal examples of files. In the caseA.md file is the dataview query I try to optimize
structure of vault:

daily
  -2023-21-03.md
cases
  -caseA.md
persons
  -Jane.md
  -John.md

2023-21-03.md

type:: note
summary:: an example note
refnr:: [[caseA]]

- [[Jane]] tells a story [[John]] is taking note.
- [[John]] tells a story [[Jane]] is taking note.

caseA.md

refnr::[[caseA]]

## dataview of notes with referece to the case

```dataview
TABLE type, summary, filter(file.outlinks, (o) => contains(o.file.folder, "persons")) as person
FROM [[]] and -"cases"

Jane.md

email:: [email protected]

John.md

email:: [email protected]

Now that I’ve got some test data to work with, I could correct an error in the latter FLATTEN variant, so now the query looks like this:

```dataview
TABLE WITHOUT ID aFile.link, bType, bSummary, uniqPersons
FROM [[]] and -"cases"
FLATTEN array(filter(file.outlinks, (o) => contains(o.file.folder, "persons"))) as person
FLATTEN map(person, (p) => [file, p]) as uniqCombo
GROUP BY uniqCombo
FLATTEN key[0] as aFile
FLATTEN key[1] as aPerson
GROUP BY aFile
FLATTEN rows[0].rows[0].type as bType
FLATTEN rows[0].rows[0].summary as bSummary
FLATTEN array(rows.aPerson) as uniqPersons
SORT aFile.name DESC
```

And the output with another file of Jane talking to and about Jane (4 references to herself :-)), I get this output:

Do note that with a date format of YYYY-DD-MM you’re loosing out on some of the nicer queries of Dataview utilising file.day and proper date handling, and also that your file lists will not be sorted correctly (like in the SORT aFile.name DESC as the day part will keep turning over and clutter up the sorting.

Bonus tip: How to present code properly in a forum post

If you want to showcase either markdown, or code blocks, or dataview queries properly in a forum post, be sure to add one line before and one life after what you want to present with four backticks, ````. This will ensure that any other backticks (like for code blocks) is properly shown.

1 Like

The idea was correct, but the implementation was not up to par, so here is another version of this which make the fourth column (aka r[3]) presumably consisting of links into a unique list of links.

```dataviewjs
const result = await dv.query(`
  TABLE type, summary,
     filter(file.outlinks, 
        (o) => contains(o.file.folder, "persons")) as person
  FROM [[]] and -#TaskExclude and -"templates" and -"cases"
  sort file.name desc
`)

if ( result.successful ) {
  const values = result.value.values
  
  values.flatMap(r => {
      const seenLinks = {}
      r.uniqLinks = []
      
      // Change the column index, if needed
      r[3].forEach( l => {
        if ( l.path in seenLinks )  return

		seenLinks[l.path] = 1
		r.uniqLinks.push(l)
	  })
    }) 
  
  dv.table(
    result.value.headers, 
    values.map(r => [
      r[0], r[1], r[2], r.uniqLinks
    ])
  )
} else
  dv.paragraph("~~~~\n" + result.error + "\n~~~~")
```

The code should in theory be reusable for other queries, and the changes one would need in that case are:

  • Replace the query at the top with the query of your choice
  • Change the column index around the middle of the query to the column of the list of links you want to make unique. Remember that the first column is 0, so the fourth column is 3
  • Replace the column list within the dv.table() call near the end to match your query, and replace the link column, e.g. r[3], with the unique list, e.g. r.uniqLinks
1 Like

Thank you!

Super awsome! the note title format I usually use is YYYY-MM-DD_hhmmss. Would you recommend against that format?

There is one issue with the dataview querry to be mindfull of.
If a note does not contain a outfile link with persons in it will not be included in the table!

The dataviewjs is just what I needed thaks!

It’s better than the previous variant, as this format will allow dataview to at least pick up the date part, even though the time part will not be included.

To make it fully compliant you would need to replace the _ with T, and frankly I don’t like to see that variant myself. And you wouldn’t gain that much in most cases, as one usually don’t need the time part…

That seems reasonable since empty lists are ignored. It could most likely be countered with using default() related to the first definition of person, but I’m a little unsure how that would affect other parts of the query.

No problem, have fun!

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.