Flattened dataview of sublist items without explicitly specifying the sublist items

What I’m trying to do

TABLE materials
FLATTEN materials

gives me:

File5 materials
File2 * pen: value of pen in file2
File2 * cats: value of cats in file2
file1 * pen: value for pen in file1
file1 * paper: value for paper in file1

This is almost right, I want to separate the subproperty name from the subproperty value, into something like

File5 property Value
File2 pen value of pen in file2
File2 cats value of cats in file2
file1 pen value for pen in file1
file1 paper value for paper in file1

(I will put these in a separate column)

and I don’t want to have to specify the subproperties.

Things I have tried

starting from frontmatter:

materials:
- pen: value for pen in file1
- paper: value for paper in file1

and

materials:
- pen: value of pen in file2
- cats: value of cats in file2

With the query

TABLE split((materials.paper + ", " + materials.pen + ", " + materials.cats), ", " )
FLATTEN materials

I get the values I want:

File5 materials.paper + materials.pen + materials.cats
Dataview
File2 value of pen in file2
File2 value of cats in file2
file1 value for pen in file1
file1 value for paper in file1

However, I need to add the subproperty back in in one column, and I don’t want to have to explicitly specify the subfields. I just want it to list whatever subfields are there.

I am willing to restructure the data if needed.

Here’s a brute force approach…

```dataview
TABLE split(string(M),":|{|}")[1] AS "Key",split(string(M),":|{|}")[2] AS "Value"
FLATTEN materials AS M

Nice, that’ll do. I don’t need anything too elegant.

1 Like

So I just wanted to illuminate some other options on how to tackle an issue like your besides the brute force variant, which do work in a lot of cases but not all. But before going into those I would say that I’d prefer a better structure like doing something like:

---
materials:
- name: pen
  value: its value
- name: paper
  value: the paper value
---

This structure would in many cases be easier to work with, and be a little more coherent as a data structure. Still a little hard to work within Obsidian/Dataview, but better than the brute force suggested by your structure.

Another thing I would like for Dataview to do was to offer something similar to Object.keys(...) from javascript which could extract all the keys automagically from materials, but alas we don’t have that as of Dec 2024. I’ve considered adding it to Dataview, but haven’t gotten around to it yet.

A slight variant of the brute force

As the answer by @FsOver2 stands, it’s not easy to re-use the key and values if you wanted to do the sorting, filtering or grouping (or similar) on those values, so here is a slightly different variant over the same theme:

```dataview
TABLE mat[0] as Key, mat[1] as Value
FLATTEN materials as M
FLATTEN list(slice(split(string(M), "[:{}]"), 1, 3)) as mat
```

It’s using the same base principle of changing M into a string (thereby loosing some information related to links and dates), and splitting it based on the either of the characters, {, : or }. But in addition I use slice() to loose the empty first and last element, and I use FLATTEN list( ... ) as mat to store the resulting list into mat. Now mat[0] (as the key) or mat[1] (as the value) can be used in WHERE, GROUP BY, SORT, and so on if you want to.

Using explicit referencing

Dataview can by itself dereference rather strange structures if you’re willing to name the parts you want. So doing stuff like M.pen, or extract(materials, "pen") are valid stuff to extract bits and pieces out of compound objects. Sadly, it can be a little confusing when working with compound objects having multiple elements.

One big advantage though is that this could preserve the type of element are contained in the compound objects. So for the following queries I extended the examples given with these fields:

- purchased: 2024-12-01
- my_link: "[[t92790 file 1]]"
Some various queries with explicit referencing

I played around a little and used the following queries to showcase some various bits and pieces related to how to dereference using explicit names:

## Explicit referencing

```dataview
TABLE M, string(M), M.pen, M.paper, M.cats, M.purchased, M.my_link
FLATTEN materials as M
WHERE M
```

## Extract stuff...

```dataview
TABLE extract(materials, "pen").pen, materials.pen, Pen, Paper, Purchased, My_link
WHERE materials
FLATTEN nonnull(materials.pen)[0] as Pen
FLATTEN nonnull(materials.paper)[0] as Paper
```

## Another explicit variant
```dataview
TABLE Pen, Paper, Purchased, typeof(Purchased), My_link, typeof(My_link)
WHERE materials
FLATTEN nonnull(materials.pen)[0] as Pen
FLATTEN nonnull(materials.paper)[0] as Paper
FLATTEN nonnull(materials.purchased)[0] as Purchased
FLATTEN nonnull(materials.my_link)[0] as My_link
```

The output of the two first queries are as follows:

As can be seen, they do produce extra lines, and are somewhat unwieldy to work with.

The last query in the block above shows some promise though, so I’ll repeat it here:

TABLE Pen, Paper, Purchased, typeof(Purchased), My_link, typeof(My_link)
WHERE  materials
FLATTEN nonnull(materials.pen)[0] as Pen
FLATTEN nonnull(materials.paper)[0] as Paper
FLATTEN nonnull(materials.purchased)[0] as Purchased
FLATTEN nonnull(materials.my_link)[0] as My_link

Which outputs this table:

I find the fact that both Purchased and My_link are now properly showing that they are a date and a link. This allows for various other effects such a date and/or link manipulation, if one would want that. This is something the brute force alternatives wouldn’t achieves unless you convert stuff back into dates or links.

It’s kind of a kludge to do that nonnull(...)[0] stuff to pull out the non-null elements from materials, but one advantage is that we still are processing each file as one row but still with access to each material of that file. Yet again, this depends on the use case, but I wanted to showcase this as an alternative to show both the preservation of the value type, and the possibility to give access to the different materials other than the pure display purpose of the first brute force solution.

Hopefully it can help someone in the future stumbling across this post! :smiley:

3 Likes

@holroy, as alway, a very clear and thorough explanation. I really liked your use of list and split. That solved the issue I really didn’t like about my solution. Your answer is much clearer :+1:. I definitely learned a new approach. Thanks!

Thank you for the time and thought you put into this. It helped me understand better what was going on in the brute force solution, and I am sure it will help others.

I understand your reasoning for structuring the metadata within name, value sublists, in that it provides a general layer of abstraction and gives similar items a hook. This would be good for adding futher parameters:

---
materials:
- name: pen
  value: its value
  purchaseLink: http://amazon.co.jp
- name: paper
  value: the paper value
  source: home depot
---

Or so I thought, but when I tried this:


---
materials:
- name: pen
  value: its value
  purchaseLink: http://amazon.co.jp
- name: paper
  value: the paper value
  source: home depot
---

```dataview
TABLE mat[0], mat[1], mat
FLATTEN materials as M
FLATTEN list(slice(split(string(M), "[:{}]"), 1, 3)) as mat
WHERE file = this.file
```



I got:

I was expecting all the data…that’s when it really hit me what you were doing: changing the 3 to a large number shows it is slicing it into more pieces, and, for example, the links are not properly being parsed. In my case, I wouldn’t know all or which fields belong to a given material.

Brute force, with added fields

For anyone else thinking along the lines of brute force, this is what I actually did, adding additional info for source and type:

---
mylist:
- pen: value of pen &source http://link.co &price price of paper
- paper: value of paper &source source of paper &type type of paper
- folder: value of folder &source source of folder http://link.cor &price price of folder &type type of folder
---

```dataview
TABLE
	split(string(M),":|{|}")[1], 
	split(string(M),"&source([^&}]+)")[1] as source,
	split(string(M),"&price([^&}]+)")[1]  as price,
	split(string(M),"&type([^&}]+)")[1]  as type
FLATTEN mylist AS M
WHERE file = this.file
```

This is pretty simple to type in and read, and in effect this is very similar to yours, minus the virtue of keeping the type, and no ability to sort.

I reformated it to use your prototype, like this:

---
mylistnew:
  - name:  pen
    source: "[[http://papersource.co]]"
    price: price of pen 
  - name: paper
    source: source of paper
    type: type of paper
  - name: folder
    source: "[[http://foldersource.com]] "
    price: price of folder
    type: type of folder
  - name: pencil
    source: "[[http://pencilsource.com]]"
    
---

and a simple query worked fine and allowed sorting, so that is a point in its favor. To get links to keep their type I had to remember to keep them in quotes and doublesquarebrackets.

```dataview
TABLE WITHOUT ID file.link, M.name, M.price, M.source, typeof(M.source), M.type
FLATTEN mylistnew as M
WHERE mylistnew
SORT M.name
```

It took me a few tries to figure out how to get the subfields right because it had to be exactly 4 spaces, and it kept showing various spurious errors as I was still in the process of typing, which was distracting, but if I plugged through it worked.

2. Your extracts

That said, I wasn’t able to get your extracts working the way I wanted because they were listing purchases by file, not by item.

where pen and pen2 are from different files. In addition, it didn’t catch the other purchases of paper, etc. I must be doing something wrong.

I do have a question, though. I remember now that I did try something very similar to what you suggested before what I showed you:

This obviously didn’t work, especially when I realized I was going to need more than one item on the list.

My question is, can you explain a little how this and your remodeled version are different. What type of object is this, without the dashes?

---
mylist2:
  - item: pen 
    source: http://link2.co 
    price: price of pen2
---

well I have discovered that this is called (by someone) a “NestedObject”. Seems to be a bit different that the “bulletArray” that was my first attempt.

BulletArray:
  - A
  - B
  - C
NestedObject:
  - name: nest1
    color: FF0000
  - name: nest2
    color: 00FF00

The elements (“item: pen”) of the bullet array in my first “mylist2” in the post above are, I gather, interpreted as single elements of type text?, not name value pairs as intended, I am surmising? Meanwhile in the second “mylist2”, is a nestedObject, whose individual elements “item: pen”, “source: …”, and “price: …” are interpreted as name & value, and can be accessed via NestedObject.item, etc.

Still working on finding out what the exact semantics is, cause the dataview documentation itself doesn’t seem to mention it, but does have similarly syntaxed “Objects” which are “multiple fields under one parent field”.

obj:
  key1: "Val"
  key2: 3
  key3: 
    - "List1"
    - "List2"

where key3 seems to be a bulletArray type “field” under the “object” “obj”. Here the keys seem to be interpreted as name/values, but each of different type (text, number, and list of text?).

I gotta say, I wish there were better documentation/nomenclature for all these similar syntaxes.

I am also trying to find out more about the so called “link” type is. It took me a while to find out that a dataview link "[[link]]" is different from an obsidian link [[link]] and they seem to be incompatible. That certainly “explained” a lot of things.

This is somewhat unprecise, but I do understand the confusion. Within a Dataview query you can use just [[link]] to mean a link, and also in the body context (inside or outside of inline fields). However, within the property section (aka yaml or frontmatter), the currently approved link format is to use "[[link]]" as in the following example:

---
myLink: "[[aLink]]"
---

And here is [[AnotherLink]], followed by a [inlineLink:: [[YetAnotherLink]]].

And this query _should work_:

```
TABLE WITHOUT ID myLink, inlineLink, file.outlinks
WHERE myLink = [[aLink]]
```

Regarding the documentation, it’s hard to write good documentation, but I think you’re starting to see the gist of it, but please do post new issues asking related to details.

That’s a very good clarification, thank you. I would like to be more precise, but for example, the distinction you mentioned is I think more implied than explained (to my liking) in the references I’ve seen. I was actually talking about a different distinction, i.e. bulletArray vs NestedObject, which I haven’t found clear explanation of in one location. I am not sure if these are “approved” terms. I am guessing it may be a problem because there is a confluence of 3 different standards (YAML (notoriously strict), dataview, and obsidian) each vying for the same semantic space. Hijinks ensue.

PS I’ll add that maybe I’ve just garbled things up in my head backwards upside out inside down due to some false assumption or quick surmisal, and it is actually simple for normal human beings once you get it. It’s happened. More than I care to admit.

The confusion is understandable, and do take a little while to get used to. Regarding naming I’m mostly using list values vs compound objects.

If you look at a standard properties section that can be seen as a compound object consisting of different properties, what happens in your nested object example is just that instead of all the properties being in the outer compound object (with no indentation), they’re belonging to the list item with some indentation and using - to mark the start of the list item. Does that sense?

---
aList:
- First value
- Second value
CompoundObjectList:
- name: First object
  color: FF0000
- name: Second object
  color: 00FF00
---

I notice that I tend to not indent the first list in either case, which makes my example look a little more coherent (to my eyes). Here you see that both aList and CompoundObjectList are in fact lists starting each of their elements with - . The only difference is that for the compound object list we’re using the key: value syntax, and we repeat key: value lines with the same indentation.

Thank you. I really appreciate this. I think I got it straight:

TL;DNR Basically, first you are emphasizing that each element of the list CompoundobjectList is an object, and the first dash is identifying the beginning of each element of the list. I get that. As to the indentation, you prefer to omit the indentation that would suggest that each list is a nested object in lieu of emphasizing its list quality. I need to look up what the rules are for YAML indentation, but if I have that right, I see what you are saying.

BELOW IS ONLY for strong of heart who like me needed a few tries to get to the point above:

In nomenclature, consistency is very important. I have a very ad hoc coding background (being an amateur at best) and tend to use terms inconsistently as a result. Secondly, I am of an older generation and also more of a linguistically minded person, so I tend to try to parse things as parts of sentences rather than in computer science terms, at first impression. For example, “how do I use it in a sentence” is one of the ways I make sense of things.

Where I am leading is that, though I am still not consistent in this usage and probably not even getting the modern terminology correct, my “received” internal schema is that objects have properties, each property consists of a pair, a key (which has a name) and value (which can be of many types). Lists have individual elements, each element must be of the same type.

That said, let me see if I have this straight.

The name of the first property is “aList”. In other words aList returns a list, or it “is” a list. Another way might have been to call it “Property1key:” of type: list, meaning Property1value is a list. The first element of the list is being called “First value”.
There is one place I stumbled. It makes sense, but my old brain mussed it up.
I would have called it element one, but I see it is the first element in the list that is the VALUE of Property1, i.e. Property1valuelistelement1. It is element1 of the list that is the value of property1. But that is not important, except for me to keep it straight in my mind.

CompoundObjectList is the name of a property, whose value is also of type list, i.e. it “is” or returns a list. Each element of the list, however, is itself an object. The first element of CompoundObjectList is an object (I would call it “element1object”, or CompoundObjectListvaluelistelement1object, the object that is the first element of the list that is the value of the property called CompoundObjectList - it was unfortunate for me you called it a list, but I see it returns a list). This object (CompoundObjectListvaluelistelement1object) has 2 properties: one of them is name, the other is color. What confused me was that the value of the property “name” is First object – I guess I was expecting something like elemen1object’s property1value. I stumbled at that at first, but I realized that it is saying the name of element1object is “First object” (aka element1objectProperty1keyname) , just like the color of element1object is FF0000. I hope I am getting that right.

Why am I being so precise. Because it was going through all that finally “sensed” :wink: your subsequent paragraph for me.

1 Like

Sometimes to help me understand the structure or visualise the data I’m working with I use queries like the following:

```dataview
TABLE WITHOUT ID 
  aList, typeof(aList),
  aList[0], typeof(aList[0])
WHERE file = this.file
```

```dataview
TABLE WITHOUT ID
  CompoundObjectList as C,   typeof(CompoundObjectList) as "type(C)", 
  CompoundObjectList[0] as "C[0]", typeof(CompoundObjectList[0]) AS "type(C[0])",
  typeof(CompoundObjectList[0].name) as "COL[0].name",
  typeof(CompoundObjectList[0].color) as "COL[0].color"
WHERE file = this.file
```

Which when used in the file of my last example produces output like:

This can help identify that what you reference in your queries is what you actually expect it to be, and using typeof( ... ) you can verify that the type of the property/field is what you expect it to be.

1 Like

I appreciate your reply. I had caught on to that, but it took me a while to catch on that for dataview, arrays and lists are the same. This is not intuitively obvious for me, as list means other things in other contexts (Markdown, HTML, dataview). I kept looking for a distinction that wasn’t there.