Allow links in YAML front matter; Notion-like databases from metadata; links as first-class citizens

Sorry @pyrotecx, I just want to clarify:

What is &l1? Is that a tagging feature already built into YAML?

If I understand correctly, under your proposed syntax Obsidian will only recognize a link if either:

  1. it’s included under related: or
  2. it has the YAML tag !link
    If neither of those conditions is met, then it will just be metadata, and not an Obsidian-recognized link?

Perhaps this might need to be a separate feature request, I’m not sure but:

What if each custom property like director: or genre: could optionally have its own Obsidian note created automatically? This could have the following benefits:

  1. When editing YAML front matter, Obsidian could suggest code completion for us.
    • For example: While typing gen, Obsidian could overlay a list of possible completions like genre: and general: etc.
    • Advantages of this approach:
      • Code completion helps us not make accidental duplicates when we misspell.
      • faster typing
      • we’re already familiar with code completion when we type [[ and Obsidian suggests notes for us
  2. When we add an attribute to director:, Obsidian could auto add backlinks.
    • For example:
      1. Obsidian could auto-add the Ridley Scott note as a “child” of the director note.
      2. Obsidian could auto add is a: director to the YAML front matter of the Ridley Scott note.

This way we can make simple changes in one place and have related changes propagate out to other places automatically.
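
To make the completion idea a bit more concrete, here is a minimal sketch in Python. It is not how Obsidian works today; the function name `suggest_keys` and the list of known keys are made up for illustration, and I am assuming the keys would be collected from the front matter of existing notes.

```python
def suggest_keys(prefix, known_keys, limit=5):
    """Return known front matter keys that start with the typed prefix."""
    prefix = prefix.casefold()
    # De-duplicate and sort so the suggestions are stable while typing.
    matches = sorted(
        key for key in set(known_keys) if key.casefold().startswith(prefix)
    )
    return matches[:limit]


# Hypothetical keys gathered from the front matter of existing notes.
known_keys = ["genre", "general", "director", "run time"]

for suggestion in suggest_keys("gen", known_keys):
    print(suggestion + ":")
```

Picking from a list like this instead of typing the key by hand is what would help avoid near-duplicate keys caused by misspellings.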

2 Likes

This is a great idea/workaround/solution for extending links to the whole YAML front matter once recognition of links in the “related” field is implemented. I would consider incorporating it into the main post, or at least pointing to post #9 from the main post.
Additionally, for readability in basic text editors, we can use YAML comments to spell out the referenced text right next to each YAML alias.

---
is a: &Movie Movie
director: &l2 Ridley Scott
run time: 2hr 
IMDB rating: 7.8 
genres: &l3 [crime, drama, mystery]
related: 
  - a link
  - another link
  - *Movie
  - *l2 # Ridley Scott
  - *l3 # [crime, drama, mystery]
---
2 Likes

@Fanshu, @luke85
To vote for the feature request, do not forget to press :heart: under the original post; this is the officially preferred method of voting according to the FAQ:

Rather than posting “+1” or “Agreed”, use the Like button.


@DandyLyons Yes, these are YAML anchors and YAML custom types (tags).

2 Likes

@pyrotecx
Thank you for clarifying what you mean when you say “backlink”.

  • When a page “P” contains [[A|Link to page A]], then from the viewpoint of page “P” it is just a “link”. It is a “backlink” only from the perspective of page “A”.
  • Obsidian is said to have “backlinks” functionality because it is able to show which pages link to the currently active page “A” (via the “backlinks pane”).
  • I suggest editing the original post to change the relevant instances of “backlink” to “link”, to avoid confusing readers (especially novices) through misuse of terms. I have encountered this confusion several times since the introduction of Roam Research. I can imagine how easily the self-reinforcing misuse of the (now cool) term “backlink” could lead to messy discussions and, consequently, to the need for a new word (e.g. “backbacklink”) to distinguish the original meaning from a potentially new common interpretation of “backlink”.
  • For example, part of the title would change to “Allow links in YAML front matter”, which explains the intention immediately without anything to decipher, and so potentially attracts more votes from people who do not have time to read the whole post.

Edit: thank you for making the corrections.

4 Likes

Hello all,

Thank you for Obsidian.

TL;DR;

Here’s a simple suggestion for the problem of links in YAML:

  • Treat (almost) all YAML scalars as potential labeled links (implicitly, without any special link syntax)
  • This would include those scalars that are deeply buried within nested YAML structures

This may sound awkward or even stupid at first, but please bear with me…

In a freeform mostly unstructured context (like markdown), we obviously do need a special syntax for links and the like.

In a structured context (like YAML), however, information is already entered and then parsed in small chunks.

Quite often, those chunks refer to other entities (whether they already exist as a note entry or not).

Most of the time, even things that look like pure attributes (like dates, e.g.: “2021-03-16”) have the potential to become first-class notes (entries) some day, if the user so desires.

This kind of approach (or something equivalent) would allow Obsidian to bring together the best of both worlds (structured and unstructured approaches mentioned by @Whitenoise) for managing a knowledge base on any topic.

Any entry may start out as an unstructured freeform markdown note…

With time, as the need arises, structure could emerge… naturally and gradually… all within the same “entry” (file).

EXAMPLE

Below is an example that illustrates both a use case and the workings of the above solution.

The demonstrated example happens to be for a hobby knowledge-base for plants, but the suggested solution is entirely generic so it could be applied to any use case involving semi-structured granular knowledge-base on any topic (movies, books, articles, …, anything).

YAML Front matter (and then some)

---
title: Spinach
author: John Smith
tags: ["gardening", "plant/vegetable/leafy"]

source: 
  - {name: Wikipedia, link: "https://en.wikipedia.org/wiki/Spinach", primary: true }
  - {name: Wikimedia, link: "https://upload.wikimedia.org/wikipedia/commons/c/cd/Spinach.jpg" }
---
type: species
name: Spinach
species: Spinacia oleracea
lifecycle: annual
taxonomy:
    genus: Spinacia
    family: Amaranthaceae
    clade: Tracheophytes/Angiosperms/Eudicots/Caryophyllales
    kingdom: Plantae

…

Followed by the BODY (markdown)

Spinach (Spinacia oleracea) is a leafy green flowering plant native to central and western Asia. It is of the order Caryophyllales, family Amaranthaceae, subfamily Chenopodioideae. Its leaves are a common edible vegetable consumed either fresh, or after storage using preservation techniques by canning, freezing, or dehydration. It may be eaten cooked or raw, and the taste differs considerably; the high oxalate content may be reduced by steaming.

bla bla bla...

Optionally followed by REFERENCE LINKS

[spinach:primary]: https://en.wikipedia.org/wiki/File:Spinacia_oleracea_Spinazie_bloeiend.jpg
[spinach in the fields]: https://upload.wikimedia.org/wikipedia/commons/c/cd/Spinach.jpg

What are the links in the above example ?

All the scalars found within the YAML frontmatter(s) above can be treated as potential -labeled- links:

  • Spinach
  • John Smith
  • gardening
  • plant/vegetable/leafy
  • Wikipedia
  • https://en.wikipedia.org/wiki/Spinach
  • Amaranthaceae

It gets better… Those are already labeled links, ready to be represented on a directed labeled graph (or go into a graph database :-), or just generate semantic triples…

For example :

| link | label(s) |
| --- | --- |
| Spinach | doc.title, name |
| John Smith | doc.author |
| Wikipedia | doc.source[0].name, doc.source.name |
| Spinacia oleracea | species |
| annual | lifecycle |
| Spinacia | taxonomy.genus |
The above mapping convention probably needs some discussion and improvements. But it should give the general idea.
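
As a rough illustration of the mapping, here is a small Python sketch that walks already-parsed front matter (plain dicts and lists, as a YAML parser would typically return) and derives a dotted key-path label for every scalar. The function name `labeled_scalars` and the optional `doc` prefix are my own assumptions, not an existing Obsidian API.

```python
def labeled_scalars(node, path=""):
    """Yield (label, scalar) pairs for every scalar in parsed YAML data."""
    if isinstance(node, dict):
        for key, value in node.items():
            child = f"{path}.{key}" if path else str(key)
            yield from labeled_scalars(value, child)
    elif isinstance(node, list):
        for index, value in enumerate(node):
            yield from labeled_scalars(value, f"{path}[{index}]")
    elif node is not None:
        # Anything that is not a mapping or a sequence is treated as a scalar.
        yield path, node


# Part of the second block of the spinach example, already parsed.
entity = {
    "type": "species",
    "species": "Spinacia oleracea",
    "lifecycle": "annual",
    "taxonomy": {"genus": "Spinacia", "family": "Amaranthaceae"},
}

# Part of the first block; the "doc" prefix marks document metadata.
metadata = {"title": "Spinach", "source": [{"name": "Wikipedia"}]}

pairs = list(labeled_scalars(entity)) + list(labeled_scalars(metadata, "doc"))
for label, scalar in pairs:
    print(f"{scalar}  <-  {label}")
```

Each pair can then be read as a semantic triple (current note, label, scalar), which is all a labeled graph needs.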

How does this approach compare to other solutions ?

PROS:

  1. Brings together the best of both worlds of managing data (structured and unstructured), allowing structure to emerge naturally and gradually.

  2. Generic solution for all kinds of data and metadata, without the need to reinvent the wheel for each and every problem domain (authors, articles, plants, movies, recipes, …)

  3. Totally compatible with existing -and future- YAML syntax, without the need of any custom convention.

  4. YAML data stays clean, void of extra syntactic noise (which might have otherwise been problematic for automated processing of YAML by other scripts that could choke on any special link syntax).

  5. Gives us an easy and natural way of entering labeled “semantic” links (using dot notation or similar)

  6. Probably quite easy to implement without breaking the existing Obsidian code (this is just a guess; I don’t have sufficient knowledge of the code base)

CAVEATS:

  1. Could lead to “link noise”, listing too many things as links or backlinks.

    This risk could easily be mitigated by some further implementation choices:

    • Ideally, such links could internally be marked as “implicit” and the user could be given the choice of displaying those (or not)

    Also, auto-creation of unwanted entries should be avoided.

  2. Could lead to structure frenzy… prompting some users to over-structure their data, unnecessarily or just too early.

    Well, when something is so easy (to do)… it takes some discipline to learn not to over-do…

OPEN QUESTIONS :

  • What impact on performance ?

FUTURE POSSIBILITIES (for later on)

As careful readers have probably noticed, this concept could be extended in many different ways later on… enabling many other goodies…

Here are just a few :

  • Basic text templating within the markdown body (something like jinja2 syntax or similar)
    This would help avoid repeating oneself and also minimize errors.

  • Detect URLs and possibly treat them differently

  • More knowledge-base/graph capabilities (or exporting triples into graph databases) thanks to implicit labels that come from YAML

  • Ability to specify labeled links within the markdown body itself, in a way that is interoperable with the described approach (in YAML).

The last two items resemble the #juggl add-in being developed by @Emile (the current version is called Neo4j Graph View)

Needless to say, care would be needed to avoid feature-creep and software bloat.

Obsidian was intended as an easy to use note taking app, in which it excels. So, it should not evolve into some complicated bloatware.

IMHO, with a few carefully chosen features, it has the potential to fill a huge gap… and still remain simple.

As Larry Wall once coined it: “Simple things should be easy, and complicated things should be possible [to accomplish] :-)”

Thank you for reading this far :slight_smile:
Tabulo[n]

10 Likes

I like the idea of using scalars in yaml rather than a second front-matter block, which seems kind of odd.

2 Likes

Hm. I do somehow like the concept of YAML frontmatter, but please let me object to this idea.

Why? Obsidian is a really great PKM application, and I enjoy it a lot, but I think we have seen too many “unplanned-for” uses of YAML and code blocks lately. Just because something can be done doesn’t mean it always should be done.

I simply fear too many non-standard uses of code blocks and YAML frontmatter creeping in, with the eventual possibility of Obsidian becoming more and more of a “locked-in” system again.

Let’s take myself: I write a lot of nonfiction and reference material and have traditionally been using YAML frontmatter mainly as a kind of “preprocessor directives” for LaTeX output. I fear that if YAML scalars were made “auto link targets”, it might simply generate too much “digital noise”.

Even a small text here already has something like the following front matter:

---
title: 'Linux für Dich: Electronic Publishing mit Markdown'  
documentclass: 'scrbook'  # book (no abstract), memoir, …; default: article
author: 'Matthias C. Hormann'  
email: '[email protected]'  
editor: 'Matthias C. Hormann'  
publisher: 'Eigenverlag'  
copyright: '© 2015 Matthias C. Hormann. All rights reserved.'  
rights: '© 2015 Matthias C. Hormann. All rights reserved.'  
cover-image: 'images/cover-image.jpg'  
stylesheet: 'supportfiles/epub-style.css'  
css: 'supportfiles/style.css'  
highlighting-css: 'supportfiles/highlighting.css'  
date: '2015-08-07'  
lang: 'de'  
keywords: 'Markdown, Electronic Publishing, epub, PDF, HTML, how-to, Linux, Ubuntu, Mint'  
subject: 'Markdown, Electronic Publishing, epub, PDF, HTML, how-to, Linux, Ubuntu, Mint'  
header: ''  
footer: 'Linux für Dich: Electronic Publishing mit Markdown'  
numbersections: 'yes'  
#geometry: 'a4paper, portrait, twoside, includeall, nomarginpar, inner=20mm, outer=20mm, top=20mm, bottom=20mm, bindingoffset=10mm'  
#geometry: 'nomarginpar, twoside, bindingoffset=10mm'  
date-meta: '2015-07-28'  
pagetitle: 'Linux für Dich: Electronic Publishing mit Markdown'  
title-prefix: ''  
quotes: 'yes'  
crossrefYaml: 'supportfiles/pandoc-crossref-de.yaml'  
bibliography: 'literatur.bib'  
biblio-files: 'literatur.bib'  
#biblio-title: 'Literatur'  
csl: 'supportfiles/iso690-numeric-brackets-de.csl'  
description: 'Linux für Dich: Electronic Publishing mit Markdown'  
#toc-title: Inhalt  
lof: 'yes' # List of Figures  
lot: 'yes' # List of Tables
lol: 'yes' # List of Listings
#abstract-title: Kurzübersicht  
#abstract: |  
# In der Reihe "Linux für Dich!" findest du praxisnahe Tipps und Tricks für den alltäglichen Umgang mit deinem Linux-System.
# 
# In der Ausgabe "Electronic Publishing mit Markdown" wird eine hocheffiziente Arbeitsumgebung für Autoren beschrieben – von der Installation benötigter und sinnvoller Programme bis hin zur #Erstellung fertiger Dokumente als E-Book, PDF oder HTML-Dokument.
# 
# Das Buch wendet sich an Autoren, die unter Linux (Ubuntu oder Mint) arbeiten und sich zielgerichtet möglichst ohne Ablenkung aufs Wesentliche konzentrieren wollen: Das Schreiben – und die effiziente Weitergabe ihrer Texte über verschiedenste Medien. Angesprochen werden Studenten, Lehrende, Ersteller technischer Dokumentationen und professionelle Autoren.
# 
# Ausgehend von einer einfachen und vor allem *lesbaren* Textdatei wird der Weg bis zum fertigen Buch (oder einer Anleitung wie dieser) erklärt – zielgerichtet, ohne Ablenkung, reproduzierbar, möglichst automatisiert bis zum fertigen Dokument in den verschiedensten Ausgabeformaten.
# 
# Es wird erklärt, welche Werkzeuge und Programme man *wirklich* braucht und wie sie genutzt werden:
# 
# * *Geany*, ein Texteditor (es kann aber nahezu jeder andere verwendet werden)
# * *Markdown*, eine einfache Syntax für Textdokumente
# * *JabRef*, Software zur Verwaltung von Literaturangaben und Zitaten
# * *LaTeX*, eine Satz-Software, die »im Hintergrund« für schöne Ausgabe sorgt
# * *Pandoc*, das »schweizer Armeemesser« unter den Dokumentenkonvertern
# * *Calibre*, Software zum Verwalten, Anzeigen und Bearbeiten von E-Books
# * *Freeplane*, eine Mindmapping-Software zum visuellen Gedankensammeln

---

and it also has many code blocks, with the intention of having syntax-highlighted code examples and instructions.

Whatever gets decided upon, please don’t break compatibility for people who use Obsidian to write academic papers, nonfiction and textbooks intended for LaTeX output.

I’d be interested in what others think about this.

4 Likes

“non-standard” bells and whistles feature requests keep growing.

Let’s hope most of them do not get implemented.

Simplicity is key, or database-based-apps-with-an-export-to-text-linked-files-with-backlinks will become the better alternative.

Drafts was great. Then, it was flooded with bells and whistles.

1 Like

I think this is important. After all, what has evolved? ed → WordStar → Word. A monster. That’s probably why we found our way back to Markdown. Compatible. Simple text. Easy to use.

Combined with an easy-to-use, blindingly fast tool like Obsidian, it allows you to focus on your own original ideas again. No distraction, just thinking and output.

I’m not against “bells & whistles”, just against the distraction they can provide, which often keeps great new ideas from evolving. Even WYSIWYG too often makes people think about cosmetics instead of content.

And that was what Obsidian was meant for. I think.

Well, actually, the way I see it: the idea of “treating scalars in YAML as potential link targets” doesn’t appear to be related to the question of “single frontmatter” vs “second frontmatter block”.

To me, that’s another, orthogonal discussion. The above idea doesn’t necessarily take any sides in that discussion. It could work either way (with 1 frontmatter block, or 2, or 3, or whatever).

Given some of the above comments, I do see why you are bringing it up, though…

The OP’s proposal was wikilinks in front matter, and that’s not valid YAML. The alternatives are 1) a second block of non-YAML, front-matter-like content or 2) interpreting existing front-matter scalars as links.

I do view them as alternatives for this use case and not orthogonal.


about: “Treating all YAML scalars as potential implicit link targets”

To clarify things a bit:

Below is what a plausible/reasonable implementation of the suggested behavior (of treating all YAML scalars as potential links) could look like.

It’s broken into a collection of little sub-features, some of which could be useful on their own.

EDITING/PREVIEWING/BROWSING:

  • Any YAML scalar that does not correspond to an existing file would be treated the same way as it is today. Nothing special there.

  • If a YAML scalar value does correspond to the name (or relative/absolute path) of an actually existing file in the current vault, or else if it looks like a URL, Obsidian could display it as a link, and allow you to open it (or peek at it) in a way similar to a [[wikilink]] (using Ctrl-Click or something similar) or markdown link.

    This way, if you so desire, you could keep extra information (notes or data) about the entity being referred to by that scalar, and easily navigate in between.

    An algorithm very similar to (if not the same as) the one used for wikilinks can be used for deriving the file names and paths to try (for example : with or without an “.md” suffix, with or without folder path, etc.)

    Also, there is no reason to artificially restrain the target file type or extension to markdown, in my opinion:

    • ==> If the file exists, or if the scalar looks like a URL, just show it as a link. And then, when the user Ctrl-clicks on it, we open it (or browse to it) if we can; or else indicate that we can’t.
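
To illustrate, the lookup described above could be sketched roughly like this in Python. Everything here is an assumption made for illustration (the function names, the flat list of vault paths, case-insensitive matching, and the restriction to http/https URLs); the real wikilink resolution in Obsidian may well differ.

```python
from urllib.parse import urlparse


def looks_like_url(text):
    """Very small heuristic: an http(s) scheme followed by a host."""
    try:
        parts = urlparse(text)
    except ValueError:
        return False
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def resolve_scalar(scalar, vault_paths):
    """Return ("url", text), ("file", path), or None for a plain scalar."""
    if not isinstance(scalar, str) or not scalar.strip():
        return None
    text = scalar.strip()
    if looks_like_url(text):
        return ("url", text)
    # Try the scalar as given and with an ".md" suffix (assumed convention).
    wanted = {text.casefold(), (text + ".md").casefold()}
    for path in vault_paths:
        lowered = path.casefold()
        basename = lowered.rsplit("/", 1)[-1]
        # Match either the full vault-relative path or just the file name.
        if lowered in wanted or basename in wanted:
            return ("file", path)
    return None


# Hypothetical vault contents.
vault = ["notes/eigenverlag.md", "supportfiles/style.css"]

for value in ["Eigenverlag", "supportfiles/style.css", "scrbook"]:
    print(value, "->", resolve_scalar(value, vault))
```

A scalar that resolves to `None` would simply be displayed the way it is today.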

DATA ENTRY (searching for an existing note and inserting its filename):

  • Again, nothing fancy should be needed or forced upon for entering scalars (be it actual link targets or otherwise). You should just be able to type them in the way you are doing it today, even for filenames / paths.

  • When entering a reference to a note, or any filename/path, though, you could just press a keyboard shortcut (or right-click and choose a context menu item), which would allow you to type a fuzzy search term and then pick a file, whose name (without any “.md” suffix) would then be textually inserted where the cursor happens to be…

    From a pure functionality point of view, it would produce something very similar to what happens when you enter wikilinks (except that the trigger would be a keyboard shortcut, not “[[”) and could potentially share much of its code with the wikilink stuff.

    In addition, the user could also be given the choice of what text actually gets inserted (just the file name, or a path relative to current file or the vault), whose default could come from vault preferences.

    In fact, this particular helper doesn’t even have to be limited to this proposal (or YAML for that matter). It could be providing a generic functionality (perhaps written as a plugin) which could be triggered anywhere where you can enter text.

CREATING A NOTE (from selected text):

  • Another keyboard shortcut (or context menu) could trigger a functionality that allows creating a note whose name is derived from the selected text.

    This would do something quite similar to what happens today when you Ctrl-click a [[wikilink]] that doesn’t correspond to an existing file, except that it would be using the selected text instead.

    Again, this helper, which can be written as a plugin, doesn’t have to be specific to this proposal or YAML; it could generically work anywhere in the editor wherever you can select text.

GRAPH VISUALIZATION & LINK ENUMERATION:

For graph visualization purposes, or when enumerating forward/back links, implicit links (scalars that actually correspond to existing files) could well be counted and displayed, if the user so chooses.

In fact, imho, this is the only place where link noise (or digital noise, if you prefer) could potentially creep in, if not implemented carefully.

So, it’s important to give the user a real choice (both in preferences, and also, right where the visualization actually occurs) as to whether or not to count/display those implicit links.

BTW, typed/labeled links, when available, would naturally be supported in this scheme of things; as the link type can be derived from the YAML key-path, as described in the initial suggestion.
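
A tiny Python sketch of what I mean by giving the user that choice: every edge carries an “implicit” flag (plus an optional label derived from the YAML key-path), and the graph or backlinks view filters on it. The `Edge` class, `visible_edges`, and the “Gardening log” note are purely hypothetical names for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    source: str
    target: str
    label: str = ""         # e.g. the YAML key-path for typed links
    implicit: bool = False  # True when derived from a YAML scalar


def visible_edges(edges, show_implicit):
    """Keep explicit links always; keep implicit ones only if requested."""
    return [edge for edge in edges if show_implicit or not edge.implicit]


edges = [
    # An explicit [[wikilink]] in the note body (hypothetical target).
    Edge("Spinach", "Gardening log"),
    # An implicit link derived from the YAML scalar under taxonomy.family.
    Edge("Spinach", "Amaranthaceae", label="taxonomy.family", implicit=True),
]

print(len(visible_edges(edges, show_implicit=False)))
print(len(visible_edges(edges, show_implicit=True)))
```

With the setting turned off, the graph looks exactly as it does today, which should address most of the noise concern.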

PERFORMANCE CONSIDERATIONS

For performance reasons:

  • when trying to detect implicit links in YAML scalars, it may be possible to skip any scalar whose textual length is longer than a configurable value (which could default to 1024 or something).

  • also, it would probably be very helpful to keep an in-memory cache of the vault index (which is already being done, I think).
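
Both points can be sketched together in a few lines of Python. This is only an illustration under my own assumptions: the index is a plain dict built once from the vault paths, matching is case-insensitive, and names such as `build_vault_index` and `find_implicit_links` are invented here.

```python
MAX_SCALAR_LENGTH = 1024  # the configurable cut-off suggested above


def build_vault_index(vault_paths):
    """Map lowercase path, file name and note name to the vault path."""
    index = {}
    for path in vault_paths:
        name = path.rsplit("/", 1)[-1]
        stem = name[:-3] if name.endswith(".md") else name
        for key in (path, name, stem):
            # The first file registered under a key wins (assumed policy).
            index.setdefault(key.casefold(), path)
    return index


def find_implicit_links(scalars, index, max_length=MAX_SCALAR_LENGTH):
    """Return (scalar, path) pairs, skipping non-strings and long scalars."""
    links = []
    for scalar in scalars:
        if not isinstance(scalar, str) or len(scalar) > max_length:
            continue
        target = index.get(scalar.strip().casefold())
        if target is not None:
            links.append((scalar, target))
    return links


# Built once and kept in memory; each lookup is then a single dict access.
index = build_vault_index(["notes/eigenverlag.md", "supportfiles/style.css"])
print(find_implicit_links(["Eigenverlag", "x" * 5000, "scrbook"], index))
```

The long scalar is never even looked up, so something like a multi-paragraph abstract in the front matter would cost next to nothing.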

Tabulo[n]

Hi @pmbauer,

OK, yes, I see your point : The suggestion (2) would indeed eliminate the need for non-valid yaml… and hence also the need for another block (1) of non-valid yaml which would otherwise be needed just for achieving links (if those were implemented with a special syntax, like the OP).

On the other hand, there may be other reasons (than links) for wanting the ability to have a second block (or more).

For example:

  • The usual front-matter typically contains metadata (i.e. data about the current document/note).

    In some cases, however, the note itself is a mere entry that is intended to represent a real world entity or concept (movie, song, plant, animal, person, …), just like “spinach”, as in the plant PKB example above.

    There, I can see why it would be desirable to have the ability (but not the obligation) to have subsequent YAML block(s) following the usual front-matter, where one could place data about the real world entity represented by the entry, in a way that is at least visually distinguishable from metadata (about the note/document).

Hi @Moonbase59,

I hear your concerns… And on a very high (abstract) level, I tend to agree with most of them, or at least share the sentiment.

However, my apologies for failing to understand how those map concretely to the suggestion at hand…

For example, in what concrete way(s) do you see the suggested behavior :

    1. going against the simple/generic PKM scenario ?
    2. just another “bell & whistle”? (or bloat?)
    3. breaking your current workflow ?

In fact, one of the main constraints I had in mind when jotting that down was to avoid breaking anything at all: either YAML (which doesn’t have any native link syntax), or Markdown, or Obsidian, or any current (and future) workflow that relies on existing behavior, including those like the example you have described.

Also, you do probably realize that a lot of the metadata in your own example could well be considered as “unplanned-for” usage of the YAML front-matter at least by some, right ?

Obsidian itself currently “understands” just a few of those (title, keywords, …). So, strictly speaking, we could say that none of the others were really “planned-for” by Obsidian.

Luckily, Obsidian is quite flexible (which probably was planned-for), so it doesn’t meddle with anything it doesn’t understand, leaving those alone instead…

Hence, you are able to leverage that generic flexibility to fit your own workflow (for me, that’s all fine).

Likewise, some of the metadata in your example (e.g.: author, editor, publisher, documentclass) and even some of the preprocessor directives (e.g. stylesheet, css, highlighting-css) could well benefit from the suggested behavior (if you wished so, of course).

I went ahead and jotted down what I consider to be a reasonable implementation of the suggested behavior (of treating all YAML scalars as potential link targets) as a response to my initial post.

You may want to take a look.

In a nutshell, this proposal, if implemented, would imply Obsidian displaying Eigenverlag as a link if (and only if) there is a corresponding note or file for it (like eigenverlag.md) in your vault. Same goes for supportfiles/style.css.

Any scalar that doesn’t have a corresponding note would look and behave the same way as it does today…

Noisy? Complicated?

Well… In the end, I have a feeling this may actually boil down to what one considers a simple PKM to be, or what it ought to be…

Tabulo[n]

I like YAML a lot, I just don’t like it all the time :grinning: Many of my projects need a little, some projects would benefit from a lot of systematic metadata. Some projects don’t need any metadata.

Simplicity and flexibility are key, even just for my varied uses of Obsidian.

Maybe I am missing something, but couldn’t YAML be implemented as a plug-in? Then people could choose if (and hopefully how) to use YAML to best support their projects?

1 Like

YAML front matter is a de facto metadata standard for markdown used by editors and static site generators. Obsidian uses it for tags and aliases as a core feature. So no, I don’t think front matter support as a plug-in makes sense.

2 Likes

If tags and aliases do all you need, that’s great. That’s most of my uses too. I can think however of uses, e.g. a curated exhibition inventory used for internal and external purposes where additional YAML attributes would be most helpful.

The system is already flexible in that regard

  1. You don’t have to use the YAML block if you don’t want to.
  2. You can use the few YAML attributes handled by Obsidian
  3. Third party plugins can be implemented to handle more attributes for specific workflows.

The point of this discussion is whether Obsidian’s “Linked Mention” system (or even unlinked mention detection) should work on the YAML block.

1 Like

It would be nice to have a UI option for just that in Settings. Right now, unless I’m mistaken OR doing something wrong, ‘unlinked mention detection’ lists way too many suggestions, even when trying to limit the search parameters. My thought is that limiting it to YAML aliases (and/or tags) would be more advantageous, unless I’m misunderstanding the issues.