A brutalist approach to knowledge management in Obsidian

a_bahez · May 31, 2023, 11:14pm

I apologize if I sound like a party pooper, but I suppose my opinion in this case is obvious. I will discuss some points that I summarize below:

More context on Brander’s proposal
The problem with ontologies and markup languages
The why before the how
Obsidian is not the best tool for semantic ontology
But if I had to force it in that direction, what would I do?

Context on Gordon Brander’s ideas

The text on links that I reference from Brander was written as part of the author's exploration toward building Subconscious, an alternative to Obsidian. When reading it, it is important to remember that every proposal in which a link replaces something like a tag or a folder needs a technical implementation that processes those links. A link cannot replace a tag without a backlinks view that lists all the references pointing to it. Similarly, a semantic triple would require an implementation that processes the link text and the link address in a fundamentally different way from the original Markdown specification. In another text, Brander argues against a syntax that mixes the link address with the displayed text and promotes a more radical separation between an editing mode (source view) and a preview mode. According to him, [[wikilinks]] should be replaced by /slashlinks (that is, references that contain only the address and not the prose text).

Ontologies and Markup Languages

With that context in mind, what exactly are we trying to accomplish? Let’s remember that Markdown is a lightweight way to write HTML (and that HTML is a close relative of XML: both derive from SGML, and XHTML is HTML reformulated as XML). In HTML, links, like other elements, can carry attributes; in fact, the link address is itself an attribute of the <a> element, the href, as in <a href="link-address">Prose Text</a>. In Semantic Web specifications such as OWL, ontologies are expressed through attributes, properties, and values built on RDF. Markdown has no such purpose or specification. Putting all that aside, the history of the Semantic Web itself gives us an idea of how complex its general implementation is and of the problems it generates, to the extent that it has not managed to establish itself as a major technology when compared with other paradigms such as blockchain or natural language processing. It does, however, have very specific specialized uses.

Why Before How

Formal ontology is a fascinating topic. But before we continue, I have to ask: why would someone want to implement it in Obsidian? What ultimate purpose are they trying to achieve with it? The answer to that question determines how to implement it. Is it just curiosity to push Obsidian to its limits in this respect? Great, let’s keep experimenting! Many people dedicate hours to making Obsidian look and behave like Notion or Logseq instead of using Notion or Logseq directly. Part of the charm is the flexibility. However, if you’re looking to define your own personal ontology, I wouldn’t bet on doing it with Obsidian.

Obsidian is not the best tool for building a formal ontology

Simply because that is not its purpose. An application designed for note-taking where each node is treated as an object (with a storage format that is a database rather than local files) could achieve that goal more efficiently and straightforwardly. In fact, there are a couple of applications that do just that: https://anytype.io/ does an incredible job with a graph centered around object nodes and formally specified relationships, and https://capacities.io/ also has a specifically object-oriented approach.

What I would do if I had to do it in Obsidian

First, I would ask myself: what exactly do I want to achieve? If I wanted to formally model a knowledge domain, I would definitely use a Python notebook first (again, separation of concerns) to properly prototype the model. Then, I would define a Markdown-compatible specification to implement it. For example, I might explore the attribute syntax that some Markdown extensions provide. However, the brutalist way to do it would be to use plain natural language in the style of an informal ontology. Let me explain.

An important part of an ontology is a controlled vocabulary, which defines the relationships. Based on this simple idea, I would define a specification that lets me recognize when I am describing an ontological predicate between two nodes (notes). In my case, this would be whenever links to two or more notes appear in the same sentence, because specifying the relationship in prose is a principle of my personal approach. Speculating on this specification, I would define something like:

A sentence is a string of characters delimited on each side by a period followed by a space, or by a line break (the first sentence simply starts at the beginning of the note).
A semantic statement is a sentence in which two or more [[links]] appear (example: [[a]] is a subset of [[b]]).
A predicate is all the text that appears between the links in a semantic statement, and it begins with a verb.
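
To make the three rules above concrete, here is a minimal Python sketch of how they could be applied to the text of a note. The function name `extract_statements` and both regular expressions are my own illustrative assumptions, not part of Obsidian or of any plugin. The sketch does not check that the predicate starts with a verb (that would need a language-processing step), and when a sentence has more than two links it simply pairs each link with the next one.

```python
import re

# Assumption: a sentence ends at a period followed by whitespace, or at a line break.
SENTENCE_SPLIT = re.compile(r"\.\s+|\n+")
# Matches [[target]] or [[target|alias]] and captures only the target.
WIKILINK = re.compile(r"\[\[([^\]|]+)(?:\|[^\]]*)?\]\]")


def extract_statements(note_text):
    """Return (subject, predicate, object) tuples for adjacent links in a sentence."""
    triples = []
    for sentence in SENTENCE_SPLIT.split(note_text):
        links = list(WIKILINK.finditer(sentence))
        if len(links) < 2:
            continue  # fewer than two links: not a semantic statement
        # Pair each link with the one that follows it in the same sentence.
        for left, right in zip(links, links[1:]):
            predicate = sentence[left.end():right.start()].strip()
            if predicate:
                triples.append(
                    (left.group(1).strip(), predicate, right.group(1).strip())
                )
    return triples


print(extract_statements("[[a]] is a subset of [[b]]. Unrelated text."))
# [('a', 'is a subset of', 'b')]
```

Splitting on a period plus whitespace is deliberately crude: abbreviations such as "e.g. " would cut a sentence short, so a real vault would probably need a stricter rule.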

In this way, if I documented the expression “is a subset of” and always wrote it the same way, I could extract every use of that predicate with a general search and some regex (I wouldn’t even need Dataview). But again, it all depends on my ultimate purpose. If I wanted to see in a single document all the predicates I have declared in my vault (and I’m not sure what utility that would have, especially if they are factual predicates like “A was born in year N”), I would combine the specification with inline fields. However, I am convinced that I would achieve better and more elegant results if I modeled it as a database and used a programming language.
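
As a rough illustration of the "general search and some regex" idea, the sketch below scans a folder of Markdown files for one controlled predicate written between two links. The function name `find_predicate` and the vault path in the usage line are hypothetical. It assumes the predicate is always typed exactly the same way, on a single line, and directly between the two links.

```python
import re
from pathlib import Path

# Matches [[target]] or [[target|alias]] and captures only the target.
LINK = r"\[\[([^\]|]+)(?:\|[^\]]*)?\]\]"


def find_predicate(vault_dir, predicate):
    """Return (file name, subject, object) for every use of one controlled predicate."""
    # re.escape keeps any punctuation in the predicate from acting as regex syntax.
    pattern = re.compile(LINK + r"[ \t]+" + re.escape(predicate) + r"[ \t]+" + LINK)
    hits = []
    for path in sorted(Path(vault_dir).rglob("*.md")):
        text = path.read_text(encoding="utf-8")
        for match in pattern.finditer(text):
            hits.append((path.name, match.group(1).strip(), match.group(2).strip()))
    return hits


# Hypothetical usage; "my-vault" stands in for the real vault folder:
# for name, subject, obj in find_predicate("my-vault", "is a subset of"):
#     print(f"{name}: {subject} -> {obj}")
```

The same pattern could be pasted into a regex-capable search box with the predicate filled in by hand. The script only adds the convenience of collecting the results in one place.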