Automatically fetch and import metadata using Open Graph Protocol -> YAML Metadata

DandyLyons · January 9, 2021, 6:54pm

Have you seen how you can paste a URL into Facebook, Twitter or iMessage, and it will automatically find the page name, author, and photo? They do this by reading the Open Graph Protocol tags. Most websites today already follow this protocol and embed this information; all we need to do is read it.
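To make "reading the tags" concrete, here is a minimal sketch in Python using only the standard library. It assumes the page's HTML is already available as a string and that the tags appear as `<meta property="og:..." content="...">` elements. The class name `OpenGraphParser`, the helper `extract_open_graph`, and the sample HTML are made up for illustration; this is not how Obsidian or any of the services above actually do it.

```python
from html.parser import HTMLParser


class OpenGraphParser(HTMLParser):
    """Collect <meta property="..." content="..."> pairs for og:* and article:* tags."""

    PREFIXES = ("og:", "article:")

    def __init__(self):
        super().__init__()
        self.tags = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        prop = attributes.get("property") or ""
        content = attributes.get("content")
        if prop.startswith(self.PREFIXES) and content is not None:
            # A property such as article:tag may repeat, so every value is kept in a list.
            self.tags.setdefault(prop, []).append(content)


def extract_open_graph(html):
    """Return a dict mapping each Open Graph property to the list of its values."""
    parser = OpenGraphParser()
    parser.feed(html)
    parser.close()
    return parser.tags


# Hypothetical page head, loosely based on the example scenario in this post.
SAMPLE_HTML = """
<html><head>
<meta property="og:title" content="Article Title">
<meta property="og:type" content="article">
<meta property="og:url" content="https://www.someBlog.com/someArticle">
<meta property="article:author" content="some author">
<meta property="article:tag" content="articleTag1">
<meta property="article:tag" content="articleTag2">
<meta name="viewport" content="width=device-width">
</head><body></body></html>
"""

if __name__ == "__main__":
    for key, values in extract_open_graph(SAMPLE_HTML).items():
        print(key, values)
```

Values are stored as lists because a page can legitimately repeat a property; a real importer would probably collapse single-item lists before writing YAML.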

See here:

And Obsidian just added support for metadata using YAML. Obsidian can store aliases and tags in YAML metadata but there is potential for so much more. Here are some great discussions of YAML on Obsidian:

Proposal

Allow Obsidian to support the YAML field url:. When a user pastes a URL into the url: YAML field, Obsidian fetches the Open Graph metadata from that URL and shows the user any metadata it found.
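The fetch step could look roughly like the Python sketch below. It is only an illustration under assumptions: `normalize_url` and `fetch_html` are hypothetical names, the `https://` default for scheme-less URLs is a guess at what a user would want, and a real Obsidian plugin would be written against Obsidian's own plugin API rather than Python.

```python
import urllib.request


def normalize_url(raw):
    """Add a scheme when the user pasted something like www.someBlog.com/someArticle."""
    raw = raw.strip()
    if "://" not in raw:
        # Assumption: default to HTTPS when no scheme was given.
        raw = "https://" + raw
    return raw


def fetch_html(url, opener=urllib.request.urlopen, timeout=10):
    """Download the page and decode it using the charset from the response headers."""
    with opener(normalize_url(url), timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


if __name__ == "__main__":
    # No network call is made here; this only shows the normalization step.
    print(normalize_url("www.someBlog.com/someArticle"))
```

The `opener` parameter is there so the download can be swapped out (for caching or testing) without touching the rest of the code.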

If an Open Graph field already matches a YAML field that Obsidian or an Obsidian plug-in already understands, then Obsidian just automatically pastes that data into the YAML field.

If Obsidian finds fields that it does not yet recognize or support, then it will show a dialog box with all the extra metadata it found. The user can click any of these Open Graph tags, type in the name of a YAML field, and Obsidian will add the tag's value under that field.
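The recognized/unrecognized split described above can be sketched as follows. The `KNOWN_FIELDS` mapping is purely an assumption for illustration (Obsidian does not define such a table), and `split_metadata` and `assign_field` are hypothetical helper names.

```python
# Hypothetical mapping from Open Graph properties to YAML field names.
KNOWN_FIELDS = {
    "og:url": "url",
    "og:type": "type",
    "og:title": "aliases",
    "article:author": "author",
    "article:published_time": "date",
    "article:tag": "tags",
}


def split_metadata(og_tags, known_fields=KNOWN_FIELDS):
    """Separate fetched tags into YAML-ready fields and leftovers for the dialog box."""
    recognized = {}
    unrecognized = {}
    for prop, value in og_tags.items():
        field_name = known_fields.get(prop)
        if field_name is None:
            unrecognized[prop] = value
        else:
            recognized[field_name] = value
    return recognized, unrecognized


def assign_field(recognized, unrecognized, prop, field_name):
    """Move one leftover tag into the YAML fields under a name typed by the user."""
    recognized[field_name] = unrecognized.pop(prop)


if __name__ == "__main__":
    found = {"og:title": "Article Title", "og:type": "article", "og:site_name": "Some Blog"}
    recognized, unrecognized = split_metadata(found)
    assign_field(recognized, unrecognized, "og:site_name", "site")
    print(recognized, unrecognized)
```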

Example scenario:

I’m creating an atomic Literature note for a blog post that I read. I create a new note and add my YAML header. I add the YAML field url: and paste in www.someBlog.com/someArticle. Behind the scenes, Obsidian fetches the Open Graph metadata from www.someBlog.com/someArticle. The metadata may look something like this:

article:published_time - datetime
article:author
og:title
og:type
og:url

Let’s say that in the future Obsidian supports more YAML fields. In this scenario, Obsidian reads that Open Graph metadata and automatically adds the following YAML metadata:

---
url: www.someBlog.com/someArticle
type: article
author: some author
date: published_time - datetime
aliases: Article Title
tags: [articleTag1, articleTag2]
---
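Writing a header like the one above from a dictionary of fields could be sketched like this. To stay within the Python standard library, the sketch emits JSON-style double-quoted strings, which YAML 1.2 parsers should accept; the function name `to_front_matter` is hypothetical, and keys are assumed to be plain words that need no quoting.

```python
import json


def to_front_matter(fields):
    """Render a dict as a YAML front matter block with quoted scalars and flow lists."""
    lines = ["---"]
    for key, value in fields.items():
        if isinstance(value, (list, tuple)):
            # Lists become YAML flow sequences, e.g. tags: ["a", "b"].
            rendered = "[" + ", ".join(json.dumps(str(item)) for item in value) + "]"
        else:
            rendered = json.dumps(str(value))
        lines.append(f"{key}: {rendered}")
    lines.append("---")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(to_front_matter({
        "url": "www.someBlog.com/someArticle",
        "type": "article",
        "author": "some author",
        "aliases": "Article Title",
        "tags": ["articleTag1", "articleTag2"],
    }))
```

Quoting every value is more verbose than the hand-written header, but it avoids surprises when a title contains a colon or a leading `#`.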

Imagine how much faster and more powerful Obsidian could be if we just took advantage of the metadata that is already given to us.

Use Cases

This could greatly reduce “managerial work”, particularly for people who take Atomic notes and Literature notes.

cristian · January 16, 2021, 12:02pm

I’m trying to understand your proposal.

Would this use case be correct?

The user:

Is browsing a page and finds something interesting.
Decides to take a note, for example, a literature note.
Pastes the link into a note, and the Open Graph metadata is extracted and pasted:

---
url: www.someBlog.com/someArticle
type: article
author: some author
date: published_time - datetime
aliases: Article Title
tags: [articleTag1, articleTag2]
---

The user writes maintaining some context.
In the future, plugins and applications can make use of this metadata.

I like the idea. It can be implemented as a plugin that extracts metadata not only from the Open Graph Protocol but also from Dublin Core, schema.org, etc.
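As a rough illustration of covering several vocabularies in one pass, here is a Python sketch. It assumes Dublin Core appears as `<meta name="DC.*">` or `<meta name="DCTERMS.*">` elements and schema.org data as JSON-LD inside `<script type="application/ld+json">`; pages can embed both in other ways (Microdata, RDFa) that this sketch ignores. `MultiVocabParser` and the sample HTML are hypothetical.

```python
import json
from html.parser import HTMLParser


class MultiVocabParser(HTMLParser):
    """Collect Open Graph, Dublin Core and JSON-LD metadata from one HTML document."""

    def __init__(self):
        super().__init__()
        self.open_graph = {}
        self.dublin_core = {}
        self.json_ld = []
        self._in_json_ld = False
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "meta":
            content = attributes.get("content")
            if content is None:
                return
            prop = attributes.get("property") or ""
            name = attributes.get("name") or ""
            if prop.startswith("og:"):
                self.open_graph[prop] = content
            elif name.lower().startswith(("dc.", "dcterms.")):
                self.dublin_core[name] = content
        elif tag == "script" and (attributes.get("type") or "").lower() == "application/ld+json":
            self._in_json_ld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_json_ld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_json_ld:
            self._in_json_ld = False
            try:
                self.json_ld.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # Malformed JSON-LD blocks are skipped rather than aborting the import.


# Hypothetical page fragment mixing the three vocabularies.
SAMPLE_HTML = """
<head>
<meta property="og:title" content="Article Title">
<meta name="DC.creator" content="some author">
<script type="application/ld+json">{"@type": "Article", "headline": "Article Title"}</script>
</head>
"""

if __name__ == "__main__":
    parser = MultiVocabParser()
    parser.feed(SAMPLE_HTML)
    parser.close()
    print(parser.open_graph, parser.dublin_core, parser.json_ld)
```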

DandyLyons · January 16, 2021, 5:15pm

Yup that’s pretty much my proposal. It’s pretty high level and I don’t know the low level implementation details but that’s the basic concept.

As I understand it, this must be what most other services are doing behind the scenes.

cristian · January 16, 2021, 5:40pm

Hi @DandyLyons, I think this is simple to implement if we reuse existing libraries/services.

For example, there is one called ‘Data sniffer’ which is a plugin for the browser that extracts not only Open Graph Protocol but all kinds of semantic markup formats/syntaxes (Microdata, RDFa, etc).

How do you imagine we can take advantage of such metadata?

DandyLyons · January 16, 2021, 11:17pm

In a perfect world, I would say that the best approach is for Obsidian to simply show the user all of the metadata tags it found, and then the user can decide which tags they want to keep, edit, or throw away. (If the process is completely automatic, then we’ll import junk data that we’ll have to remove manually. More busywork.)

Going above and beyond, this is what I think it should look like. (If I had half of these features, I’d be overjoyed):

Obsidian shows a UI with 2 columns.
- Left column is metadata that the user is keeping in this note.
- Right column is metadata that Obsidian found automatically. (Anything in this column will not be kept.)
User can drag metadata tags back and forth between each column
Any tag can be edited changing either the tag name or content
Optionally: If there are certain tags that the user frequently saves, then Obsidian will automatically prepopulate these tags on the left side
After the user is done reviewing the tags, they can click “Confirm” and Obsidian will add all of these changes to the YAML header.
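The state behind that two-column UI could be modeled roughly as below. This Python sketch only covers the data side (no drag and drop, no rendering), and every name in it (`MetadataReview`, `keep`, `discard`, `edit`, `confirm`, `favorites`) is hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class MetadataReview:
    kept: dict = field(default_factory=dict)   # left column: will be written to YAML
    found: dict = field(default_factory=dict)  # right column: will be thrown away

    @classmethod
    def from_found(cls, found, favorites=()):
        """Start with everything on the right, then prepopulate frequently saved tags."""
        review = cls(found=dict(found))
        for name in favorites:
            if name in review.found:
                review.keep(name)
        return review

    def keep(self, name):
        """Move a tag from the right column to the left column."""
        self.kept[name] = self.found.pop(name)

    def discard(self, name):
        """Move a tag from the left column back to the right column."""
        self.found[name] = self.kept.pop(name)

    def edit(self, name, new_name=None, new_value=None):
        """Rename a tag and/or replace its content, in whichever column it sits."""
        column = self.kept if name in self.kept else self.found
        value = column.pop(name)
        column[new_name or name] = value if new_value is None else new_value

    def confirm(self):
        """Return the fields that should be added to the YAML header."""
        return dict(self.kept)


if __name__ == "__main__":
    review = MetadataReview.from_found(
        {"og:title": "Article Title", "og:type": "article", "og:locale": "en_US"},
        favorites=["og:type"],
    )
    review.keep("og:title")
    review.edit("og:title", new_name="aliases")
    print(review.confirm())
```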

DandyLyons · January 16, 2021, 11:37pm

How do you imagine we can take advantage of such metadata?

As far as how we can use graph metadata in our PKM, the possibilities are really endless.

But the first kinds of things that spring to my mind are:

Using named links as essentially extra powerful tags. For example, imagine being able to search for author:Charles Dickens instead of just #author
All of this metadata could empower users to basically create their own SQL database which can be queried
AI empowered classification and relationship discovery: Imagine if someone creates a plugin that can recreate some of the AI magic found in DEVONThink by automatically suggesting metadata links, and by suggesting related notes
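To give a feel for the "query your notes like a database" idea, here is a small Python sketch of an author:Charles Dickens style lookup. It assumes notes are available as a dict of file name to text and that the YAML header only contains simple `key: value` lines; a real implementation would use a proper YAML parser. `parse_front_matter`, `query`, and the sample notes are hypothetical.

```python
def parse_front_matter(text):
    """Naively read simple key: value lines from a leading --- ... --- block."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break
        # Split on the first colon only, so values such as URLs stay intact.
        key, separator, value = line.partition(":")
        if separator:
            fields[key.strip()] = value.strip()
    return fields


def query(notes, field_name, value):
    """Return the names of notes whose header has field_name equal to value."""
    return [
        name
        for name, text in notes.items()
        if parse_front_matter(text).get(field_name) == value
    ]


if __name__ == "__main__":
    notes = {
        "A Tale of Two Cities.md": "---\nauthor: Charles Dickens\ntype: book\n---\nNotes...",
        "Some Article.md": "---\nauthor: some author\ntype: article\n---\nNotes...",
        "Scratch.md": "No front matter here.",
    }
    print(query(notes, "author", "Charles Dickens"))
```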

NotesFTW · February 23, 2021, 11:04pm

This would be amazing!

Moonbase59 · March 7, 2021, 10:59am

I use @death.au ’s MarkDownload add-on in my browser which already does some of the above.

Don’t know about Safari, but it’s available for Firefox and Chrome. (Thank you for this!)

Here is what it creates from this post:

---
created: 2021-03-07T12:05:13 (UTC +01:00)
tags: []
source: https://forum.obsidian.md/t/automatically-fetch-and-import-metadata-using-open-graph-protocol-yaml-metadata/11211/4
author: 
---

# Automatically fetch and import metadata using Open Graph Protocol -> YAML Metadata - Feature requests - Obsidian Forum

> ## Excerpt
> A place for Obsidian users to discuss Obsidian and knowledge management

---
[Moonbase59](https://forum.obsidian.md/u/Moonbase59)

[5m](https://forum.obsidian.md/t/automatically-fetch-and-import-metadata-using-open-graph-protocol-yaml-metadata/11211/8?u=moonbase59)

I use [@death.au](https://forum.obsidian.md/u/death.au) ’s _MarkDownload_ add-on in my browser which already does some of the above.

Don’t know about Safari, but it’s available for [Firefox](https://addons.mozilla.org/de/firefox/addon/markdownload/) and [Chrome](https://chrome.google.com/webstore/detail/markdownload-markdown-web/pcmpcfapbekmbjjkdalcgopdkipoggdi?hl=en-GB). (Thank you for this!)

gobbletown · April 3, 2022, 7:10pm

Re-humidifying this…

Did you ever achieve anything like this essential and still-wholly-missing (it seems) function within Obsidian?

DandyLyons · April 6, 2022, 6:36pm

Sorry I haven’t found a way to do this yet.