Automatically fetch and import metadata using Open Graph Protocol -> YAML Metadata

Have you seen how you can paste a URL into Facebook, Twitter or iMessage, and it will automatically find the page name, author, and photo? They do this by reading the Open Graph Protocol tags. Most websites today already follow this protocol and embed this information; all we need to do is read it.

See here:

And Obsidian just added support for metadata using YAML. Obsidian can store aliases and tags in YAML metadata but there is potential for so much more. Here are some great discussions of YAML on Obsidian:

Proposal

Allow Obsidian to support the YAML field url:. When a user pastes a URL into the url: YAML field, Obsidian fetches the Open Graph metadata from that URL and shows the user any metadata it found.

If an Open Graph field already matches a YAML field that Obsidian or an Obsidian plug-in already understands, then Obsidian just automatically pastes that data into the YAML field.

If Obsidian finds fields that it does not yet recognize or support, then it will show a dialog box with all the extra metadata it found. The user can click any of these open graph tags, type in the name of a YAML field and Obsidian will add it to the field.
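The split between fields that get filled in automatically and fields that go to the dialog box could look roughly like the sketch below. The `KNOWN_FIELDS` mapping, the `split_tags` helper, and the sample values are all made up for illustration; nothing here is an existing Obsidian API.

```python
# Hypothetical mapping from Open Graph properties to YAML fields that
# Obsidian (or a plug-in) is assumed to understand already.
KNOWN_FIELDS = {
    "og:title": "aliases",
    "og:type": "type",
    "article:author": "author",
    "article:published_time": "date",
}


def split_tags(og_tags, known=KNOWN_FIELDS):
    """Return (auto_filled, needs_review) for a dict of Open Graph tags."""
    auto_filled = {}    # YAML field -> value, written without asking
    needs_review = {}   # Open Graph property -> value, shown in the dialog
    for prop, value in og_tags.items():
        if prop in known:
            auto_filled[known[prop]] = value
        else:
            needs_review[prop] = value
    return auto_filled, needs_review


# Made-up metadata for a made-up page.
found = {
    "og:title": "Article Title",
    "og:type": "article",
    "og:site_name": "Some Blog",
    "og:image": "https://www.someBlog.com/cover.png",
}

auto_filled, needs_review = split_tags(found)
print(auto_filled)
print(needs_review)
```

Anything that ends up in `needs_review` is what the user would click on and name by hand.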

Example scenario:

I’m creating an atomic Literature note for a blog post that I read. I create a new note and add my YAML header. I add the YAML field url: and paste in www.someBlog.com/someArticle. Behind the scenes, Obsidian fetches the Open Graph metadata from www.someBlog.com/someArticle. The metadata may look something like this:

article:published_time - datetime
article:author
og:title
og:type
og:url
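Reading those tags out of a page is mostly a matter of collecting the `<meta property="..." content="...">` elements. Here is a minimal sketch using only Python's standard library. The sample HTML is invented, the class and function names are hypothetical, and a real implementation would first have to download the page, which is left out here.

```python
from html.parser import HTMLParser


class OpenGraphParser(HTMLParser):
    """Collect <meta property=... content=...> pairs for og:/article: tags."""

    PREFIXES = ("og:", "article:")

    def __init__(self):
        super().__init__()
        self.tags = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        prop = attr.get("property") or ""
        content = attr.get("content")
        if prop.startswith(self.PREFIXES) and content is not None:
            # Some properties (for example article:tag) can repeat,
            # so every property keeps a list of values.
            self.tags.setdefault(prop, []).append(content)


def extract_open_graph(html):
    """Return a dict of Open Graph property -> list of content values."""
    parser = OpenGraphParser()
    parser.feed(html)
    return parser.tags


# Invented page source, standing in for the fetched article.
SAMPLE_HTML = """
<html><head>
<meta property="og:title" content="Article Title">
<meta property="og:type" content="article">
<meta property="og:url" content="https://www.someBlog.com/someArticle">
<meta property="article:author" content="some author">
<meta property="article:published_time" content="2021-03-07T12:05:13+01:00">
<meta name="viewport" content="width=device-width">
</head><body></body></html>
"""

og_tags = extract_open_graph(SAMPLE_HTML)
for prop, values in og_tags.items():
    print(prop, values)
```

The `viewport` tag is skipped because it uses `name=` rather than an `og:` or `article:` property.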

Let’s say in the future, Obsidian already has support for more YAML fields. In this scenario, Obsidian reads that Open Graph metadata and auto adds the following YAML metadata:

---
url: www.someBlog.com/someArticle
type: article
author: some author
date: published_time - datetime
aliases: Article Title
tags: [articleTag1, articleTag2]
---
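Turning fetched tags into a header like the one above could be sketched as follows. The `FIELD_MAP` is an assumption based on this example, not something Obsidian defines, and the sample values are invented. Values are written with `json.dumps` because YAML 1.2 is designed as a superset of JSON, so quoted strings and `[...]` lists stay valid YAML even when a title contains a colon.

```python
import json

# Assumed mapping from Open Graph properties to the YAML fields used
# in the example header; the dict order decides the output order.
FIELD_MAP = {
    "og:url": "url",
    "og:type": "type",
    "article:author": "author",
    "article:published_time": "date",
    "og:title": "aliases",
    "article:tag": "tags",
}


def to_front_matter(og_tags):
    """Render the mapped Open Graph tags as a YAML front matter block."""
    lines = ["---"]
    for prop, field in FIELD_MAP.items():
        if prop not in og_tags:
            continue
        # JSON strings and arrays double as YAML flow scalars and sequences.
        lines.append(f"{field}: {json.dumps(og_tags[prop], ensure_ascii=False)}")
    lines.append("---")
    return "\n".join(lines)


# Invented metadata; og:image has no mapping, so it is left out.
og_tags = {
    "og:url": "https://www.someBlog.com/someArticle",
    "og:type": "article",
    "article:author": "some author",
    "article:published_time": "2021-03-07T12:05:13+01:00",
    "og:title": "Article Title",
    "article:tag": ["articleTag1", "articleTag2"],
    "og:image": "https://www.someBlog.com/cover.png",
}

front_matter = to_front_matter(og_tags)
print(front_matter)
```

Quoting every value is a deliberate simplification; a real plug-in would more likely hand this to a proper YAML library.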

Imagine how much faster and more powerful Obsidian could be if we just took advantage of the metadata that is already given to us.

Use Cases

This could greatly reduce “managerial work”, particularly for people who take atomic notes and literature notes.

15 Likes

I’m trying to understand your proposal.

Would this use case be correct?

The user:

  1. Is browsing a page and finds something interesting.
  2. Decides to take a note, for example, a literature note.
  3. Pastes the link into a note, and the Open Graph metadata is extracted and pasted in:
---
url: www.someBlog.com/someArticle
type: article
author: some author
date: published_time - datetime
aliases: Article Title
tags: [articleTag1, articleTag2]
---
  4. The user writes the note, maintaining some context.
  5. In the future, plugins and applications can make use of this metadata.

I like the idea. It could be implemented as a plugin that extracts not only Open Graph Protocol metadata but also Dublin Core, schema.org, etc.

3 Likes

Yup, that’s pretty much my proposal. It’s pretty high level and I don’t know the low-level implementation details, but that’s the basic concept.

In my understanding, this must be what most other services are doing behind the scenes.

Hi @DandyLyons, I think this is simple to implement if we reuse existing libraries/services.

For example, there is one called ‘Data sniffer’, a browser plugin that extracts not only Open Graph Protocol but all kinds of semantic markup formats/syntaxes (Microdata, RDFa, etc).

How do you imagine we can take advantage of such metadata?

1 Like

In a perfect world, I would say that the best approach is for Obsidian to simply show all of the metadata tags it finds to the user, and then the user can decide which tags they want to keep, edit, or throw away. (If the process is completely automatic, then we’ll import junk data that we’ll have to remove manually. More busywork.)

Going above and beyond, this is what I think it should look like. (If I had half of these features, I’d be overjoyed):

  • Obsidian shows a UI with 2 columns.
    • Left column is metadata that the user is keeping in this note.
    • Right column is metadata that Obsidian found automatically. (Anything in this column will not be kept.)
  • User can drag metadata tags back and forth between each column
  • Any tag can be edited changing either the tag name or content
  • Optionally: If there are certain tags that the user frequently saves, then Obsidian will automatically prepopulate these tags on the left side
  • After the user is done reviewing the tags, they can click “Confirm” and Obsidian will add all of these changes to the YAML header.

How do you imagine we can take advantage of such metadata?

As far as how we can use graph metadata in our PKM, the possibilities are really endless.

But the first kinds of things that spring to my mind are:

  • using named links as essentially extra powerful tags. For example, imagine being able to search for author:Charles Dickens instead of just #author
  • All of this metadata could empower users to basically create their own SQL database which can be queried
  • AI empowered classification and relationship discovery: Imagine if someone creates a plugin that can recreate some of the AI magic found in DEVONThink by automatically suggesting metadata links, and by suggesting related notes
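To make the `author:Charles Dickens` idea a bit more concrete, here is a rough sketch of searching notes by a front matter field. The notes are invented and held in a plain dict, the function names are hypothetical, and the parser only handles flat `key: value` lines, so it is nowhere near a full YAML reader.

```python
def parse_front_matter(note_text):
    """Parse flat `key: value` lines between the leading --- markers.

    Deliberately simplified: no nested YAML, lists, or quoting.
    """
    lines = note_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def find_notes(notes, field, wanted):
    """Return the names of notes whose front matter field equals `wanted`."""
    return [
        name
        for name, text in notes.items()
        if parse_front_matter(text).get(field, "").lower() == wanted.lower()
    ]


# Invented vault contents: note name -> note text.
NOTES = {
    "Great Expectations": "---\nauthor: Charles Dickens\ntype: book\n---\nPip and the convict.",
    "Some Article": "---\nauthor: some author\ntype: article\n"
                    "url: https://www.someBlog.com/someArticle\n---\nNotes.",
    "Scratchpad": "No front matter here.",
}

matches = find_notes(NOTES, "author", "Charles Dickens")
print(matches)
```

The same helper answers `type:article` style queries, which is about as far as this goes toward the "queryable database" idea.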
2 Likes

This would be amazing!

I use @death.au’s MarkDownload add-on in my browser which already does some of the above.

Don’t know about Safari, but it’s available for Firefox and Chrome. (Thank you for this!)

Here is what it creates from this post:

---
created: 2021-03-07T12:05:13 (UTC +01:00)
tags: []
source: https://forum.obsidian.md/t/automatically-fetch-and-import-metadata-using-open-graph-protocol-yaml-metadata/11211/4
author: 
---

# Automatically fetch and import metadata using Open Graph Protocol -> YAML Metadata - Feature requests - Obsidian Forum

> ## Excerpt
> A place for Obsidian users to discuss Obsidian and knowledge management

---
[Moonbase59](https://forum.obsidian.md/u/Moonbase59)

[5m](https://forum.obsidian.md/t/automatically-fetch-and-import-metadata-using-open-graph-protocol-yaml-metadata/11211/8?u=moonbase59)

I use [@death.au](https://forum.obsidian.md/u/death.au) ’s _MarkDownload_ add-on in my browser which already does some of the above.

Don’t know about Safari, but it’s available for [Firefox](https://addons.mozilla.org/de/firefox/addon/markdownload/) and [Chrome](https://chrome.google.com/webstore/detail/markdownload-markdown-web/pcmpcfapbekmbjjkdalcgopdkipoggdi?hl=en-GB). (Thank you for this!)
2 Likes

Re-humidifying this…

Did you ever achieve anything like this essential and still-wholly-missing (it seems) function within Obsidian?

1 Like

Sorry, I haven’t found a way to do this yet.