Plaintext bookmarks?

robertandrews · July 31, 2022, 7:22am

How would you approach the idea of a “plaintext bookmark”? Meaning, a future-proof plaintext/Markdown file containing the title, URL and any other available details of a web page.

Is there any such convention for this already?

Rationale:

Dragged shortcut files may no longer be readable one day.
I am experimenting with capturing reference material to Notion and the Raindrop bookmarks manager. They’re great, but the longevity problem is greater.

Idea:

I think it would be possible for me to write a script/automation harnessing the API of either app to write the bookmark details out into plaintext files.

They could even include other fields - highlights, notes, excerpt and maybe even the original extracted text of the source article - since Raindrop and Notion can clip those. (Indeed, the source web page may no longer be available one day).

Other fields may include the bookmark collection and tags.

Uncertainty:

But how would we construct such a file in a meaningful way, one that would not only include the plaintext content but maybe also have some utility as metadata, some usability inside Obsidian or similar?

Should I be looking at YAML? CSV? Maybe I sort of like the idea of opening up an individual file…

Or…

Looked at another way (the present tense rather than the future), I suppose this is sort of about creating a literature/reference note since, if I add in those other fields - highlights, notes, excerpt, content, collection, tags, date - the file is something a bit more expressive.

So is this question a crossover with the idea of an Obsidian (web) clipper? It also makes me think a little bit of flat-file/plaintext CMSes.

patleeman · October 22, 2022, 12:27pm

I just came across this while thinking about this problem.

I’m thinking the most extensible way to do this would be the YAML frontmatter in a markdown file:

---
url: https://www.example.com
tags: ['tag', 'bookmark']
created: <timestamp>
---

<Either notes or markdown converted webpage>
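As a minimal sketch of how a script could produce a note in that shape: the function name `renderBookmarkNote` and the `title` field are assumptions added for illustration, not part of the suggestion above. It relies on JSON strings and arrays being accepted by YAML 1.2 parsers, so `JSON.stringify` handles the quoting.

```javascript
// Hypothetical helper: render one bookmark as a Markdown note with YAML frontmatter.
function renderBookmarkNote({ title, url, tags = [], created = new Date(), body = "" }) {
  const frontmatter = [
    "---",
    // JSON.stringify produces double-quoted strings that are also valid YAML scalars.
    "title: " + JSON.stringify(title),
    "url: " + JSON.stringify(url),
    // A JSON array is a valid YAML flow sequence.
    "tags: " + JSON.stringify(tags),
    "created: " + created.toISOString(),
    "---",
  ].join("\n");
  // The body holds either your own notes or the Markdown-converted page.
  return frontmatter + "\n\n" + body + "\n";
}

const note = renderBookmarkNote({
  title: "Example Domain",
  url: "https://www.example.com",
  tags: ["tag", "bookmark"],
  body: "My notes about this page.",
});
console.log(note);
```

Keeping every field in the frontmatter means the note body stays free for the captured text, and new fields can be added later without breaking older notes.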

Archie · October 22, 2022, 12:53pm

I have thought about this issue a lot. The way browsers originally handle bookmarks, storing only a URL and a name, is not good enough. Raindrop, OneNote and similar tools are not needed, and they are not local-first like Obsidian anyway.

For webpages I want to write a lot of notes about, I capture them with a Webclipper bookmarklet. I don’t consider these bookmarks so much as captures, since each one gets its own note with YAML front matter.

For other pages, where I don’t have a lot of notes, I store them as Tasks with metadata (I consider these bookmarks) so I can search for them later. The metadata I use right now is title, url and note, which leaves a lot of room for improvement. I use another JS bookmarklet to capture links into a temp note, and after a while I categorize them and move them into separate notes. I have copied the JS script here. I would like to see it improved, so any modifications or suggestions are welcome.

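javascript:{
/* Fill in these three values before saving the bookmarklet. */
const vault = "<Vault name here>";
const fileName = "<Note name here>";
/* folder and fileName are concatenated as-is, so end the folder with a slash. */
const folder = "<relative path to folder inside the vault here>";

/* Only add the vault parameter when a vault name is set. */
const vaultName = vault ? '&vault=' + encodeURIComponent(vault) : '';

/* Remove every colon and replace other characters that would break the link text. */
function getInvalidOut(aFileName) {
    return aFileName.replace(/:/g, '').replace(/[/\\?%*|"<>\[\]]/g, '-');
}

const titleClean = getInvalidOut(document.title);

/* prompt() returns null on Cancel, so fall back to an empty note. */
const note = window.prompt("Note", "") || "";

/* One task line with inline metadata fields: the link and the note. */
const fileContent = "\n- [ ] " + "((mdlink::[" + titleClean + "](" + document.URL + "))) - ((note::" + note + ")) ";

/* Append the line to the target note through the Obsidian URI scheme. */
document.location.href =
"obsidian://new?"
+ "append&silent"
+ "&file=" + encodeURIComponent(folder + fileName)
+ "&content=" + encodeURIComponent(fileContent)
+ vaultName;
}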

robertandrews · October 25, 2022, 3:23pm

Thanks.
I’m coming to your way of thinking, that such a capture should include part/all of the source material. This becomes a capture for posterity, not just a pointer.

So, what is the content of the pages captured using your bookmarklet/web clipper?

I’m used to clipping resources with Notion and like that this captures page body content - although PKM people will say that’s cheating.

I have been trying out Raindrop for a while now. It’s quite nice as a bookmark manager, it’s cross-platform, works well on mobile, supports highlights of multiple colours (annotations only at Pro, I think) and I think stores full body text content at Pro level.

There is an Obsidian plugin for Raindrop now. It can pull content from each of your Raindrop bookmarks and you can specify a custom format made up of the different fields, including your highlights. It currently lacks the ability to return full page text.

I haven’t fully adopted this workflow yet, but it seems like it would allow me to use a dedicated app for what it’s used for, retaining its own use and integrity outside of Obsidian, too.

robertandrews · October 25, 2022, 3:25pm

Thanks for the suggestion.
I think it would be possible for me to output such a thing into Raindrop bookmarks I can pull to Obsidian using this plugin…

fynn · October 26, 2022, 3:30pm

Not really much to add, just unordered thoughts, but I do find the topic interesting, so here goes:

If you look at a bookmark just as a reference to a resource, then the URL should do you fine. Maybe add the title, favicon and some metadata like the creation date, and you should be good.

But if you need to be sure you can access the bookmarked content later, capturing the page is important, because if you don’t, then you set yourself up for link rot, and your bookmark becomes worthless if the content goes down, or even just changes location.

If you go down that road, you might want to consider a few problems:

1. How/if to handle synchronisation. For something like a wiki page, depending on the topic, your bookmark might go stale relatively fast.
2. Externally linked resources. If you want to keep the experience, you also need to get linked stylesheets, fonts, JavaScript, images etc., and that sounds hard.
3. Parsing the content. If you are just interested in some content on the page, and not the “complete experience”, then how you extract it needs to be considered. But it seems you have that figured out by using the clipping features of Raindrop and Notion.

As for the technology to use: if you have a fixed set of fields that you are interested in, CSV would probably be the easiest choice. But if you want an extensible format that can stay backwards compatible, or you need to include structured data, then you should definitely go for YAML (or JSON, but that’s ugly from a human-readability standpoint). Intuitively, I would spring for YAML, but maybe that is just because the requirements are so loose right now.
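To make the CSV side of that trade-off concrete, here is a rough sketch, assuming a hypothetical fixed column set (`COLUMNS`) and hypothetical helper names. It quotes cells the usual CSV way (wrap in double quotes, double any embedded quotes), and shows that a structured field such as tags has to be flattened into a single cell.

```javascript
// Hypothetical fixed column set; adding a column later changes every row.
const COLUMNS = ["title", "url", "created", "tags"];

// Quote a cell only when it contains a comma, a double quote or a line break.
function csvEscape(value) {
  // Arrays (e.g. tags) are flattened into one cell, separated by semicolons.
  const text = Array.isArray(value) ? value.join(";") : String(value ?? "");
  return /[",\r\n]/.test(text) ? '"' + text.replace(/"/g, '""') + '"' : text;
}

// Serialize one bookmark object as a single CSV row in COLUMNS order.
function toCsvRow(bookmark) {
  return COLUMNS.map((column) => csvEscape(bookmark[column])).join(",");
}

console.log(COLUMNS.join(","));
console.log(toCsvRow({
  title: "Example Domain",
  url: "https://www.example.com",
  created: "2022-10-26",
  tags: ["tag", "bookmark"],
}));
```

This works well while the field list stays stable. Once highlights, excerpts or full page text enter the picture, a per-file format with YAML frontmatter is likely easier to extend.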

I wish you good luck in your endeavor, because a more long-lived way of storing such information would be very nice indeed!

Archie · October 27, 2022, 5:42am

Raindrop is an interesting app. I tried it for a bit in the past, but didn’t know that there is now an Obsidian plugin to import from it. I may give it another try then, although I don’t use my phone much for online note taking, as I am not comfortable without a proper keyboard.

I am not sure what you mean by: “what is the content of the pages captured using your bookmarklet/web clipper?”

I do collect some metadata from the pages I save, depending on their format. The capture can be the whole page or just a selection of the interesting parts. As far as metadata is concerned (which I guess is what you were asking about), it is: author, publisher, publishdate, creationdate and duration (in words for text, in seconds for audio/video), plus some notes and tags. But I only collect this more detailed metadata for pages that I consider “Read”; gathering it for every page I see on the web and might read later is simply not practical for me.

I have more automation for these kinds of links. For videos, for example, the bookmarklet can pull a lot of metadata from the page, such as the channel name, publish date and duration. I think the one-line task format is the best way to save audio and video, because I can still add the notes I wrote about them on the same line. It is harder to do the same with text, but it is doable if the passages are very short.

Here is an example of the final product, to make it clearer. All of this metadata is searchable via Dataview.

[x] Prolonged Fasting and Keto to Starve Cancer (title:: Prolonged Fasting and Keto to Starve Cancer) (link:: Prolonged Fasting and Keto to Starve Cancer - YouTube) By (authors:: ) - (authors_dv:: ) From (channel:: The Carnivore Lion) - (channel_dv:: ) C@ (created:: 2022-10-26) P@ (pubdate:: 2022-10-26) Dur (duration:: 470s) NeedNote()Y()N() BM (bookmark:: 00:01:21 )LT (linkedto:: ) Note: (note:: ) Sum (summary:: ) TS (timestamps::
- His name is JJ Trochon
- 00:01:56: He has a book on it
- 00:03:20: Mentions “Walter Moor”) #todo #/rwl/media/video #/link
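For anyone who wants to read such a line outside Obsidian, the `(key:: value)` fields can be pulled out with a short script. This is only a simplified sketch: `parseInlineFields` is a hypothetical helper, not Dataview's own parser, and it assumes that field values never contain parentheses.

```javascript
// Hypothetical helper: collect "(key:: value)" inline fields from a task line.
function parseInlineFields(text) {
  const fields = {};
  // A field is "(", a word-like key, "::", then any text without parentheses, then ")".
  const pattern = /\((\w+)::([^()]*)\)/g;
  for (const match of text.matchAll(pattern)) {
    fields[match[1]] = match[2].trim();
  }
  return fields;
}

const line =
  "[x] Demo (title:: Demo video) C@ (created:: 2022-10-26) Dur (duration:: 470s) Note: (note:: )";
console.log(parseInlineFields(line));
```

Empty fields such as `(note:: )` come back as empty strings, so a later script can tell which entries still need filling in.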
