Using Hypothes.is to quickly place notes into my digital notebook

chrisaldrich · August 29, 2020, 6:08pm

This could also be categorized as a feature request/plugin idea, but has an immediately useful hack at the bottom. (Originally posted to my own commonplace book at https://boffosocko.com/2020/08/29/a-note-taking-problem-and-a-proposed-solution/)

tl;dr: It’s too painful to quickly get frequent notes into note taking and related platforms. Hypothes.is has an open API and a great UI that can be leveraged to simplify note taking processes.

Note taking tools

I’ve been keeping notes in systems like OneNote and Evernote for ages, but for my memory-related research and work in combination with my commonplace book for the last year, I’ve been alternately using TiddlyWiki (with TiddlyBlink) and WordPress (it’s way more than a blog.)

I’ve also dabbled significantly enough with related systems like Roam Research, Obsidian, Org mode/Org Roam, MediaWiki, DokuWiki, and many others to know what I’m looking for.

Many of these, particularly those that can be used alternately as commonplace books and zettelkasten, appeal to me greatly when they include the idea of backlinks. (I’ve been using Webmention to leverage that functionality in WordPress settings, and MediaWiki gives it grudgingly with the “what links to this page” basic functionality that can be leveraged into better transclusion if necessary.)

The major problem with most note taking tools

The final remaining problem I’ve found with almost all of these platforms is being able to quickly and easily get data into them so that I can work with or manipulate it. For me the worst part of note taking is the actual taking of notes. Once I’ve got them, I can do some generally useful things with them—it’s literally the physical method of getting data from a web page, book, or other platform into the actual digital notebook that is the most painful, mindless, and useless thing for me.

Evernote and OneNote

Older note taking services like Evernote and OneNote come with browser bookmarklets or mobile share functionality that make taking notes and extracting data from web sources simple and straightforward. Then once the data is in your notebook you can actually do some work with it. Sadly neither of these services has the backlinking functionality that I find has become de rigueur for my note taking or knowledge wrangling needs.

WordPress

My WordPress solutions are pretty well set since that workflow is entirely web-based and because WordPress has both bookmarklet and Micropub support. There I’m primarily using a variety of feeds and services to format data into a usable form that I can use to ping my Micropub endpoint. The Micropub plugin handles the post and most of the meta data I care about.

It would be great if other web services had support for Micropub this way too, as I could see some massive benefits to MediaWiki, Roam Research, and TiddlyWiki if they had this sort of support. The idea of Micropub has such great potential for great user interfaces. I could also see many of these services modifying projects like Omnibear to extend themselves to create highlighting (quoting) and annotating functionality with a browser extension.

With this said, I’m finding that the user interface piece that I’m missing for almost all of these note taking tools is raw data collection.

I’m not the sort of person whose learning style (or memory) is benefited by writing or typing out notes into my notebooks. I’d far rather just have it magically happen. Even copying and pasting data from a web browser into my digital notebook is a painful and annoying process, especially when you’re reading and collecting/curating as many notes as I tend to. I’d rather be able to highlight, type some thoughts and have it appear in my notebook. This would prevent the flow of my reading, thinking, and short annotations from being subverted by the note collection process.

Different modalities for content consumption and note taking

Based on my general experience there are only a handful of different spaces where I’m typically making notes.

Reading online

A large portion of my reading these days is done in online settings. From newspapers, magazines, journal articles and more, I’m usually reading them online and taking notes from them there.

.pdf texts

Some texts I want to read (often books and journal articles) only live in .pdf form. While reading them in an app-specific setting has previously been my preference, I’ve taken to reading them from within browsers. I’ll explain why in just a moment, but it has to do with a tool that treats this method the same as the general online modality. I’ll note that most of the .pdf specific apps have dreadful data export—if any.

Reading e-books (Kindle, e-readers, etc.)

If it’s not online or in .pdf format, I’m usually reading books within a Kindle or other e-reading device. These are usually fairly easy to add highlights, annotations, and notes to. While there are some paid apps that can extract these notes, I don’t find it too difficult to find the raw file and cut and paste the data into my notebook of choice. Once there, going through my notes, reformatting them (if necessary), tagging them and expanding on them is not only relatively straightforward, but it also serves as a simple method for doing a first pass of spaced repetition and review for better long term recall.

Lectures

Naturally taking notes from live lectures, audiobooks, and other spoken events occurs, but more often in these cases, I’m typically able to type them directly into my notebook of preference or I’m using something like my digital Livescribe pen for notes which get converted by OCR and are easy enough to convert in bulk into a digital notebook. I won’t belabor this part further, though if others have quick methods, I’d love to hear them.

Physical books

While I love a physical book 10x more than the next 100 people, I’ve been trying to stay away from them because I find that though they’re easy to highlight, underline, and annotate the margins, it takes too much time and effort (generally useless for memory purposes for me) to transfer these notes into a digital notebook setting. And after all, it’s the time saving piece I’m after here, so my preference is to read in some digital format if at all possible.

A potential solution for most of these modalities

For several years now, I’ve been enamored of the online Hypothes.is annotation tool. It’s open source, allows me reasonable access to my data from the (free) hosted version, and has a simple, beautiful, and fast process for bookmarking, highlighting, and annotating online texts on desktop and mobile. It works exceptionally well for both web pages and when reading .pdf texts within a browser window.

I’ve used it daily to make several thousand annotations on 800+ online web pages and documents. I’m not sure how I managed without it before. It’s the note taking tool I wished I’d always had. It’s a fun and welcome part of my daily life. It does exactly what I want it to and generally stays out of the way otherwise. I love it and recommend it unreservedly. It’s helped me to think more deeply and interact more directly with countless texts.

When reading on desktop or mobile platforms, it’s very simple to tap a browser extension and have all their functionality immediately available. I can quickly highlight a section of a text and their UI pops open to allow me to annotate, tag it, and publish. I feel like it’s even faster than posting something to Twitter. It is fantastically elegant.

The one problem I have with it is that while it’s great for collecting and aggregating my note data into my Hypothes.is account, there’s not much I can do with it once it’s there. It’s missing the notebook functionality some of these other services provide. I wish I could plug all my annotation and highlight content into spaced repetition systems or move it around and modify it within a notebook where it might be more interactive and cross linked for the long term. Sadly I don’t think that any of this sort of functionality is on Hypothes.is’ roadmap any time soon.

There is some great news however! Hypothes.is is open source and has a reasonable API. This portends some exciting things! This means that any of these wiki, zettelkasten, note taking, or spaced repetition services could leverage the UI for collecting data and pipe it into their interfaces for direct use.

As an example, what if I could quickly tell Obsidian to import all my pre-existing and future Hypothes.is data directly into my Obsidian vault for manipulating as notes? (And wouldn’t you know, the small atomic notes I get by highlighting and annotating are just the sort that one would like in a zettelkasten!) What if I could pick and choose specific course-related data from my reading and note taking in Hypothes.is (perhaps by tag or group) for import into Anki to quickly create some flash cards for spaced repetition review? For me, this combination would be my dream application!
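To make the Anki idea a little more concrete, here is a minimal sketch rather than an existing plugin. It assumes the public search endpoint at https://api.hypothes.is/api/search, a developer token sent as a Bearer header, and annotation objects that carry `text`, `tags`, and a `TextQuoteSelector` under `target`; the helper names `build_search_request`, `highlighted_text`, and `to_anki_rows` are made up for illustration, and the field layout should be checked against a real API response.

```python
import csv
import io
import urllib.parse
import urllib.request

API_SEARCH = "https://api.hypothes.is/api/search"


def build_search_request(username, token, tag=None, limit=200):
    """Build (but do not send) a search request for one user's annotations."""
    # Assumption: accounts on the hosted service look like acct:NAME@hypothes.is.
    params = {"user": f"acct:{username}@hypothes.is", "limit": limit}
    if tag:
        params["tag"] = tag
    url = API_SEARCH + "?" + urllib.parse.urlencode(params)
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})


def highlighted_text(annotation):
    """Return the quoted passage, or "" for page notes without a highlight."""
    for target in annotation.get("target", []):
        for selector in target.get("selector", []):
            if selector.get("type") == "TextQuoteSelector":
                return selector.get("exact", "")
    return ""


def to_anki_rows(annotations):
    """Turn annotations into tab-separated front/back/tags lines for a text import."""
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    for a in annotations:
        front = highlighted_text(a)
        back = a.get("text", "")
        if front and back:  # a card needs both a highlight and a note
            writer.writerow([front, back, " ".join(a.get("tags", []))])
    return out.getvalue()
```

Sending the request with `urllib.request.urlopen` and reading the `rows` list out of the JSON response should, as far as I understand the API, give the annotations to pass to `to_anki_rows`; the resulting text can then be saved to a file and imported into Anki by hand.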

These small pieces, loosely joined can provide some awesome opportunities for knowledge workers, students, researchers, and others. The education focused direction that Hypothes.is, many of these note taking platforms, and spaced repetition systems are all facing positions them to make a super-product that we all want and need.

An experiment

So today, as a somewhat limited experiment, I played around with my Hypothes.is atom feed (https://hypothes.is/stream.atom?user=chrisaldrich, because you know you want to subscribe to this) and piped it into IFTTT. Each post creates a new document in a OneDrive file which I can convert to a markdown .md file that can be picked up by my Obsidian client. While I can’t easily get the tags the way I’d like (because they’re not included in the feed) and the formatting is incredibly close, but not quite there, the result is actually quite nice.
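For anyone who would rather not route this through IFTTT, the same feed-to-markdown step can be sketched in a few lines of Python. This is only an illustration under assumptions: it expects a standard Atom feed (entries with `title`, `updated`, `link`, and `content` elements), the function names `slugify` and `feed_to_notes` are hypothetical, and the note layout is my own choice rather than what IFTTT produces.

```python
import re
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"  # Atom XML namespace


def slugify(title, max_len=60):
    """Make a filesystem-friendly file name stem from an entry title."""
    slug = re.sub(r"[^A-Za-z0-9]+", "-", title).strip("-").lower()
    return slug[:max_len] or "untitled"


def feed_to_notes(atom_xml):
    """Return a list of (filename, markdown) pairs, one per feed entry."""
    root = ET.fromstring(atom_xml)
    notes = []
    for i, entry in enumerate(root.iter(ATOM + "entry"), start=1):
        title = entry.findtext(ATOM + "title", default="Untitled").strip()
        updated = entry.findtext(ATOM + "updated", default="")
        link = entry.find(ATOM + "link")  # assumption: the first link is the source page
        href = link.get("href", "") if link is not None else ""
        # The content is kept as-is; if it is HTML it stays HTML inside the note.
        content = entry.findtext(ATOM + "content", default="").strip()
        body = f"# {title}\n\n{content}\n\nSource: {href}\nUpdated: {updated}\n"
        notes.append((f"{i:03d}-{slugify(title)}.md", body))
    return notes
```

Feeding it the bytes downloaded from the stream.atom URL above and writing each pair into the vault folder would stand in for the IFTTT and OneDrive hop. As with the IFTTT route, tags are not available because the feed does not include them.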

Since I can “drop” all my new notes into a particular folder, I can easily process them all at a later date/time if necessary. In fact, I find that the fact that I might want to revisit all my notes to do quick tweaks or adding links or additional thoughts provides the added benefit of a first round of spaced repetition for the notes I took.

Some notes may end up being deleted or reshuffled, but one thing is clear: I’ve never been able to so simply highlight, annotate, and take notes on documents online and get them into my notebook so quickly. And when I want to do something with them, there they are, already sitting in my notebook for manipulation, cross-linking, spaced repetition, and review.

So if the developers of any of these platforms are paying attention, I (and I’m sure others) really can’t wait for plugin integrations using the full power of the Hypothes.is API that allow us to all leverage Hypothes.is’ user interface to make our workflows seamlessly simple.

Klaas · August 30, 2020, 9:57am

How’s privacy on Hypothes.is? Can any Tom, Dick, and Harry access your notes?

JkNML · August 30, 2020, 2:47pm

I think this is actually bad news. Because this depends on other software to make connections to Hypothes.is. Which takes time, money, and isn’t always feasible. Why doesn’t Hypothes.is make it easier to export data out of their platform?

I didn’t read your entire post (way too long and marketing-like), but if Hypothes.is makes you reliant on their platform then I personally wouldn’t be comfortable with that.

Good point. I couldn’t find, but did notice that Hypothes.is has the right to remove notes they don’t approve of:

Klaas · August 30, 2020, 3:00pm

So, they can, AND they do look into people’s notes, even if it is done by a machine.

That is a deal breaker for me for sure, as it is for Roam. I have other issues with Roam (never even tried it, BTW), but even if all those issues were resolved I would not touch it as long as there is no privacy.

JkNML · August 30, 2020, 3:13pm

I agree. The way I understand how annotations (their notes) work on Hypothes.is, is that they aren’t private – they are public by default, and you can set them to ‘private’.

But if you do that in a ‘private’ group and someone shares the link of that group (on social media for instance), other people (who weren’t invited) can read your annotations too.

That’s at least how I understand the Who can see my annotations? page.

Klaas · August 30, 2020, 3:16pm

It is not so much the option to set your notes to “private” that counts for me. It is the impossibility to hide them from Hypothes.is personnel/machines.

chrisaldrich · August 30, 2020, 4:13pm

There are workable options for all of these criticisms. I would suggest that @Klaas, @JkNML, and others not dismiss the platform so quickly out of hand.

In my experience, the Hypothes.is team—working on a non-profit, education-based platform—are some of the most ethical designers and developers I’ve run across in a long time. They’re not doing some of the highly questionable things with people’s data that one sees many venture backed or for-profit companies doing.

Hypothes.is generally does make it easy to have one’s data and there are many methods in addition to their API. However, this does take a bit of technical work, particularly if you want it as a stream of data in real time, which is why I would advocate for that sort of functionality to be made into a plugin for projects like Obsidian. They could likely create the support in a few hours for the benefit of their thousands of users who are far less likely to do that work individually.

There are two broad “versions” of Hypothes.is (and possibly others in the wild).

One is the publicly run instance that offers their service for free (with some universities and educational institutions paying for additional levels of support and integration as a means of supporting their enterprise). Because the education market is their primary business, they take privacy and security of users’ data very seriously. (This is also primarily why they indicate in their terms of service that they can delete data that is abusive or outside of societal norms. They’re doing that to protect against abuses in school settings. I’ve not heard any cases of them having done so.)

All of one’s annotations can be marked as either private or public with the default being private. If they’re marked public, then they go into their “public” timeline, but in many years of experience doing primarily only public annotations, I suspect that very few people are reading, much less commenting on my annotations. (Part of this is because of the niche, academic nature of the platform.) They do also offer private groups—primarily for classrooms—so that students could annotate pages with their teachers and classmates in a private setting. Naturally this could be screenshot and shared, but I haven’t seen or heard of this sort of leakage in the wild. And of course the average user may never need or want a private group anyway.

And of course if this doesn’t ameliorate your level of concern, the second “version” of the project is an open source one so you have the option to download it and run it on your own server. This way, if you don’t like the company’s policy on data, privacy, or other options you can manage them for yourself.

All this said, they’ve put a huge amount of engineering and design work into making their platform work seamlessly on the majority of the web. This is something that any note-taking project should love to take advantage of and support. It’s unlikely that a project like Obsidian would be able to devote the time, effort, staff, or other resources to support this sort of functionality themselves.

Apologies if my piece seemed like an advertisement to some; I hope it didn’t read that way. I definitely spent longer laying out the problem, but my goal was to bring everyone up to speed on my viewpoint of a very specific UI problem that I (and many others) face and the details of a very specific, flexible, and elegant solution, which many users are unlikely to have the level of experience with that I do.

The other “hidden” portion relating to Webmention is also incredibly useful in that it could also apply to the thousands of social media and other online platforms that users use to generate data, and which could be used to pipe that easily into a personal notebook for direct use. As an example, a bookmarking service like Pinboard is nice and has a useful tagging and search function, but is less usable to me because none of that data is usable the way it would be in a bi-directionally linked setting like Obsidian. I also may need to search multiple platforms (Pocket, Pinterest, Instapaper, Diigo, etc.) to remember which one I saved a particular web page to. Imagine if all of the online data and content you generated over decades were available in a personal private/public notebook?

Solari · September 24, 2020, 7:45pm

Love this idea! I’ve started borrowing your workflow for Hypothesis using IFTTT to drop new notes into my Obsidian inbox.

It works surprisingly well and works great across platforms (since I use Mac/iPhone but have an Android e-reader tablet for heavy reading stuff).

It can also be used as a web clipper by highlighting stuff you want to save for later.

Thanks for the inspiration, @chrisaldrich.
Ray

michael_rowe · December 1, 2020, 6:44pm

Thanks for this suggestion. I’ve just created the IFTTT recipe and, while it’s not 100% what I’m looking for, it’s the closest thing I’ve found to getting me there. Much appreciated.

JAG · January 14, 2021, 1:32am

@Solari I also own an Onyx Boox Android eReader and would love to know how you use Hypothes.is with your e-reading device. I have been using Instapaper and have found it very underwhelming.

drgraham · February 16, 2021, 3:58pm

Inspired by all this, and wanting to see if I could come up with something that didn’t use IFTTT, I searched GitHub to see what kind of Hypothesis export scripts were out there, and I’ve cobbled together a workflow (blogged here). What follows was created in the context of working on a Mac; PC will be broadly similar.

Basically, if you have python installed on your machine:

Get your annotations

Install ‘Hypexport’ from GitHub (karlicoss/hypexport: Export/access your Hypothes.is data: annotations and profile info). Install it with

pip3 install --user git+https://github.com/karlicoss/hypexport

Then, create a new text file; call it secrets.py and put into it your Hypothesis username and your developer token (which is underneath your username when you have hypothesis open) like so:

username = "USERNAME"
token = "TOKEN"

Save that file.

Now, you can grab all of your annotations with:

python3 -m hypexport.export --secrets secrets.py > annotations.json

Turn json into markdown

Now we need to turn that json into markdown.

(Incidentally, if you want to turn your annotations into a csv, get jq and run something like this; the tags are joined into a single field because @csv cannot format a nested array:

jq -r '.annotations[] | [.text, (.tags | join(";")), .updated, .uri] | @csv' annotations.json > annotations.csv

)

So, here’s a json to markdown script: GitHub, PolBaladas/torsimany (JSON to Markdown converter: generate Markdown from format independent JSON). Pip install that:

pip install torsimany

but then find where it’s located on your machine (search for torsimany.py) and change this line

data = f.read().decode('ascii', 'ignore')

to just

data = f.read()

and then run

torsimany annotations.json

at the command prompt (in the folder where you downloaded your annotations to), and after a bit you’ll have a file called annotations.markdown. (Incidentally, you can modify the torsimany.py script with your own preferences for how the different elements should be rendered as markdown, if you don’t like the defaults).

Split into individual markdown files

Last thing – we want to split that up into separate markdown files, to drop into the obsidian vault. csplit, split, awk, etc, all of those things will probably work; here’s some perl (Macs have perl installed by default). Copy it into a text file, save with .pl, and if you’re on a mac, change its permissions with

chmod +x split.pl

so you can run it. Here’s the file (sourced from stackoverflow):

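#!/usr/bin/perl
# Split the input into one file per annotation (temp1, temp2, ...),
# starting a new file at each "### Title" heading.

undef $/;     # slurp mode: read the whole input in one go
$_ = <>;      # input comes from the file named on the command line
$n = 0;

# The lookahead keeps each "### Title" heading at the start of its chunk.
for $match (split(/(?=### Title)/)) {
      open(O, '>', 'temp' . ++$n) or die "Cannot write temp$n: $!";
      print O $match;
      close(O);
}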

then run

./split.pl annotations.markdown

and you’ll have a whoooole lotta files you can drop into your obsidian vault. Ta da!

Fix the lack of file extensions

Now, you’ll have to add the .md file extension, which can be done as a batch with this one liner on a mac:

for file in temp*; do mv "$file" "$file.md"; done

and there we go.

So, not perhaps as elegant as using IFTTT, but you can see every step in the process. I’ll eventually get around to making this all just a single script, but I’m learning as I go. No doubt there are far more elegant ways of making this all work.
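As a possible starting point for that single script, here is a hedged sketch that goes straight from annotations.json to one markdown file per annotation, skipping torsimany, the perl split, and the rename. It assumes the file has a top-level `annotations` list whose items carry `text`, `tags`, `updated`, and `uri` (the fields used in the jq command above); the `id` field, the `TextQuoteSelector` lookup for the highlighted passage, the function names, and the note layout are my own assumptions and not part of hypexport.

```python
import json
import re
from pathlib import Path


def quote_of(annotation):
    """Return the highlighted passage if the annotation has one, else ""."""
    for target in annotation.get("target", []):
        for selector in target.get("selector", []):
            if selector.get("type") == "TextQuoteSelector":
                return selector.get("exact", "")
    return ""


def render(annotation):
    """Render one annotation as a small markdown note."""
    lines = []
    quote = quote_of(annotation)
    if quote:
        # Blockquote the highlight, keeping multi-line quotes inside the block.
        lines.append("> " + quote.replace("\n", "\n> "))
        lines.append("")
    if annotation.get("text"):
        lines.append(annotation["text"])
        lines.append("")
    # Spaces inside a tag become hyphens so each tag stays a single #tag.
    tags = " ".join("#" + re.sub(r"\s+", "-", t) for t in annotation.get("tags", []))
    if tags:
        lines.append(tags)
    lines.append(f"Source: {annotation.get('uri', '')}")
    lines.append(f"Updated: {annotation.get('updated', '')}")
    return "\n".join(lines) + "\n"


def export_notes(json_path, out_dir):
    """Write one .md file per annotation and return the paths written."""
    data = json.loads(Path(json_path).read_text(encoding="utf-8"))
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for n, annotation in enumerate(data["annotations"], start=1):
        name = annotation.get("id") or f"temp{n}"  # fall back to temp1, temp2, ...
        path = out / f"{name}.md"
        path.write_text(render(annotation), encoding="utf-8")
        written.append(path)
    return written
```

Something like `export_notes("annotations.json", "vault/inbox")` would then fill a folder that can be dropped into the vault, where `vault/inbox` is just a placeholder path.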

Also, it is possible to install your own hypothesis server so that you’re not part of the larger Hypothesis system; I intend to figure that out eventually and will post it on my blog if ever I do. I am personally satisfied with how Hypothesis treats user data, but I understand the concerns of folks who might like more control.

nccollignon · February 24, 2021, 9:43pm

thanks for sharing this! was hoping someone would do it

nccollignon · February 26, 2021, 10:30pm

I’ve written some code that formats the title, adds tags (when they exist) and joins highlights and annotations onto a single markdown file if that’s of interest.

github repo here.

will add some stuff so it can be updated without duplicating previously added notes.
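One simple way to avoid duplicates, sketched here as an assumption rather than a description of the code in that repo, is to remember the ids of annotations already written and skip them on the next run. The state file and the function names `load_seen`, `select_new`, and `save_seen` are hypothetical, and the sketch assumes every annotation has an `id` field.

```python
import json
from pathlib import Path


def load_seen(state_file):
    """Load the set of annotation ids handled on earlier runs (empty on first run)."""
    path = Path(state_file)
    if path.exists():
        return set(json.loads(path.read_text(encoding="utf-8")))
    return set()


def select_new(annotations, seen):
    """Return annotations whose id is not in `seen`, adding their ids to it."""
    fresh = []
    for a in annotations:
        if a["id"] not in seen:
            seen.add(a["id"])
            fresh.append(a)
    return fresh


def save_seen(state_file, seen):
    """Persist the ids so the next run can skip them."""
    Path(state_file).write_text(json.dumps(sorted(seen)), encoding="utf-8")
```

This only catches brand-new annotations; picking up edits to existing ones would need an extra comparison, for example on the `updated` timestamp.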

drgraham · March 16, 2021, 3:09pm

Nice!

zendude · March 23, 2021, 6:41pm

Hi all. Like everyone here I’m an obsidian user but I’ve also been using Remnote for my Spaced Repetition workflows and have decided to use hypothes.is as the means of getting content into remnote for further refinement. There’s a way to do this via IFTTT and Webhooks, my problem in the case of hypothes.is is that it seems to only work for public annotations. Would anyone know how to make this workflow work on private ones? I just need to be able to create an RSS for the latter and it should work with my Remnote webhook… Thanks for any ideas!

Erisred · March 23, 2021, 6:54pm

@zendude - I’ve not found anything simple. They all include scripting and server-side setups. I believe they all include warnings that your security is exposed when using these methods.

If you’re interested, here are two I’ve bookmarked in case I get the itch to try them out.
This one is the method Hypothes.is points to from their site.
…and see this fairly detailed setup from the forums.

If you have any success, I’d be happy to hear of it. Good luck!

ninjani · March 23, 2021, 7:02pm

This reminded me to type up some recent Discord messages about hypothes.is markdown export:

Facet allows exporting annotations to HTML - You search for your username, select the group (if it’s private you’d need an API token generated from https://hypothes.is/account/developer) and that lists all your annotations which you can then export as HTML. With the markdownload browser plugin, you can right-click copy tab as markdown.

Also, quick plug for gooseberry for command-line users - bulk tagging + export with customizable folder structure and markdown templates.

zendude · March 23, 2021, 7:45pm

thanks all. I tried Facet’s python which I think was supposed to generate an XML file which I could’ve then turned into an RSS and then via Webhook to Remnote. unfortunately it didn’t work. as for the rest, it doesn’t seem to do it in a (for lack of a better word) “dynamic” kind of way ala Readwise… where when annotating something new, it automatically creates a corresponding page that syncs your annotations in readwise and then roam. I prefer the extracts to go directly to Remnote, Obsidian, etc… since I have a habit of forgetting that I’ve annotated something at all when it’s not in my work inboxes

drgraham · June 23, 2021, 2:15am

Folks have seen this? Uses templates plugin, works nicely.

Expeditioner · July 6, 2021, 11:35am

Can we export annotations through RSS and IFTTT if the notes are posted privately? What if the annotations are saved to a particular group?