Hey @Nebucatnetzer,
That is how I keep my stuff. I convert articles directly into .md files, though. Some of my older archives are HTML (which is also text :wink: )
With .md one loses a little bit of layout, but in my opinion that is not what knowledge is about. The information needs to be there :wink:
Today I did some work-related research on my biggest vault (85k+ articles). I managed to get the requested information into a formatted document in under five minutes, then added some comments and (hopefully) useful anchor points to get the discussion going and to provide a good basis for an in-depth discussion.

How do you manage the pictures in an article?

A plugin like MarkDownload keeps images as references to web addresses, so they will display fine as long as those image addresses don’t change. For my usage this is fine, as I am mostly downloading to archive the text.
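
To see which images in a clipped note still depend on the web, a small script can list them. This is only a sketch: the function name `remote_images` and the sample note are made up for illustration, and the regular expression covers the common `![alt](url)` form rather than every Markdown variant.

```python
import re
from typing import List

# Matches Markdown image syntax ![alt](target "optional title") and captures the target.
IMAGE_PATTERN = re.compile(r'!\[[^\]]*\]\(\s*<?([^)\s>]+)>?[^)]*\)')


def remote_images(markdown_text: str) -> List[str]:
    """Return the image targets that point at http(s) addresses."""
    targets = IMAGE_PATTERN.findall(markdown_text)
    return [t for t in targets if t.lower().startswith(("http://", "https://"))]


# Hypothetical clipped note, for illustration only.
sample = """# Clipped article
![chart](https://example.com/img/chart.png)
![local](attachments/photo.jpg)
![logo](http://example.org/logo.svg "Logo")
"""

print(remote_images(sample))
```

Anything this prints is an image that would disappear from the note if the original address stops working.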

How do you save them on mobile? If I may ask so directly :slight_smile:.

I use Firefox’s share-to-desktop function and then save the page there later.

Good question!
Pandoc lets you make a standalone HTML file (self-contained) with everything included.
For .md I haven’t found an automatic flow yet to go from a URL to ready-to-use files.
There is still some manual work needed to get the links (to the pictures) right. For me that is not an option.
When I find an automatic workflow I will certainly share it!
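
One thing that might be worth trying for the .md case is Pandoc’s `--extract-media` option, which is documented to save linked images into a folder and point the image references at the saved copies. The sketch below only builds and runs such a command; the function names, the `attachments` folder, and the choice of `gfm` as output format are assumptions for illustration, and Pandoc converts the whole page (navigation included), so the result may still need cleanup.

```python
import subprocess
from typing import List


def markdown_command(url: str, output_md: str, media_dir: str) -> List[str]:
    """Build a pandoc call that converts a web page to Markdown and stores its images locally."""
    return [
        "pandoc",
        "-f", "html",                        # read the page as HTML
        "-t", "gfm",                         # write GitHub-flavored Markdown (assumed choice)
        f"--extract-media={media_dir}",      # save images here and rewrite their links
        "-o", output_md,
        url,
    ]


def save_as_markdown(url: str, output_md: str, media_dir: str = "attachments") -> None:
    """Run the command; requires pandoc on the PATH and network access."""
    subprocess.run(markdown_command(url, output_md, media_dir), check=True)
```

Calling `save_as_markdown("https://example.com/article", "article.md")` would then be the whole flow for one URL, assuming the page converts cleanly.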

I keep them in any app that helps me keep URLs for “read later”.
That list can then be used to make your .md-files on your laptop.

Thinking of it:
You could create an .md (Dropbox, OneDrive, …) with those URL links and work your way from there. Daily notes .md is also a good starting point.
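
A rough sketch of the "work your way from there" step: pulling the URLs back out of such a note so they can be fed to whatever converter you use. The function name `read_later_urls` and the sample note are hypothetical, and the pattern is deliberately simple (URLs containing parentheses would be cut short).

```python
import re
from typing import List

# Simple URL matcher: stops at whitespace, brackets, and parentheses.
URL_PATTERN = re.compile(r'https?://[^\s<>()\[\]]+')


def read_later_urls(markdown_text: str) -> List[str]:
    """Collect unique URLs in order of first appearance."""
    seen = set()
    urls = []
    for match in URL_PATTERN.findall(markdown_text):
        url = match.rstrip(".,;:!?")  # drop punctuation that trails a sentence
        if url not in seen:
            seen.add(url)
            urls.append(url)
    return urls


# Hypothetical daily note, for illustration only.
daily_note = """## Read later
- https://example.com/article-one
- [Second piece](https://example.org/two?ref=feed)
- Seen again: https://example.com/article-one.
"""

print(read_later_urls(daily_note))
```

Both bare links and `[title](url)` links are picked up, and duplicates are skipped.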

You wouldn’t even have to use Pandoc to get a self contained HTML file.
There are various plugins which let you save a website as an MHTML file.
E.g. https://addons.mozilla.org/en-CA/firefox/addon/save-page-we/?src=search for both Firefox and Chrome.

However, I’m not sure how well those HTML files will work at a later point in time.

As for the automation: if someone doesn’t mind using proprietary services like Dropbox, I’m sure it’s possible to come up with a workflow in IFTTT to convert a website to Markdown and save it into a Dropbox account.

Hence the use of Pandoc :wink:

I use this as well quite often, especially when I’m past just reading the article and want to work with it.

So Pandoc creates an HTML file with just the content and no CSS?

Or CSS included in the HTML (what I would understand when I read self-contained).
I have been using Pandoc for years now. I even use it to archive my mail messages permanently as .HTML and .TXT files. As I said, I haven’t found a good way to put them into .md files without losing too much reading comfort.

Okay that would work the same way the plugins do then.
I’m not concerned that the plugins will stop working but that the website won’t be rendered correctly because the technology moves forwards and might break something.
Which would still be the case with Pandoc then.

If you are concerned about that, you shouldn’t render to HTML but to PDF, for example.
I don’t know if Pandoc can produce PDF/A, which is meant to keep a PDF usable in an archival setting.
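
For reference, a hedged sketch of what rendering a page to PDF with Pandoc might look like from Python. The function names and the default engine are illustrative choices; `--pdf-engine` is a documented Pandoc option, but the engine itself has to be installed separately, and nothing here attempts PDF/A compliance.

```python
import subprocess
from typing import List


def pdf_command(url: str, output_pdf: str, engine: str = "weasyprint") -> List[str]:
    """Build a pandoc call that renders a web page to PDF with the given engine."""
    return [
        "pandoc",
        "-f", "html",                  # read the page as HTML
        f"--pdf-engine={engine}",      # assumed default; any engine pandoc supports could go here
        "-o", output_pdf,
        url,
    ]


def save_as_pdf(url: str, output_pdf: str, engine: str = "weasyprint") -> None:
    """Run the command; requires pandoc and the chosen engine to be installed."""
    subprocess.run(pdf_command(url, output_pdf, engine), check=True)
```

Whether the output qualifies as PDF/A depends on the engine and its settings, so that part would still need checking against the Pandoc manual.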

Do you have pandoc set up as an app or command-line tool? If command-line what options are you using to generate your offline version of a webpage? I’ve used pandoc to convert a few files between formats, but not to convert a webpage before.

Thanks in advance :smiley:


All the crazy pandoc options can be found here: https://pandoc.org/MANUAL.html
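
For anyone who wants a starting point, here is a minimal sketch of a command-line call for a single-file offline copy, wrapped in Python. It is not necessarily the setup used by anyone in this thread. The function names are made up; `--standalone`, `--embed-resources` (Pandoc 2.19 and later), and the older `--self-contained` are options from the manual linked above.

```python
import subprocess
from typing import List


def standalone_html_command(url: str, output_html: str, legacy: bool = False) -> List[str]:
    """Build a pandoc call that writes one HTML file with CSS and images embedded."""
    # Older pandoc releases use --self-contained; newer ones use --embed-resources.
    embed_flag = "--self-contained" if legacy else "--embed-resources"
    return [
        "pandoc",
        "-f", "html",
        "-t", "html",
        "--standalone",    # emit a complete document, not a fragment
        embed_flag,        # inline linked stylesheets, images, and scripts
        "-o", output_html,
        url,
    ]


def save_offline_copy(url: str, output_html: str, legacy: bool = False) -> None:
    """Run the command; requires pandoc on the PATH and network access."""
    subprocess.run(standalone_html_command(url, output_html, legacy), check=True)
```

The embedding step is what pulls linked CSS and images into the file, which is why the result can be opened without a network connection.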

Thank you for the info. I had seen that in the --help, but didn’t associate it with web pages.

I’ll give it a try.

Can you please share a snippet of your notes here? I am having trouble using Obsidian properly for news and economics sources…

Is this the right way to take newspaper notes in Obsidian, or am I doing something wrong here?

Please help, as I am trying to build a system in which I can use my notes to review and link facts to the bigger picture for proper understanding, so that it is useful for my MCQ and essay-based exams…

Two suggested workflows (I do both, depending on situation):

  1. Download the full article into your Obsidian vault using MarkDownload.
  2. Highlight the full MD articles within Obsidian.
  3. Extract highlights into the relevant note on the given topic (and add highlights from any relevant source). The footnote option can make sure the highlights automatically link back to the source.
  4. Then you’ll have source notes (original source article in full, with highlights) plus your own notes (highlights sorted by topic or event).

Alternatively, if you don’t need to save the full article but just a web link, try the roam-highlighter extension for Firefox and Chrome for steps 1 and 2.
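
Step 3 of the first workflow can also be scripted. The sketch below assumes highlights are written with Obsidian’s `==highlight==` syntax and that the footnote should carry a `[[wikilink]]` back to the source note; the function names and the sample sentence are invented for illustration.

```python
import re
from typing import List

# Obsidian marks highlights as ==text==; this does not span line breaks.
HIGHLIGHT_PATTERN = re.compile(r"==(.+?)==")


def extract_highlights(markdown_text: str) -> List[str]:
    """Return the text of every ==highlight== in the note."""
    return [h.strip() for h in HIGHLIGHT_PATTERN.findall(markdown_text)]


def as_topic_note_lines(highlights: List[str], source_note: str, footnote_id: str = "1") -> str:
    """Format highlights as bullets that share one footnote linking to the source note."""
    lines = [f"- {h}[^{footnote_id}]" for h in highlights]
    lines.append("")
    lines.append(f"[^{footnote_id}]: [[{source_note}]]")
    return "\n".join(lines)


# Made-up article text, for illustration only.
article = "The committee ==raised the rate== and ==kept its outlook unchanged==."

print(as_topic_note_lines(extract_highlights(article), "Source article"))
```

The printed block can be pasted into the topic note. Note that a literal `==` inside code snippets in the article would be picked up as a false highlight.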