Speech-to-text in Obsidian using OpenAI Whisper Service

The same people who brought us ChatGPT have also made an AI speech-to-text service that is rather amazing. It understands multiple languages and is cost-effective and fast.

In this article, I profile an Obsidian plugin by https://twitter.com/nikdanilov that uses the Whisper service.


This is a really fantastic plugin! Thanks for this. Would you have any idea how I can transcribe longer recordings I already made in .mp4 format?

Good question. I would suggest posting it as a feature request at the developer’s GitHub repo for this plugin: Issues · nikdanilov/whisper-obsidian-plugin · GitHub

Another option is this plugin, which seems to do what you want, though I am unsure whether there are additional charges: GitHub - djmango/obsidian-transcription: Obsidian plugin to create high-quality transcriptions from markdown linked audio files.

I hope you find a good solution; I see a few others have been interested in the same thing.

Many thanks! I will post the feature request.

Regarding obsidian-transcription: it requires signing up at https://scribe.gambitengine.com. I do not quite understand why that is required, since Whisper can be accessed directly through the OpenAI API (I just don’t know how).

You don’t know how to do what?

Many thanks for your question. I did not know how to upload files to Whisper directly using my personal API key.
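For anyone else with the same question, here is a minimal sketch of a direct upload using only the Python standard library. It assumes the OpenAI transcription endpoint at `https://api.openai.com/v1/audio/transcriptions` with the `whisper-1` model, and that the default JSON response carries the transcript in a `text` field. The helper names `build_multipart` and `transcribe` are my own, not part of any library. As far as I know, the API caps uploads at around 25 MB, so longer recordings would need to be split or compressed first.

```python
import json
import os
import urllib.request
import uuid

# Assumed endpoint for OpenAI's hosted Whisper transcription service.
API_URL = "https://api.openai.com/v1/audio/transcriptions"


def build_multipart(fields, file_name, file_bytes, boundary):
    """Encode plain form fields plus one file as a multipart/form-data body."""
    parts = []
    for name, value in fields.items():
        parts.append(f"--{boundary}\r\n".encode())
        parts.append(f'Content-Disposition: form-data; name="{name}"\r\n\r\n'.encode())
        parts.append(f"{value}\r\n".encode())
    parts.append(f"--{boundary}\r\n".encode())
    parts.append(
        f'Content-Disposition: form-data; name="file"; filename="{file_name}"\r\n'.encode()
    )
    parts.append(b"Content-Type: application/octet-stream\r\n\r\n")
    parts.append(file_bytes)
    parts.append(f"\r\n--{boundary}--\r\n".encode())
    return b"".join(parts)


def transcribe(path, api_key):
    """Upload one audio file and return the transcript text."""
    boundary = uuid.uuid4().hex
    with open(path, "rb") as audio:
        body = build_multipart(
            {"model": "whisper-1"}, os.path.basename(path), audio.read(), boundary
        )
    request = urllib.request.Request(
        API_URL,
        data=body,  # supplying data makes this a POST request
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["text"]
```

Calling `transcribe("recording.mp4", os.environ["OPENAI_API_KEY"])` would then return the transcript as a string; the file name and the environment variable are placeholders.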

In the meantime, I have found it amazingly easy to install Whisper locally on my Mac and to run transcriptions in Terminal by following these instructions: Install Whisper.cpp on your Mac in 5mn and transcribe all your podcasts for free!

I would love to see this as an Obsidian plugin, but in the meantime I have found a good alternative solution.
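A rough sketch of that local workflow, wrapped in Python so it can be run over many files. It assumes `ffmpeg` is installed and that whisper.cpp has already been built with a model downloaded. The binary path `./main` and the model path `models/ggml-base.en.bin` are assumptions that depend on your whisper.cpp version and setup, and `build_commands` and `run` are hypothetical helper names. whisper.cpp expects 16 kHz mono WAV input, hence the conversion step.

```python
import subprocess
from pathlib import Path


def build_commands(source, whisper_bin="./main", model="models/ggml-base.en.bin"):
    """Return the ffmpeg and whisper.cpp command lines for one recording."""
    wav = str(Path(source).with_suffix(".wav"))
    # Convert the recording to 16 kHz, mono, 16-bit PCM WAV.
    convert = [
        "ffmpeg", "-i", str(source),
        "-ar", "16000", "-ac", "1", "-c:a", "pcm_s16le", wav,
    ]
    # -m selects the model, -f the input file, -otxt asks for a text file output.
    transcribe = [whisper_bin, "-m", model, "-f", wav, "-otxt"]
    return convert, transcribe


def run(source):
    """Run both steps in order, stopping if either one fails."""
    for command in build_commands(source):
        subprocess.run(command, check=True)
```

With those assumptions, `run("recording.mp4")` converts the file and then transcribes the resulting WAV locally, with no API charges.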

I used the Obsidian command “Whisper: Upload audio file”. Does Whisper store the audio data?

I’ve just installed the Whisper plugin, created an API key and configured the plugin.

No success yet, as I’m seeing error 429, presumably returned by Whisper: “Rate limit reached for requests”. I think I read that I get $18 of free credit to begin with, but I’m unclear whether I need to set up a billing method first, which could be the cause of this error.

Any help appreciated!
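One way to tell the two causes apart when calling the API directly is to look at the error body that comes back with the 429. The sketch below assumes the API returns JSON shaped like `{"error": {"code": ...}}` and that a quota or billing problem is reported with the code `insufficient_quota`; both are my assumptions about the response format, and `should_retry` and `call_with_backoff` are hypothetical helper names. A genuine rate limit is retried with exponential backoff, while a quota error is raised straight away because waiting will not fix it.

```python
import json
import time
import urllib.error


def should_retry(status, body):
    """Retry only 429 responses that are not quota or billing errors."""
    if status != 429:
        return False
    try:
        code = json.loads(body).get("error", {}).get("code")
    except (ValueError, AttributeError):
        code = None  # body was not the JSON shape we assumed
    return code != "insufficient_quota"  # assumed code for billing problems


def call_with_backoff(send, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call send(), waiting 1x, 2x, 4x... base_delay between retryable failures."""
    for attempt in range(max_attempts):
        try:
            return send()
        except urllib.error.HTTPError as err:
            last_attempt = attempt == max_attempts - 1
            if last_attempt or not should_retry(err.code, err.read()):
                raise
            sleep(base_delay * 2 ** attempt)
```

Here `send` is any zero-argument function that performs the request with `urllib`, so an upload call can be wrapped in a `lambda` and passed in.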

Genius, actually, as I was working on this myself. I’ll probably end up using this instead of my Franken-Python mess.

I’ll have to see whether I can incorporate my real-time dictation into something like this. I’ll post it if I end up with something that works.

Do you have ChatGPT Plus? I’ll look into this, actually: when I was using it, the only “credit” you were spending was killing your GPU.

I’ll update if I find a workaround. I only just came across this, so hopefully I’ll find one.