Speech recognition plugin based on vx Chrome-extension?

jmbarkei · December 1, 2022, 11:08am

I know I am reopening an old subject, but since I use neither Mac nor Windows, I have no speech recognition on my Linux system. As I am “still” working with various printed books, speech recognition would come in handy, especially for citations. Typing in text from a book lying on your desk is, after all, a bit too “old school” for my liking.

There is the option of using a Chrome extension (which I do), but copying text bit by bit from a Chrome web app window into an Obsidian note is not my idea of comfort in the long run either. Unfortunately I am a total noob (and a fool) when it comes to coding, so programming a plugin myself is out of the question.

A few days ago I stumbled upon a Chrome extension that, at least in my view, has some interesting features that would make building an Obsidian plugin a lot easier:

It is called vx (Homepage here), and exists as a Chrome extension, a command-line application and an Electron app. All three flavors are hosted on GitHub under an MIT License.
It uses the Google Speech API (free and paid) and copies the recognized text to the clipboard. That target, I figure, could easily be changed to the active note.
Some changes to the code will probably be needed. As I said, I have no clue how much work that would be in the end, but I imagine it is doable, and to me it seems that most of the work is already done.
The code is now 4 years old, but since the author still seems to be active, there might also be a chance for cooperation. The app itself seems to work really well.

Would anyone like to take up the task?

Thanks a lot in advance!

jasminpatel · July 10, 2024, 9:42am

I understand your frustration with the lack of built-in speech recognition for Linux in Obsidian and the inconvenience of using a Chrome extension for citations. While the vx Chrome extension seems like a promising starting point, here’s a breakdown of your options:

Possible solutions:

Community Plugins: Check the Obsidian community plugin section. There might be plugins specifically designed for Linux speech recognition that integrate with Obsidian. Search for keywords like “speech recognition” or “voice to text” within the plugin section.

External Dictation Tools: Explore standalone dictation tools for Linux. These can work independently and allow you to dictate text into your clipboard, which you can then paste into Obsidian. Look for open-source options like Gnome Speech Recognition or Dictation Anywhere.

Cloud-based Speech Recognition: Services like Google Docs or Microsoft Word offer built-in dictation features. You could dictate your citations into a cloud document and then copy-paste them into Obsidian.

Collaboration on vx plugin: If you’re comfortable reaching out to the vx developer, you could explain your situation and see if they’d be open to collaborating on a modified version for Obsidian on Linux. This would require some technical knowledge but could be a good option if the developer is receptive.

Things to Consider:

Technical knowledge: Modifying the vx code yourself might be difficult without coding experience.

Development effort: Even with the vx codebase, adapting it for Obsidian integration could be time-consuming.
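
To give a rough sense of the Obsidian side of that effort, here is a minimal TypeScript sketch of the "write to the active note instead of the clipboard" step. It is not taken from vx or from any existing plugin: `EditorLike`, `normalizeTranscript`, `insertDictation` and `FakeEditor` are hypothetical names made up for illustration. The only thing assumed from Obsidian is that its `Editor` object has a `replaceSelection(text)` method that inserts text at the cursor. The speech recognition itself is left out entirely.

```typescript
// Hypothetical minimal shape of an editor. Obsidian's Editor exposes a
// replaceSelection(text) method with this signature (assumption: that is
// the only editor method this sketch relies on).
interface EditorLike {
  replaceSelection(text: string): void;
}

// Collapse runs of whitespace and trim the ends of a raw transcript.
function normalizeTranscript(raw: string): string {
  return raw.replace(/\s+/g, " ").trim();
}

// Insert a dictated phrase at the cursor instead of copying it to the
// clipboard. Returns false when the transcript is empty after cleanup.
function insertDictation(editor: EditorLike, raw: string): boolean {
  const text = normalizeTranscript(raw);
  if (text.length === 0) {
    return false;
  }
  // Trailing space so consecutive phrases do not run together.
  editor.replaceSelection(text + " ");
  return true;
}

// In-memory stand-in so the logic can be tried without Obsidian.
class FakeEditor implements EditorLike {
  content = "";
  replaceSelection(text: string): void {
    this.content += text;
  }
}

const demoEditor = new FakeEditor();
insertDictation(demoEditor, "  To be,   or not to be \n");
```

In a real plugin, the editor would typically come from the active Markdown view (for example via `this.app.workspace.getActiveViewOfType(MarkdownView)?.editor`), and the raw transcript from whatever recognizer the plugin wraps. Keeping the insertion logic behind a small interface like this means it can be tested without Obsidian running.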

Alternatives:

While not ideal, consider using the vx Chrome extension with a clipper tool like “Obsidian Web” to capture the dictated text directly into Obsidian. This would streamline the workflow compared to manually copying and pasting.

I hope this helps! Let me know if you have any other questions.