What I’m trying to do
I’ve been using voice memos as my primary idea capture method for the past three months, then feeding them into an AI pipeline before bringing them into my knowledge base.
My pipeline starts on iPhone and ends with Claude Code on Windows:
Voice memo → transcription → AI summary → structured notes → Obsidian
This works well for me for:
- journaling
- investment ideas
- side projects
- thinking-out-loud capture
However, I keep hitting one annoying issue: interruptions split recordings.
Examples:
- phone calls
- getting in/out of the car (Bluetooth switching)
Instead of one clean recording, I end up with two or more separate voice memos.
This breaks the pipeline because I now have to manually combine them before sending them to the AI.
Things I’ve tried
- manually merging on desktop using audio editors (e.g., Audacity): works, but requires exporting files, aligning clips, and re-exporting
- using video/audio editors like CapCut: import multiple recordings, place them on a timeline, export as one file (works, but heavy for quick capture)
These approaches technically solve the problem, but they add friction.
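A lower-friction version of the merge step could probably be scripted. The sketch below is only an illustration under stated assumptions, not something taken from an existing tool: it assumes each memo's start time and duration are already known (for example from file metadata, which this sketch does not read), treats memos separated by less than a chosen gap as one interrupted recording, and builds the list file that ffmpeg's concat demuxer expects. The names `Memo`, `group_split_memos`, `concat_list`, and the 5-minute threshold are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Memo:
    path: str            # hypothetical path of an exported voice memo
    start: datetime      # when the recording started
    duration: timedelta  # how long it ran


def group_split_memos(memos, max_gap=timedelta(minutes=5)):
    """Group memos that start within max_gap of the previous memo's end."""
    groups = []
    for memo in sorted(memos, key=lambda m: m.start):
        if groups:
            prev = groups[-1][-1]
            gap = memo.start - (prev.start + prev.duration)
            if gap <= max_gap:
                # Close enough to the previous clip: treat as the same recording.
                groups[-1].append(memo)
                continue
        groups.append([memo])
    return groups


def concat_list(group):
    """Build the text of an ffmpeg concat-demuxer list file for one group."""
    lines = []
    for memo in group:
        # Escape single quotes the way the concat list syntax expects.
        escaped = memo.path.replace("'", "'\\''")
        lines.append(f"file '{escaped}'")
    return "\n".join(lines) + "\n"
```

Writing the result of `concat_list` to a file such as `list.txt` and running `ffmpeg -f concat -safe 0 -i list.txt -c copy merged.m4a` should join the clips without re-encoding, provided all clips share the same codec and parameters. That is likely for memos from one phone, but I have not verified it.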
I’m curious
- Are you using voice memos as input for Obsidian?
- Do interruptions split your recordings too?
- How are you handling it today?
- Is there an approach you would recommend?
Would love to learn how others are doing voice-first capture.