Create Perfect Summaries of Video Transcripts with AI for free

Summarize transcripts with AI for free


I started with AI transcript summaries rather late (Feb 2025). I took some of the ideas from Obsidian’s Discord channel (Obsidian Web Clipper related spaces), others from Reddit.

I have experimented with 3 types of methods…

  • Python scripts via Obsidian Interactivity plugin
  • Obsidian Web Clipper’s Interpreter method
  • Obsidian Copilot plugin

I found various (minor to major) flaws with the first two methods, so I ended up with a fusion of the second and third.

My chosen method

I am first outlining the problem points and how to tackle them, then I’ll show you what works for me.

  • The write-up expects the reader to have minimal knowledge of computer science, so I try not to over-explain (how to install Obsidian, plugins, or browser extensions, or how to call up Obsidian plugin commands from the Command Palette).

Are transcripts available or not?

Chances are you will find videos of interest that have no available transcripts. This can have various causes:

  • The video was uploaded several years ago.
  • The uploader didn’t add subtitles, so no Transcript button appears under the Description.

Solution

Download the video with the method outlined below and upload it to your YT account so the video will be automatically transcribed by Google/YouTube. (I don’t want to use AI models to do this as I don’t have a powerful enough PC to do it fast enough. I want to use freely available Google services and resources.)
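Before committing to the download-and-reupload route, you can also probe a URL with yt-dlp (the same tool the script further down this post uses). A minimal sketch, assuming yt-dlp is installed; `caption_status` and `probe` are my own helper names, not part of any plugin:

```python
import json
import subprocess

def caption_status(info):
    """Classify caption availability from yt-dlp's --dump-json metadata."""
    if info.get("subtitles"):
        return "manual"   # uploader-provided subtitles: YTranscript should work
    if info.get("automatic_captions"):
        return "auto"     # auto captions exist (may still need publishing)
    return "none"         # nothing available: the download-and-reupload route

def probe(url):
    """Fetch metadata only (no video download) and classify it."""
    out = subprocess.run(
        ["yt-dlp", "--skip-download", "--dump-json", url],
        capture_output=True, text=True, check=True,
    )
    return caption_status(json.loads(out.stdout))
```

This saves an upload round-trip when the captions were there all along.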

Do we need Web Clipper and for what?

The Obsidian Web Clipper’s Interpreter functionality is nice but I’ve not seen a workflow that would suit my needs.

  • First of all, I want the transcripts to be available based on the URL. I don’t want to open the Transcripts manually (if available at all).
  • I want the timestamps to be clickable (or not have to do or explain mass replacements through a script to make them clickable).
    • Edit: The 260625 post has a script now that can be used for multiple purposes.
  • I also want visual feedback on progress (where we are) up to the conclusion of the summary export.
    • I’ve seen cases where the Interpreter took 120-180s to create the summary, while in Copilot, this was much faster.
    • I can also keep the original transcript, while in Interpreter there is no apparent way to do this(?).

Obsidian note creation with metadata and YT video widget

What we would (or now: must) still use the Web Clipper for is to create the note in the folder we want it, with metadata we set and a widget below the frontmatter.
And: to have the URL in a source property for creating clickable timestamps with the new script I mentioned.

  • In any case, it’s not only useful now to have Web Clipper create the file but sometimes even a necessity.
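I won’t reproduce the linked script here, but the kind of mass replacement it performs can be sketched roughly like this (an illustration only, not the actual script; it assumes bare [HH:MM:SS] timestamps and a clean ?v= watch URL in the source property):

```python
import re

def link_timestamps(text, source_url):
    """Turn bare [HH:MM:SS] timestamps into clickable links.
    Assumes source_url is the cleaned ?v= watch URL from the note's
    source property, so "&t=SECONDS" can simply be appended."""
    def repl(m):
        h, mnt, s = map(int, m.group(1).split(":"))
        return f"[{m.group(1)}]({source_url}&t={h * 3600 + mnt * 60 + s})"
    # The lookahead skips timestamps that are already links.
    return re.sub(r"\[(\d{2}:\d{2}:\d{2})\](?!\()", repl, text)
```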

Used tools

  • Video DownloadHelper browser extension
  • Obsidian Web Clipper browser extension
  • Obsidian YTranscript plugin
  • Obsidian Copilot plugin

Optional

  • Obsidian Hover Editor plugin
  • Obsidian Surfing plugin
  • Obsidian CSS Editor plugin

Setup

(…and workflow – the setup steps are longer than what you need to do as your workflow)

1

We will use Google Gemini’s free Flash Thinking experimental models. No need to consider OpenAI, Groq, Cerebras, or other providers’ models here. If anyone wants to use local LLMs, that’s up to them.
Type in Google search: google gemini free api key
Get a Gemini API key in Google AI Studio
Sign in to your Google account if not signed in.
At https://aistudio.google.com/app/apikey, Create API key. Copy it. Keep it safe.

Head over to Google AI Studio’s https://aistudio.google.com/app/prompts/new_chat page.
Over on the right, look up the model’s name: gemini-2.0-flash-thinking-exp-01-21

  • An older gemini-2.0-flash-thinking-exp-1219 model is still valid and useable.
  • In the future, you’ll need to come back to this page to look for any possible newer models and update your model settings in Copilot.

2

In Obsidian Copilot plugin’s settings, add the key:
Basic → General → Set keys
Add your key in the Gemini bar and verify it.

Model → Add Custom Model
Add both models mentioned above.

Leave Base URL as it is, tick ‘Reasoning’, then Verify model and Add Model (provider is Gemini, which you need to set too).
Same for the 1219 model as well.

  • Notice there is no hyphen in the 1219 model name, but there is one in the 01-21 one.
    We add the older model in case the newer model is overworked by users worldwide and we are thwarted in our attempts to use the latest model.
  • By the way, a gemini-2.0-flash-thinking-exp-latest naming convention does not exist, but plain gemini-2.0-flash-thinking-exp should always point to the latest model, so we could take it on as a third candidate (or even as the sole go-to model version). Then we wouldn’t need to check Google time and again.

Add other settings:

The 0.7 temperature setting may be too high (too creative) for some. In that case, explore summary outputs at 0.1, or somewhere in between.

Because the models mentioned can output 64000 tokens but the settings UI won’t accept any value higher than 16000, do the following:
Close Obsidian, head to the Copilot plugin’s settings folder at .obsidian\plugins\copilot in your Explorer/Finder and open data.json with a text editor.
Find the "maxTokens": 16000, line and replace the number with 64000.
Save the json file. You can open Obsidian again.
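If you prefer doing the edit programmatically (again, with Obsidian closed), here is a small sketch; `bump_max_tokens` is a hypothetical helper, it assumes maxTokens sits at the top level of data.json (as the find-and-replace above suggests), and the path must be adjusted to your own vault:

```python
import json
from pathlib import Path

def bump_max_tokens(data_json, new_value=64000):
    """Rewrite Copilot's data.json with a higher maxTokens value.
    Run this only while Obsidian is closed."""
    settings = json.loads(Path(data_json).read_text(encoding="utf-8"))
    settings["maxTokens"] = new_value   # the UI slider stops at 16000
    Path(data_json).write_text(json.dumps(settings, indent=2), encoding="utf-8")

# Example (adjust the vault path to your own):
# bump_max_tokens(r"MyVault/.obsidian/plugins/copilot/data.json")
```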

Go back to the settings of Copilot. Don’t touch the 64000 setting.
Add your own folders where you want to keep your custom prompts and where you want to save conversations:

Create an empty md file and add this content as your Copilot AI Summarizer prompt:

Summarize {activeNote} in Markdown format. If the transcript is in English or any language except German, summarize in English.  
If it is in German, summarize in German. Fix obvious spelling errors automatically. Use appropriate H1 to H4 headers.

Insert callout boxes where relevant to expand context. Use the following types exactly as described (both type and content of box must be preceded with `> ` as you can see) and **always** keeping empty lines between these elements:

> [!check]
> For key insights or useful information.

> [!important]
> For critical points or must-remember details.

> [!fail]
> For warnings, dangers, or failures.

> [!note]
> For notable quotes or poetic/philosophical reflections.

> [!question]
> For thought-provoking or discussion-worthy questions.

> [!example]
> For extended exploration, implications, and creative ideas.

Distinguish **question** and **example** boxes: Question relates closely to the topic, while Example allows deeper, more philosophical or spiritual extensions.

### Formatting & Processing Rules:
- Ensure logical flow.
- Avoid redundant introductions and outros.
- Ignore irrelevant sections (disclaimers, social media plugs, intros/outros without subject relevance).
- Must preserve timestamps for reference.
- Ensure proper Markdown rendering: Every callout box **must** be separated from other content and other callouts by exactly ONE blank line, both before and after.  
- Summary length should match content depth – expand when meaningful, condense when generic.

If the input transcript is missing definite articles or seemingly starts the sentences with lowercase characters, **fix it automatically.**  

### FINAL SECTIONS:
At the end, create these structured bullet-point sections:

**IDEAS** - 1-10 distilled key insights.  
**QUOTES** - 1-10 impactful direct quotes. Here, use double quotes, not backticks!
**TIPS** - 1-10 practical takeaways.  
**REFERENCES** - List of relevant books, tools, projects, and inspirations.

### OUTPUT INSTRUCTIONS:
- Use only Markdown and consistently.
- Ensure proper sentence capitalization (uppercase letters **must** start a sentence!).
- Output only the requested sections - no additional notes or warnings.

- Maintain efficiency and consistency throughout. Use timestamps with exact clickable format as they appeared in the text so each notion has its easily clicked reference.  
- Do NOT use escape characters before double quotes. In fact, when providing German summaries, use smart quotes.

The {activeNote} variable is used in the prompt, which means you need to have the note with the transcript added to it open.
I expect users to come from a bilingual background; I was using German as the other language. Find German in the md file and replace it with your own language. If not needed, remove the references to another language.

3

Install the Obsidian Web Clipper extension in your browser(s).

  • I use Firefox everywhere and sync my extensions, so I need to install extensions once and on a new install I get the extensions with data on sync. I expect it works the same way on Chrome and Edge.

Add this template by Importing it.
Import template → Paste content below:

{
	"schemaVersion": "0.1.0",
	"name": "YouTube Transcript Summarizer Metadata and Header Widget Creator",
	"behavior": "create",
	"noteContentFormat": "## AI Summary\n\nAuthor: {{author}}\n<iframe height=\"315\" class=\"w-full rounded-btn\" src=\"https://www.youtube-nocookie.com/embed/{{url|replace:\"/.*[?&]v=([a-zA-Z0-9_-]{11}).*|.*youtu\\.be\\/([a-zA-Z0-9_-]{11}).*|.*\\/embed\\/([a-zA-Z0-9_-]{11}).*|.*\\/v\\/([a-zA-Z0-9_-]{11}).*/\":\"$1$2$3$4\"}}?enablejsapi=1\" title=\"{{title|replace:\"/[\\\\/\\:*?\\\"<>\\|]/g\":\"-\"}}\" frameborder=\"0\" allow=\"clipboard-write; encrypted-media; picture-in-picture ;web-share\" allowfullscreen=\"\" id=\"widget2\"></iframe>\n\n",
	"properties": [
		{
			"name": "title",
			"value": "{{title|replace:\\\"/[\\\\/\\:*?\\\\\"<>\\|]/g\\\":\\\"-\\\"}}",
			"type": "text"
		},
		{
			"name": "source",
			"value": "{{url|replace:\\\"/([?&]list=[^&]*)|([?&]index=[^&]*)|([?&]t=[^&]*)|([?&]start=[^&]*)|([?&]feature=[^&]*)|([?&]ab_channel=[^&]*)/g\\\":\\\"\\\"}}",
			"type": "text"
		},
		{
			"name": "description",
			"value": "{{description}}",
			"type": "text"
		},
		{
			"name": "author",
			"value": "{{author|split:\\\", \\\"|wikilink|join}}",
			"type": "multitext"
		},
		{
			"name": "tags",
			"value": "webclip transcript-summary",
			"type": "multitext"
		},
		{
			"name": "published",
			"value": "{{published}}",
			"type": "date"
		},
		{
			"name": "created",
			"value": "{{date}}",
			"type": "datetime"
		}
	],
	"triggers": [],
	"noteNameFormat": "{{title|replace:\"/[\\\\/\\:*?\\\"<>\\|]/g\":\"-\"}}",
	"path": "Add your path here"
}

Once you’ve imported it, rename the template if you want. Of course, Add your path here is not the path in your vault where you want your note saved, so edit it. You can change the tags values to any values of your choice, and if your created property is named date created or anything else, change that too.

  • The sanitizer regex for 3 title replacements was updated on 24-02-2025 to include ‘g’ flags.
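The template’s replace filters are hard to read through the JSON escaping, so here is what they boil down to, sketched in Python (`extract_id` and `sanitize_title` are my names, for illustration only):

```python
import re

# The same four URL shapes the iframe's replace filter handles.
VIDEO_ID = re.compile(
    r"[?&]v=([0-9A-Za-z_-]{11})"
    r"|youtu\.be/([0-9A-Za-z_-]{11})"
    r"|/embed/([0-9A-Za-z_-]{11})"
    r"|/v/([0-9A-Za-z_-]{11})"
)

def extract_id(url):
    """Return the 11-character video ID, or None if no pattern matches."""
    m = VIDEO_ID.search(url)
    return next((g for g in m.groups() if g), None) if m else None

def sanitize_title(title):
    """Replace filesystem-hostile characters, like the template's title filter
    (re.sub is global by default, matching the 'g' flag added on 24-02-2025)."""
    return re.sub(r'[\\/:*?"<>|]', "-", title)
```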

4

Pick the video you want and run the clipper with the above template.
You’ll have your note with the metadata and widget of the vid in your (last used) vault (which is the default setting in Web Clipper).

5

Go back to the video you wanted, copy the URL and try adding it to the YTranscript plugin. If it throws an error, the transcript is not (yet) available.
You’ll need to download the video and upload it to your account.

6

Download the vid with DownloadHelper. If it’s a YT video, the extension asks you to install a helper app. Install it and add your preferred download location in its settings.
If the video you are downloading is from YouTube, it’s okay to download it at 360p, so the download and upload will take less time.

Upload the vid to your own YT account. Set the visibility to Unlisted. YT will still transcribe the video at this setting.
In a short while, the checks will have been done and the transcription added.

7

Try your own URL (the new YT URL) in YTranscript again.
If the transcription process has been completed, the timestamped lines will appear in the sidebar. Below the Search bar, right-click on the text → Copy all.
Paste the clipboard contents into the note we created with Web Clipper. You can close the YTranscript sidebar element. We don’t need it anymore.

8

Command Palette: Open Copilot Chat Window

At the bottom of the sidebar element, set the newly added model as the one we want to use.

In the panel where you’d add your input, press / to add your custom prompt you saved above:

Once the prompt is added, send it with Enter. In a short time, the summary is streamed line by line into the sidebar pane.
In the meantime, add two empty lines or a separator (---) below the transcript, which you can choose to keep or not (I keep it).
When the streaming has finished, at the bottom of the response, you can click Insert to note at cursor.

  • The icons appear on hover only, so sometimes it is difficult to find them.

9

If all has been done well, you’ll have your note populated with the intended content.
Here we can make use of optional plugins, so when we hover over a timestamp, we can listen in on the topics.
We need to install and enable:

  • Hover Editor
  • Surfing
  • CSS Editor

In order for what we want to work, the Page Preview core plugin needs to be enabled.

With the CSS Editor plugin, you’ll need to add the following CSS snippet:

.theme-light, .theme-dark {
    --popover-width: 1432px;
    --popover-height: 680px;
}
  • The width value will need to be adjusted to the right edge of the left sidebar, if that’s what is needed.
  • The height value seems to make no difference.
  • A simple .theme selector rule didn’t work, hence .theme-light, .theme-dark { is used in the first line.

When you create the Hover Popup Width.css file with the plugin (you can use any other name, but you need to add the .css extension), the plugin automatically enables the snippet.

Tips

1

The steps above don’t need to be done in that order. Ideally, you should explore what videos you are interested in first. You can use the YTranscript plugin to check whether a video has an available transcript.
In case you need transcripts, download all the videos and upload them to your YouTube account, allowing plenty of time for the transcription jobs to finish.

2

When juggling settings in Copilot, either add credentials to the default OpenAI provider or – if you have no credit card or intention to use it – remove it, as the plugin’s notices in the upper right-hand corner can get annoying after a while.

User feedback

Please provide feedback on how you’d make the above work faster or more intuitively, especially with the Interpreter, for video summaries with clickable timestamps, videos longer than 59:59 (mins:secs), etc.

Notes

1

I dared to title this ‘Perfect Summaries’.
The model does a really good job cleaning up transcription problems.
For example, my original text had ‘albert cu and jean-paul sart’. In the summary, the model rectified the names to ‘Albert Camus’ and ‘Jean-Paul Sartre’.
Out of 10 names, it got the spelling of 8-9 right, even when they were in my native language.

2

In the last few days I’ve seen problems with incomplete summaries: no errors, but only a few lines were output, for all 3 models mentioned above.
I went back to my Python script, which worked, but as I said above, the output is never as good there as with Copilot.

3

Some widgets will come out black when the video is unavailable in your region. The videos may still play though, and in that case, playing videos through the timestamps will still be possible.

4

I have fed it some topics to do with fringe science. It looks as if Copilot inherently uncensors the output…

  • The equivalent of…
{
  "safetySettings": [
    {
      "category": "HARM_CATEGORY_HARASSMENT",
      "threshold": "BLOCK_NONE"
    },
    {
      "category": "HARM_CATEGORY_HATE_SPEECH",
      "threshold": "BLOCK_NONE"
    },
    {
      "category": "HARM_CATEGORY_SEXUALLY_EXPLICIT",
      "threshold": "BLOCK_NONE"
    },
    {
      "category": "HARM_CATEGORY_DANGEROUS_CONTENT",
      "threshold": "BLOCK_NONE"
    },
    {
      "category": "HARM_CATEGORY_UNSPECIFIED",
      "threshold": "BLOCK_NONE"
    }
  ]
}

…so the output is streamed without hindrance, but of course I have not tested the models with the most outrageous topics yet.
I am mainly alluding to occasions where Google censored me when I simply wanted something translated: it refused to do it and instead tried to boss me around about what I should know.

5

260625:
Updated post to draw attention to this script we can use.
There are other scripts as well to download (Python).


24-02-2025

I updated the WebClipper template with regex ‘g’ flags and added support to handle URLs in various playlist formats.

Please update your clipper template if you followed the guide before this date.
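For reference, the cleanup the template’s source property performs is roughly equivalent to this (`clean_source` is a hypothetical name; the parameter list mirrors the template’s regex):

```python
import re

# Same query parameters as the template's source-property filter.
TRACKING = r"[?&](?:list|index|t|start|feature|ab_channel)=[^&]*"

def clean_source(url):
    """Strip playlist and tracking parameters, leaving the bare watch URL."""
    return re.sub(TRACKING, "", url)
```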


Tip:
If the model is overloaded and you keep getting cut in output, just prompt it with:
“Continue from this part/heading.”

It seems its memory was not purged and it can continue flawlessly.

Also:
Thinking models give much better answers than non-thinking models but can be as much as 10 times slower…?

  • Depends. Sometimes Thinking is very fast, so it’s mostly down to Google.

Update 270525:
I keep having problems with the uploaded vids’ transcripts not being available.
I need to go into YT Studio and, under the uploaded item, go into Subtitles and Publish the transcribed item. Annoying…

Step 6 (download and upload the video) is painful. I hope someone can find a way to automate it.

Sometimes I cannot get the plugin to load my transcriptions at all…

I quickly had a Python script written to bypass these problems in Obsidian.
For this to work, you need to Publish the auto-generated subtitles in your YT Studio, just as you’d have to for Obsidian Web Clipper Interpreter-style workflows.
Because not everyone is comfortable with Python, I had AI generate a guide to go with the script; this kind of generation is pretty standard by 2025 if you want to get things done on your PC. Claude 4.0 is recommended for Python, JavaScript, and Obsidian TypeScript scripts.

Add the link and the Download folder, then press the button to generate the md file and copy its contents to your Obsidian md file.

(Note: Alternative installation methods should be skipped for first time users of Python.)

import subprocess
import re
import os
import platform
from pathlib import Path
import tkinter as tk
from tkinter import ttk, messagebox, filedialog
import threading

# ==================== CONFIGURATION ====================
CONFIG = {
    # Languages to try in order of preference (stops at first available)
    # Common codes: hu=Hungarian, en=English, de=German, fr=French, es=Spanish, 
    # it=Italian, pt=Portuguese, ru=Russian, ja=Japanese, ko=Korean, zh=Chinese
    # Recommendation: Put 2-4 languages max for faster processing
    "languages": ["hu", "en", "de", "fr", "es"],
    
    "subtitle_format": "vtt",  # or "srt"
    "output_filename_format": "{video_id}_{lang}.md"  # Customize output filename
}

# Set root directory based on platform
if platform.system() == "Windows":
    BASE_PATH = r'C:\Users\YourName\Downloads'
elif platform.system() == "Linux":
    BASE_PATH = '/home/YourName/Downloads'
else:
    # macOS
    BASE_PATH = '/Users/YourName/Downloads'

def check_yt_dlp():
    """Check if yt-dlp is installed and accessible."""
    try:
        result = subprocess.run(["yt-dlp", "--version"], 
                              capture_output=True, text=True, check=True)
        return True, result.stdout.strip()
    except (subprocess.CalledProcessError, FileNotFoundError):
        return False, None

def extract_video_id(url):
    """Extract video ID from YouTube URL."""
    match = re.search(r"(?:v=|youtu\.be/|embed/|watch\?v=)([0-9A-Za-z_-]{11})", url)
    return match.group(1) if match else None

def get_video_title(video_url):
    """Get video title for better file naming."""
    try:
        result = subprocess.run([
            "yt-dlp",
            "--get-title",
            video_url
        ], capture_output=True, text=True, check=True)
        return result.stdout.strip()
    except subprocess.CalledProcessError:
        return "Unknown_Title"

def download_and_parse_subs(video_url, output_path, progress_callback=None):
    """Download and parse subtitles from YouTube video."""
    output_dir = Path(output_path)
    output_template = str(output_dir / "%(id)s.%(ext)s")
    video_id = extract_video_id(video_url)
    
    if not video_id:
        raise ValueError("Invalid YouTube URL")

    if progress_callback:
        progress_callback("Getting video information...")
    
    video_title = get_video_title(video_url)

    for i, lang in enumerate(CONFIG["languages"]):
        if progress_callback:
            progress_callback(f"Trying language: {lang.upper()} ({i+1}/{len(CONFIG['languages'])})")
        
        try:
            # Use shell=True on Windows for better compatibility
            shell_flag = platform.system() == "Windows"
            
            subprocess.run([
                "yt-dlp",
                "--write-sub",
                "--sub-lang", lang,
                "--skip-download",
                "--sub-format", CONFIG["subtitle_format"],
                "-o", output_template,
                video_url
            ], check=True, capture_output=True, text=True, shell=shell_flag)

            subtitle_file = output_dir / f"{video_id}.{lang}.{CONFIG['subtitle_format']}"
            if not subtitle_file.exists():
                continue

            if progress_callback:
                progress_callback(f"Processing {lang.upper()} subtitles...")

            with open(subtitle_file, encoding="utf-8") as f:
                content = f.read()

            subtitle_file.unlink()  # Delete subtitle file after reading

            lines = []
            # Handle both VTT and SRT formats
            if CONFIG["subtitle_format"] == "vtt":
                blocks = re.findall(
                    r"(\d{2}:\d{2}:\d{2}\.\d{3}) --> .*?\n(.*?)\n\n", content, re.DOTALL)
            else:  # SRT format
                blocks = re.findall(
                    r"\d+\n(\d{2}:\d{2}:\d{2},\d{3}) --> .*?\n(.*?)\n\n", content, re.DOTALL)
            
            for start_time, text in blocks:
                # Handle both comma and dot decimal separators
                time_str = start_time.replace(',', '.')
                h, m, s = map(float, time_str.split(":"))
                total_seconds = int(h * 3600 + m * 60 + s)
                ts_str = f"{int(h):02}:{int(m):02}:{int(s):02}"
                yt_url = f"https://www.youtube.com/watch?v={video_id}&t={total_seconds}"
                clean_text = text.strip().replace('\n', ' ')
                # Remove HTML tags and formatting
                clean_text = re.sub(r'<[^>]+>', '', clean_text)
                lines.append(f"[{ts_str}]({yt_url}) {clean_text}")

            # Add video title and metadata at the top
            header = f"# {video_title}\n\n"
            header += f"**Video ID:** {video_id}  \n"
            header += f"**Language:** {lang.upper()}  \n"
            header += f"**URL:** [Watch on YouTube]({video_url})  \n\n"
            header += "---\n\n"
            
            transcript = header + "\n".join(lines)
            return transcript, video_id, lang, video_title
            
        except subprocess.CalledProcessError:
            continue

    available_langs = ", ".join(CONFIG["languages"])
    raise RuntimeError(f"No subtitles found in any of the configured languages: {available_langs}")

def save_transcript(text, video_id, lang, video_title, output_path):
    """Save transcript to markdown file."""
    # Clean video title for filename
    clean_title = re.sub(r'[^\w\s-]', '', video_title).strip()
    clean_title = re.sub(r'[-\s]+', '_', clean_title)[:50]  # Limit length
    
    filename = CONFIG["output_filename_format"].format(
        video_id=video_id,
        lang=lang,
        title=clean_title
    )
    
    path = Path(output_path) / filename
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)
    return path

def browse_folder():
    """Open folder browser dialog."""
    folder = filedialog.askdirectory(initialdir=str(BASE_PATH))
    if folder:
        path_var.set(folder)

def update_progress(message):
    """Update progress label."""
    status_label.config(text=message)
    root.update_idletasks()

def on_submit():
    """Handle submit button click."""
    url = url_entry.get().strip()
    output_path = path_var.get().strip()
    
    if not url:
        messagebox.showerror("Input Error", "Please enter a YouTube URL.")
        return
    
    if not output_path:
        messagebox.showerror("Input Error", "Please select an output folder.")
        return

    def run():
        submit_button.config(state="disabled")
        progress_bar.start()
        
        try:
            transcript, video_id, lang, video_title = download_and_parse_subs(
                url, output_path, progress_callback=update_progress
            )
            update_progress("Saving transcript...")
            path = save_transcript(transcript, video_id, lang, video_title, output_path)
            update_progress("Success!")
            messagebox.showinfo("Success", 
                f"Transcript saved successfully!\n\n"
                f"File: {path.name}\n"
                f"Language: {lang.upper()}\n"
                f"Location: {path}")
        except Exception as e:
            update_progress("Error occurred")
            messagebox.showerror("Error", f"Failed to extract subtitles:\n\n{str(e)}")
        finally:
            submit_button.config(state="normal")
            progress_bar.stop()
            update_progress("Ready")

    threading.Thread(target=run, daemon=True).start()

# ==================== UI ====================
root = tk.Tk()
root.title("YouTube Subtitle Extractor")
root.geometry("700x400")
root.resizable(True, False)

# Check if yt-dlp is available
yt_dlp_available, version = check_yt_dlp()

main_frame = ttk.Frame(root, padding=20)
main_frame.pack(fill=tk.BOTH, expand=True)

# Title
title_label = ttk.Label(main_frame, text="YouTube Subtitle Extractor", 
                       font=("TkDefaultFont", 16, "bold"))
title_label.pack(pady=(0, 15))

# Status frame
status_frame = ttk.Frame(main_frame)
status_frame.pack(fill=tk.X, pady=(0, 15))

if yt_dlp_available:
    ttk.Label(status_frame, text=f"✓ yt-dlp found (version: {version})", 
              foreground="green").pack(anchor="w")
else:
    ttk.Label(status_frame, text="⚠ yt-dlp not found. Please install yt-dlp first.", 
              foreground="red").pack(anchor="w")

# Configuration display
config_frame = ttk.LabelFrame(main_frame, text="Configuration", padding=10)
config_frame.pack(fill=tk.X, pady=(0, 15))

langs_text = "Languages (in order of preference): " + " → ".join(CONFIG["languages"])
ttk.Label(config_frame, text=langs_text).pack(anchor="w")
ttk.Label(config_frame, text=f"Subtitle format: {CONFIG['subtitle_format'].upper()}").pack(anchor="w")

# URL input
url_frame = ttk.LabelFrame(main_frame, text="YouTube URL", padding=10)
url_frame.pack(fill=tk.X, pady=(0, 10))
url_entry = ttk.Entry(url_frame, font=("TkDefaultFont", 10))
url_entry.pack(fill=tk.X)

# Output path selection
path_frame = ttk.LabelFrame(main_frame, text="Output Folder", padding=10)
path_frame.pack(fill=tk.X, pady=(0, 15))

path_input_frame = ttk.Frame(path_frame)
path_input_frame.pack(fill=tk.X)

path_var = tk.StringVar(value=str(BASE_PATH))
path_entry = ttk.Entry(path_input_frame, textvariable=path_var)
path_entry.pack(side=tk.LEFT, fill=tk.X, expand=True)

browse_button = ttk.Button(path_input_frame, text="Browse", command=browse_folder)
browse_button.pack(side=tk.RIGHT, padx=(10, 0))

# Submit button
submit_button = ttk.Button(main_frame, text="Extract Subtitles", command=on_submit)
submit_button.pack(pady=15)

if not yt_dlp_available:
    submit_button.config(state="disabled")

# Progress bar
progress_bar = ttk.Progressbar(main_frame, mode='indeterminate')
progress_bar.pack(fill=tk.X, pady=(0, 5))

# Status label
status_label = ttk.Label(main_frame, text="Ready")
status_label.pack()

# Instructions
instructions_frame = ttk.LabelFrame(main_frame, text="Quick Instructions", padding=10)
instructions_frame.pack(fill=tk.X, pady=(15, 0))

instructions_text = """1. Make sure yt-dlp is installed (see guide for installation)
2. Paste a YouTube video URL
3. Choose where to save the markdown file
4. Click 'Extract Subtitles'
5. Import the generated .md file into Obsidian"""

ttk.Label(instructions_frame, text=instructions_text, justify=tk.LEFT).pack(anchor="w")

root.mainloop()

YouTube Subtitle Extractor for Obsidian - Complete Setup Guide

Extract YouTube video subtitles as timestamped markdown files perfect for importing into Obsidian. This tool creates clickable timestamps that link back to specific moments in the video.

What This Tool Does

  • Downloads subtitles from YouTube videos in multiple languages (configurable)
  • Converts them to markdown format with clickable timestamps
  • Creates files ready to import into Obsidian with video metadata
  • Works cross-platform: Windows, macOS, and Linux
  • Tries languages in order of preference (Hungarian → English → German → French → Spanish by default)

Prerequisites

1. Install Python

Windows

  1. Go to python.org
  2. Download Python 3.8 or newer
  3. Important: During installation, check “Add Python to PATH”
  4. Complete the installation
  5. Verify: Open Command Prompt and type python --version

macOS

Option 1: Official Installer (Recommended)

  1. Go to python.org
  2. Download Python 3.8 or newer for macOS
  3. Install the downloaded package
  4. Verify: Open Terminal and type python3 --version

Option 2: Using Homebrew

brew install python

Linux (Ubuntu/Debian)

sudo apt update
sudo apt install python3 python3-pip python3-tkinter

Linux (CentOS/RHEL/Fedora)

# CentOS/RHEL 8+
sudo dnf install python3 python3-pip python3-tkinter

# Fedora
sudo dnf install python3 python3-pip python3-tkinter

# Older CentOS/RHEL
sudo yum install python3 python3-pip tkinter

2. Install yt-dlp

After Python is installed, install yt-dlp:

Windows

Open Command Prompt or PowerShell as Administrator and run:

pip install yt-dlp

macOS/Linux

Open Terminal and run:

pip3 install yt-dlp

Alternative installation methods:

  • Using pipx (isolated installation): pipx install yt-dlp
  • Using conda: conda install -c conda-forge yt-dlp
  • Direct download: Download from yt-dlp releases

Installation & Setup

Step 1: Download the Script

  1. Copy the Python script from above
  2. Save it as youtube_subtitle_extractor.py
  3. Save it in an easily accessible location (Desktop, Documents, etc.)

Step 2: Verify Installation

Open your terminal/command prompt and check:

Windows:

python --version
yt-dlp --version

macOS/Linux:

python3 --version
yt-dlp --version

Both commands should return version numbers without errors.

Step 3: Configure Languages & Paths

Language Configuration

Edit the CONFIG section at the top of the script:

CONFIG = {
    "languages": ["hu", "en", "de", "fr", "es"],  # Modify this list
    "subtitle_format": "vtt",  # or "srt"
    "output_filename_format": "{video_id}_{lang}.md"
}

Important notes about language configuration:

  • Order matters: The script tries languages left-to-right and stops at the first available
  • Keep it short: Recommend 2-4 languages max for faster processing
  • Put preferred first: Your most wanted language should be leftmost
  • Example: ["hu", "en"] tries Hungarian first, then English as fallback

Common language codes:

  • hu - Hungarian
  • en - English
  • de - German
  • fr - French
  • es - Spanish
  • it - Italian
  • pt - Portuguese
  • ru - Russian
  • ja - Japanese
  • ko - Korean
  • zh - Chinese

Path Configuration

The script uses these default paths (change if needed):

# Windows
BASE_PATH = r'C:\Users\YourName\Downloads'

# Linux  
BASE_PATH = '/home/YourName/Downloads'

# macOS
BASE_PATH = '/Users/YourName/Downloads'

Note: Replace YourName with your actual username; you can also browse to any other folder from within the app.

Step 4: Run the Application

Windows

  1. Open Command Prompt or PowerShell
  2. Navigate to your script location:
    cd "C:\Users\YourName\Desktop"
    
  3. Run the script:
    python youtube_subtitle_extractor.py
    

macOS/Linux

  1. Open Terminal
  2. Navigate to your script location:
    cd ~/Desktop
    
  3. Run the script:
    python3 youtube_subtitle_extractor.py
    

Alternative: Double-click method

  • On Windows: If Python is properly installed, you might be able to double-click the .py file
  • On macOS/Linux: You may need to make it executable first: chmod +x youtube_subtitle_extractor.py

How to Use

Basic Usage

  1. Launch the application - A GUI window will open
  2. Check status - Green checkmark means yt-dlp is properly installed
  3. Paste YouTube URL - Any YouTube video URL works
  4. Select output folder - Choose where to save markdown files (defaults to Downloads)
  5. Click “Extract Subtitles” - The tool will:
    • Try each configured language in order
    • Download the first available subtitle track
    • Convert to markdown with clickable timestamps
    • Save with video metadata

Understanding the Output

The generated markdown file includes:

# Video Title Here

**Video ID:** dQw4w9WgXcQ  
**Language:** EN  
**URL:** [Watch on YouTube](https://www.youtube.com/watch?v=dQw4w9WgXcQ)  

---

[00:00:15](https://www.youtube.com/watch?v=dQw4w9WgXcQ&t=15) Welcome to this video tutorial
[00:00:32](https://www.youtube.com/watch?v=dQw4w9WgXcQ&t=32) Today we'll be discussing advanced techniques
[00:01:05](https://www.youtube.com/watch?v=dQw4w9WgXcQ&t=65) Let's start with the first concept
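The timestamp lines above follow a simple rule: the start time, converted to total seconds, becomes the `t=` parameter of the link. A sketch (the helper name is my own):

```python
def timestamp_link(start: str, video_id: str, text: str) -> str:
    """Turn a 'HH:MM:SS' start time into a clickable markdown line."""
    h, m, s = (int(x) for x in start.split(":"))
    total = h * 3600 + m * 60 + s  # e.g. 00:01:05 -> 65 seconds
    url = f"https://www.youtube.com/watch?v={video_id}&t={total}"
    return f"[{start}]({url}) {text}"
```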

Using in Obsidian

  1. Import the file: Copy or move the .md file to your Obsidian vault
  2. Click timestamps: Each timestamp is a clickable link that opens YouTube at that exact moment
  3. Edit freely: Add your own notes, highlights, or structure around the transcript
  4. Link to other notes: Use Obsidian’s linking features to connect concepts

Troubleshooting

Common Issues

“yt-dlp not found”

  • Reinstall yt-dlp: pip install --upgrade yt-dlp
  • Check PATH: Make sure Python Scripts directory is in your system PATH
  • Try alternative installation: python -m pip install yt-dlp

“No subtitles found”

  • Video may not have subtitles in configured languages
  • Try auto-generated subtitles by modifying the script
  • Check if the video has any captions available on YouTube

Permission errors on Windows

  • Run Command Prompt as Administrator
  • Choose a different output folder (not system directories)

GUI doesn’t appear

  • tkinter ships with the official Python installers on Windows and macOS; it cannot be installed with pip
  • On Debian/Ubuntu: sudo apt install python3-tk; on Fedora/RHEL: sudo dnf install python3-tkinter

Script won’t run

  • Check Python installation: python --version
  • Try python3 instead of python on macOS/Linux
  • Make sure you saved the script with .py extension

Advanced Configuration

Custom subtitle formats:

CONFIG = {
    "subtitle_format": "srt",  # Change from "vtt" to "srt"
}

Auto-generated subtitles:
Add --write-auto-sub to the yt-dlp command in the script for videos without manual captions.

Multiple languages:
The script automatically tries all configured languages in order until it finds available subtitles.
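As a sketch, a subtitle-only yt-dlp invocation with the auto-caption fallback might look like this (the URL is a placeholder, and this is not the script's exact command):

```python
import subprocess

cmd = [
    "yt-dlp",
    "--write-sub",       # manual captions, if the uploader provided any
    "--write-auto-sub",  # fall back to YouTube's auto-generated captions
    "--sub-lang", "en",
    "--sub-format", "vtt",
    "--skip-download",   # subtitles only, no video file
    "https://www.youtube.com/watch?v=VIDEO_ID",  # placeholder URL
]
# subprocess.run(cmd, check=True)  # uncomment to actually run
```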

Tips for Obsidian Users

  1. Create a dedicated folder for YouTube transcripts in your vault
  2. Use tags like #youtube #transcript #video-notes for organization
  3. Link concepts to your existing notes using [[]] syntax
  4. Add your insights directly in the transcript with highlights or comments
  5. Create templates for consistent formatting across video notes

System Requirements

  • Python: 3.8 or newer
  • Operating System: Windows 10+, macOS 10.14+, or modern Linux
  • Internet: Required for downloading subtitles
  • Disk Space: Minimal (markdown files are very small)
  • Memory: Low usage, handles long videos efficiently

Support

If you encounter issues:

  1. Check Python/yt-dlp versions - Update both if needed
  2. Try different videos - Some videos may lack subtitles
  3. Check internet connection - Required for downloading
  4. Verify file permissions - Ensure write access to output folder

This tool makes it easy to capture YouTube knowledge directly into your Obsidian workflow with clickable, timestamped references!

As far as I know, there is no way to upload to YouTube without going into Studio, manually uploading the file, and setting everything you need to set during the upload.

And I don’t know of another free service that does transcriptions of even 3-4 hr videos…

Sometimes the videos you download come in parts, and when you upload them to your own YT account, the parts are not all the same resolution.
It is not easy to set the resolution in JD2, and Video DownloadHelper adds a rather large QR code watermark in the free version.

So I extended the earlier Python script with video download functionality (YouTube only), where you set the maximum resolution you want.

Choosing 1080 gets you 1080p if available, otherwise the best available below it; choosing something lower, e.g. 360, gets you that resolution if available, otherwise the best available lower one.
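This fallback rule maps directly onto a yt-dlp format selector. A minimal sketch (the exact string used in the script may differ slightly):

```python
# Sketch: yt-dlp format selector implementing "cap at max_res,
# else best available below it"
max_res = "1080"  # or "720", "480", "360"
format_selector = (
    f"bestvideo[height<={max_res}]+bestaudio"  # best video+audio under the cap
    f"/best[height<={max_res}]"                # else best combined file under the cap
    "/best"                                    # last resort: best of anything
)
```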

import subprocess
import re
import os
import platform
from pathlib import Path
import tkinter as tk
from tkinter import ttk, messagebox, filedialog
import threading
import sys

# ==================== CONFIGURATION ====================
CONFIG = {
    # Languages to try in order of preference (stops at first available)
    # Common codes: hu=Hungarian, en=English, de=German, fr=French, es=Spanish, 
    # it=Italian, pt=Portuguese, ru=Russian, ja=Japanese, ko=Korean, zh=Chinese
    # Recommendation: Put 2-4 languages max for faster processing
    "languages": ["en", "de", "fr", "es"],
    
    "subtitle_format": "vtt",  # or "srt"
    "output_filename_format": "{video_id}_{lang}.md",  # Customize output filename
    
    # Video download settings
    "max_resolution": "1080",  # 1080, 720, 480, 360
    "video_format": "mp4",     # mp4, webm, mkv
    "audio_quality": "best"    # best, 192, 128, 64 (kbps)
}

# Default root directory: the user's Downloads folder on every platform
# (Path.home() resolves correctly on Windows, macOS, and Linux);
# stored as a string for subprocess calls
BASE_PATH = str(Path.home() / "Downloads")

def check_yt_dlp():
    """Check if yt-dlp is installed and accessible."""
    try:
        result = subprocess.run(["yt-dlp", "--version"], 
                              capture_output=True, text=True, check=True)
        return True, result.stdout.strip()
    except (subprocess.CalledProcessError, FileNotFoundError):
        return False, None

def extract_video_id(url):
    """Extract video ID from various YouTube URL formats."""
    # Remove any playlist parameters first
    url = re.sub(r'[&?]list=[^&]*', '', url)
    
    # Improved YouTube URL patterns - more comprehensive and ordered by likelihood
    patterns = [
        r'(?:youtube\.com/watch\?.*v=|youtu\.be/)([0-9A-Za-z_-]{11})',  # Main patterns combined
        r'youtube\.com/embed/([0-9A-Za-z_-]{11})',                     # Embed URLs
        r'youtube\.com/v/([0-9A-Za-z_-]{11})',                         # Old /v/ format
        r'youtube\.com/shorts/([0-9A-Za-z_-]{11})',                    # YouTube Shorts
        r'youtube\.com/live/([0-9A-Za-z_-]{11})',                      # Live streams
        r'(?:m\.youtube\.com/watch\?.*v=)([0-9A-Za-z_-]{11})',         # Mobile URLs
        r'(?:gaming\.youtube\.com/watch\?.*v=)([0-9A-Za-z_-]{11})',    # Gaming URLs
    ]
    
    for pattern in patterns:
        match = re.search(pattern, url)
        if match:
            video_id = match.group(1)
            # Validate video ID format (11 characters, alphanumeric + underscore + hyphen)
            if re.match(r'^[0-9A-Za-z_-]{11}$', video_id):
                return video_id
    
    return None

def get_video_title(video_url):
    """Get video title for better file naming."""
    try:
        result = subprocess.run([
            "yt-dlp",
            "--get-title",
            video_url
        ], capture_output=True, text=True, check=True)
        return result.stdout.strip()
    except subprocess.CalledProcessError:
        return "Unknown_Title"

def get_video_info(video_url):
    """Get video information including available formats."""
    try:
        result = subprocess.run([
            "yt-dlp",
            "--list-formats",
            video_url
        ], capture_output=True, text=True, check=True)
        return result.stdout
    except subprocess.CalledProcessError:
        return None

def clean_filename(title, max_length=40):
    """Clean filename with improved character handling."""
    # Remove or replace problematic characters
    title = re.sub(r'[<>:"/\\|?*]', '', title)  # Windows forbidden chars
    title = re.sub(r'[^\w\s\-_.()]', '', title)  # Keep only safe chars including parentheses
    title = re.sub(r'\s+', '_', title.strip())  # Replace whitespace with underscores
    title = re.sub(r'_{2,}', '_', title)  # Replace multiple underscores with single
    title = title.strip('_.-')  # Remove leading/trailing separators    
    # Ensure we have a valid title
    if not title or len(title) < 3:
        return "video"
    
    return title[:max_length]

def download_video(video_url, output_path, max_res, progress_callback=None):
    """Download video from YouTube with improved progress messages."""
    output_dir = Path(output_path)
    video_id = extract_video_id(video_url)
    
    if not video_id:
        raise ValueError(f"Could not extract video ID from URL: {video_url}\nSupported formats: youtube.com/watch?v=, youtu.be/, shorts/, playlists, etc.")
    
    if progress_callback:
        progress_callback("Getting video information...")
    
    video_title = get_video_title(video_url)
    clean_title = clean_filename(video_title)
    
    output_template = str(output_dir / f"{clean_title}_{video_id}.%(ext)s")
    
    if progress_callback:
        progress_callback(f"Downloading video (target: {max_res}p max)...")
    
    try:
        # Pass the command as an argument list with shell=False: the URL is
        # delivered to yt-dlp as a single argument on every platform, so an
        # "&" in the URL can never split the command and no quoting is needed.
        
        # Single strict selector: never higher than max_res, but falls back
        # to the best available lower resolution if needed
        format_selector = f"bestvideo[height<={max_res}]+bestaudio/best[height<={max_res}]/best"
        
        cmd = [
            "yt-dlp",
            "-f", format_selector,
            "--merge-output-format", CONFIG["video_format"],
            "-o", output_template,
            "--no-playlist",
            video_url
        ]
        
        result = subprocess.run(cmd, check=True, capture_output=True, text=True)
        
        # Find the downloaded file with more flexible matching
        possible_extensions = ['.mp4', '.webm', '.mkv', '.avi', '.mov', '.flv']
        for ext in possible_extensions:
            potential_file = output_dir / f"{clean_title}_{video_id}{ext}"
            if potential_file.exists():
                return potential_file, video_title
        
        # Also check for files that might have slightly different names
        for file in output_dir.glob(f"*{video_id}*"):
            if file.suffix.lower() in possible_extensions:
                return file, video_title
        
        # If we get here, try one more search for any recently created video files
        import time
        current_time = time.time()
        for file in output_dir.iterdir():
            if (file.is_file() and 
                file.suffix.lower() in ['.mp4', '.webm', '.mkv', '.avi', '.mov', '.flv'] and
                abs(file.stat().st_mtime - current_time) < 300):  # Created in last 5 minutes
                if video_id in file.name or clean_title in file.name:
                    return file, video_title
        
        # If still not found, raise an error with more details
        error_msg = f"Download may have succeeded but file not found.\n"
        error_msg += f"Expected pattern: {clean_title}_{video_id}.*\n"
        error_msg += f"Search directory: {output_dir}\n"
        raise RuntimeError(error_msg)
        
    except subprocess.CalledProcessError as e:
        error_msg = f"Download failed: {e.stderr if e.stderr else 'Unknown error'}\n"
        error_msg += f"Command attempted: yt-dlp -f {format_selector} {video_url}"
        raise RuntimeError(error_msg)

def download_and_parse_subs(video_url, output_path, progress_callback=None):
    """Download and parse subtitles from YouTube video."""
    output_dir = Path(output_path)
    output_template = str(output_dir / "%(id)s.%(ext)s")
    video_id = extract_video_id(video_url)
    
    if not video_id:
        raise ValueError(f"Could not extract video ID from URL: {video_url}\nSupported formats: youtube.com/watch?v=, youtu.be/, shorts/, playlists, etc.")
    
    if progress_callback:
        progress_callback("Getting video information...")
    
    video_title = get_video_title(video_url)
    
    for i, lang in enumerate(CONFIG["languages"]):
        if progress_callback:
            progress_callback(f"Trying language: {lang.upper()} ({i+1}/{len(CONFIG['languages'])})")
        # Commands are passed as argument lists (shell=False), so the URL
        # reaches yt-dlp as a single argument and "&" cannot split it
        # 1. Try manual subtitles
        try:
            subprocess.run([
                "yt-dlp",
                "--write-sub",
                "--sub-lang", lang,
                "--skip-download",
                "--sub-format", CONFIG["subtitle_format"],
                "-o", output_template,
                video_url
            ], check=True, capture_output=True, text=True)
        except subprocess.CalledProcessError:
            pass
        subtitle_file = output_dir / f"{video_id}.{lang}.{CONFIG['subtitle_format']}"
        if subtitle_file.exists():
            chosen_file = subtitle_file
        else:
            # 2. Try auto-generated subtitles if manual not found
            try:
                subprocess.run([
                    "yt-dlp",
                    "--write-auto-sub",
                    "--sub-lang", lang,
                    "--skip-download",
                    "--sub-format", CONFIG["subtitle_format"],
                    "-o", output_template,
                    video_url
                ], check=True, capture_output=True, text=True)
            except subprocess.CalledProcessError:
                pass
            # yt-dlp names auto subtitles the same way as manual ones
            # ({id}.{lang}.{ext}), so check that pattern first and the
            # ".auto." variant as a fallback
            subtitle_file_auto = output_dir / f"{video_id}.{lang}.{CONFIG['subtitle_format']}"
            if not subtitle_file_auto.exists():
                subtitle_file_auto = output_dir / f"{video_id}.{lang}.auto.{CONFIG['subtitle_format']}"
            if subtitle_file_auto.exists():
                chosen_file = subtitle_file_auto
            else:
                continue
        if progress_callback:
            progress_callback(f"Processing {lang.upper()} subtitles...")
        with open(chosen_file, encoding="utf-8") as f:
            content = f.read()
        chosen_file.unlink()
        lines = []
        # Improved regex patterns for subtitle parsing
        if CONFIG["subtitle_format"] == "vtt":
            # More robust VTT parsing - handles optional styling and notes
            blocks = re.findall(
                r"(\d{2}:\d{2}:\d{2}\.\d{3})\s+-->\s+\d{2}:\d{2}:\d{2}\.\d{3}.*?\n(.*?)\n\n", 
                content, re.DOTALL)
        else:  # SRT format
            # More robust SRT parsing
            blocks = re.findall(
                r"\d+\n(\d{2}:\d{2}:\d{2},\d{3})\s+-->\s+\d{2}:\d{2}:\d{2},\d{3}.*?\n(.*?)\n\n", 
                content, re.DOTALL)
        for start_time, text in blocks:
            # Normalize time format
            time_str = start_time.replace(',', '.')
            try:
                h, m, s = map(float, time_str.split(":"))
                total_seconds = int(h * 3600 + m * 60 + s)
                ts_str = f"{int(h):02}:{int(m):02}:{int(s):02}"
                yt_url = f"https://www.youtube.com/watch?v={video_id}&t={total_seconds}"
                # Better text cleaning
                clean_text = text.strip().replace('\n', ' ')
                clean_text = re.sub(r'<[^>]*>', '', clean_text)  # Remove HTML tags
                clean_text = re.sub(r'\s+', ' ', clean_text)  # Normalize whitespace
                clean_text = clean_text.strip()
                if clean_text:  # Only add non-empty lines
                    lines.append(f"[{ts_str}]({yt_url}) {clean_text}")
            except (ValueError, IndexError):
                # Skip malformed timestamps
                continue
        # Create markdown content
        header = f"# {video_title}\n\n"
        header += f"**Video ID:** {video_id}  \n"
        header += f"**Language:** {lang.upper()}  \n"
        header += f"**URL:** [Watch on YouTube]({video_url})  \n\n"
        header += "---\n\n"
        transcript = header + "\n".join(lines)
        return transcript, video_id, lang, video_title
    available_langs = ", ".join(CONFIG["languages"])
    raise RuntimeError(f"No subtitles found in any of the configured languages: {available_langs}")

def save_transcript(text, video_id, lang, video_title, output_path):
    """Save transcript to markdown file."""
    clean_title = clean_filename(video_title, max_length=50)
    
    filename = CONFIG["output_filename_format"].format(
        video_id=video_id,
        lang=lang,
        title=clean_title
    )
    
    path = Path(output_path) / filename
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)
    return path

def browse_folder(path_var):
    """Open folder browser dialog."""
    folder = filedialog.askdirectory(initialdir=str(BASE_PATH))
    if folder:
        path_var.set(folder)

def update_progress(message, status_label, root):
    """Update progress label."""
    status_label.config(text=message)
    root.update_idletasks()

def on_submit_subs():
    """Handle subtitle extraction."""
    url = subs_url_entry.get().strip()
    output_path = subs_path_var.get().strip()
    dest = subs_dest_var.get()
    
    if not url:
        messagebox.showerror("Input Error", "Please enter a YouTube URL.")
        return
    
    if dest == "file" and not output_path:
        messagebox.showerror("Input Error", "Please select an output folder.")
        return
    
    def run():
        subs_submit_button.config(state="disabled")
        subs_progress_bar.start()
        
        try:
            transcript, video_id, lang, video_title = download_and_parse_subs(
                url, output_path, 
                progress_callback=lambda msg: update_progress(msg, subs_status_label, root)
            )
            if dest == "clipboard":
                update_progress("Copying to clipboard...", subs_status_label, root)
                root.clipboard_clear()
                root.clipboard_append(transcript)
                update_progress("Copied to clipboard!", subs_status_label, root)
                messagebox.showinfo("Success", 
                    f"Transcript copied to clipboard!\n\nLanguage: {lang.upper()}\nTitle: {video_title}")
            else:
                update_progress("Saving transcript...", subs_status_label, root)
                path = save_transcript(transcript, video_id, lang, video_title, output_path)
                update_progress("Success!", subs_status_label, root)
                messagebox.showinfo("Success", 
                    f"Transcript saved successfully!\n\n"
                    f"File: {path.name}\n"
                    f"Language: {lang.upper()}\n"
                    f"Location: {path}")
        except Exception as e:
            update_progress("Error occurred", subs_status_label, root)
            messagebox.showerror("Error", f"Failed to extract subtitles:\n\n{str(e)}")
        finally:
            subs_submit_button.config(state="disabled" if not yt_dlp_available else "normal")
            subs_progress_bar.stop()
            update_progress("Ready", subs_status_label, root)
    
    threading.Thread(target=run, daemon=True).start()

def on_submit_video():
    """Handle video download."""
    url = video_url_entry.get().strip()
    output_path = video_path_var.get().strip()
    max_res = video_res_var.get()
    
    if not url:
        messagebox.showerror("Input Error", "Please enter a YouTube URL.")
        return
    
    if not output_path:
        messagebox.showerror("Input Error", "Please select an output folder.")
        return
    
    def run():
        video_submit_button.config(state="disabled")
        video_progress_bar.start()
        
        try:
            file_path, video_title = download_video(
                url, output_path, max_res,
                progress_callback=lambda msg: update_progress(msg, video_status_label, root)
            )
            update_progress("Success!", video_status_label, root)
            messagebox.showinfo("Success", 
                f"Video downloaded successfully!\n\n"
                f"Title: {video_title}\n"
                f"File: {file_path.name}\n"
                f"Location: {file_path}")
        except Exception as e:
            update_progress("Error occurred", video_status_label, root)
            messagebox.showerror("Error", f"Failed to download video:\n\n{str(e)}")
        finally:
            video_submit_button.config(state="disabled" if not yt_dlp_available else "normal")
            video_progress_bar.stop()
            update_progress("Ready", video_status_label, root)
    
    threading.Thread(target=run, daemon=True).start()

# ==================== UI ====================
root = tk.Tk()
root.title("YouTube Downloader")
root.geometry("800x700")
root.resizable(True, False)

# Check if yt-dlp is available
yt_dlp_available, version = check_yt_dlp()

main_frame = ttk.Frame(root, padding=20)
main_frame.pack(fill=tk.BOTH, expand=True)

# Title
title_label = ttk.Label(main_frame, text="YouTube Downloader", 
                       font=("TkDefaultFont", 16, "bold"))
title_label.pack(pady=(0, 15))

# Status frame
status_frame = ttk.Frame(main_frame)
status_frame.pack(fill=tk.X, pady=(0, 15))

if yt_dlp_available:
    ttk.Label(status_frame, text=f"✓ yt-dlp found (version: {version})", 
              foreground="green").pack(anchor="w")
else:
    ttk.Label(status_frame, text="⚠ yt-dlp not found. Please install yt-dlp first.", 
              foreground="red").pack(anchor="w")

# Create notebook for tabs
notebook = ttk.Notebook(main_frame)
notebook.pack(fill=tk.BOTH, expand=True, pady=(0, 15))

# ==================== SUBTITLES TAB ====================
subs_frame = ttk.Frame(notebook, padding=15)
notebook.add(subs_frame, text="Subtitles")

# --- Output destination radio buttons ---
subs_dest_var = tk.StringVar(value="clipboard")
dest_frame = ttk.Frame(subs_frame)
dest_frame.pack(fill=tk.X, pady=(0, 10))
ttk.Label(dest_frame, text="Output destination:").pack(side=tk.LEFT, padx=(0, 10))
clipboard_radio = ttk.Radiobutton(dest_frame, text="To Clipboard", variable=subs_dest_var, value="clipboard")
clipboard_radio.pack(side=tk.LEFT)
file_radio = ttk.Radiobutton(dest_frame, text="To File", variable=subs_dest_var, value="file")
file_radio.pack(side=tk.LEFT, padx=(10, 0))

# Configuration display for subtitles
subs_config_frame = ttk.LabelFrame(subs_frame, text="Subtitle Configuration", padding=10)
subs_config_frame.pack(fill=tk.X, pady=(0, 15))

langs_text = "Languages (in order): " + " → ".join(CONFIG["languages"])
ttk.Label(subs_config_frame, text=langs_text).pack(anchor="w")
ttk.Label(subs_config_frame, text=f"Format: {CONFIG['subtitle_format'].upper()}").pack(anchor="w")

# URL input for subtitles
subs_url_frame = ttk.LabelFrame(subs_frame, text="YouTube URL", padding=10)
subs_url_frame.pack(fill=tk.X, pady=(0, 10))

subs_url_entry = ttk.Entry(subs_url_frame, font=("TkDefaultFont", 10))
subs_url_entry.pack(fill=tk.X)

# Output path for subtitles
subs_path_frame = ttk.LabelFrame(subs_frame, text="Output Folder", padding=10)
subs_path_frame.pack(fill=tk.X, pady=(0, 15))

subs_path_input_frame = ttk.Frame(subs_path_frame)
subs_path_input_frame.pack(fill=tk.X)

subs_path_var = tk.StringVar(value=str(BASE_PATH))
subs_path_entry = ttk.Entry(subs_path_input_frame, textvariable=subs_path_var)
subs_path_entry.pack(side=tk.LEFT, fill=tk.X, expand=True)

subs_browse_button = ttk.Button(subs_path_input_frame, text="Browse", 
                               command=lambda: browse_folder(subs_path_var))
subs_browse_button.pack(side=tk.RIGHT, padx=(10, 0))

# Submit button for subtitles
subs_submit_button = ttk.Button(subs_frame, text="Extract Subtitles", command=on_submit_subs)
subs_submit_button.pack(pady=15)

if not yt_dlp_available:
    subs_submit_button.config(state="disabled")

# Progress bar for subtitles
subs_progress_bar = ttk.Progressbar(subs_frame, mode='indeterminate')
subs_progress_bar.pack(fill=tk.X, pady=(0, 5))

# Status label for subtitles
subs_status_label = ttk.Label(subs_frame, text="Ready", font=("TkDefaultFont", 10))
subs_status_label.pack(pady=(0, 0))  # Reduced padding between "Ready" and Instructions

# ==================== VIDEO TAB ====================
video_frame = ttk.Frame(notebook, padding=15)
notebook.add(video_frame, text="Video Download")

# Configuration display for video
video_config_frame = ttk.LabelFrame(video_frame, text="Video Configuration", padding=10)
video_config_frame.pack(fill=tk.X, pady=(0, 15))

ttk.Label(video_config_frame, text=f"Max Resolution: {CONFIG['max_resolution']}p").pack(anchor="w")
ttk.Label(video_config_frame, text=f"Format: {CONFIG['video_format'].upper()}").pack(anchor="w")

# Resolution selector
res_frame = ttk.Frame(video_config_frame)
res_frame.pack(fill=tk.X, pady=(5, 0))

ttk.Label(res_frame, text="Max Resolution:").pack(side=tk.LEFT)
video_res_var = tk.StringVar(value=CONFIG["max_resolution"])
res_combo = ttk.Combobox(res_frame, textvariable=video_res_var, values=["1080", "720", "480", "360"],
                        state="readonly", width=10)
res_combo.pack(side=tk.LEFT, padx=(10, 0))

# URL input for video
video_url_frame = ttk.LabelFrame(video_frame, text="YouTube URL", padding=10)
video_url_frame.pack(fill=tk.X, pady=(0, 10))

video_url_entry = ttk.Entry(video_url_frame, font=("TkDefaultFont", 10))
video_url_entry.pack(fill=tk.X)

# Output path for video
video_path_frame = ttk.LabelFrame(video_frame, text="Output Folder", padding=10)
video_path_frame.pack(fill=tk.X, pady=(0, 15))

video_path_input_frame = ttk.Frame(video_path_frame)
video_path_input_frame.pack(fill=tk.X)

video_path_var = tk.StringVar(value=str(BASE_PATH))
video_path_entry = ttk.Entry(video_path_input_frame, textvariable=video_path_var)
video_path_entry.pack(side=tk.LEFT, fill=tk.X, expand=True)

video_browse_button = ttk.Button(video_path_input_frame, text="Browse", 
                                command=lambda: browse_folder(video_path_var))
video_browse_button.pack(side=tk.RIGHT, padx=(10, 0))

# Submit button for video
video_submit_button = ttk.Button(video_frame, text="Download Video", command=on_submit_video)
video_submit_button.pack(pady=15)

if not yt_dlp_available:
    video_submit_button.config(state="disabled")

# Progress bar for video
video_progress_bar = ttk.Progressbar(video_frame, mode='indeterminate')
video_progress_bar.pack(fill=tk.X, pady=(0, 5))

# Status label for video
video_status_label = ttk.Label(video_frame, text="Ready", font=("TkDefaultFont", 10))
video_status_label.pack(pady=(5, 0))

# ==================== INSTRUCTIONS ====================
instructions_frame = ttk.LabelFrame(main_frame, text="Quick Instructions", padding=10)
instructions_frame.pack(fill=tk.X, pady=(2, 0))  # Minimal padding at top

instructions_text = """• Subtitles: Extract subtitles as markdown files for Obsidian
• Video Download: Download videos with customizable resolution limit
• Supports: Regular URLs, youtu.be/, shorts/, playlists, mobile URLs
• Make sure yt-dlp is installed for both features to work"""

if platform.system() == "Linux":
    # Use a scrollable, read-only Text widget for instructions
    instr_text_widget = tk.Text(instructions_frame, height=6, wrap=tk.WORD, font=("TkDefaultFont", 10), state="normal", bg=root.cget('bg'), relief=tk.FLAT, borderwidth=0)
    instr_text_widget.insert("1.0", instructions_text)
    instr_text_widget.config(state="disabled")
    instr_text_widget.pack(fill=tk.BOTH, expand=True, side=tk.LEFT)
    instr_scroll = ttk.Scrollbar(instructions_frame, command=instr_text_widget.yview)
    instr_text_widget.config(yscrollcommand=instr_scroll.set)
    instr_scroll.pack(side=tk.RIGHT, fill=tk.Y)
else:
    ttk.Label(instructions_frame, text=instructions_text, justify=tk.LEFT).pack(anchor="w")

# --- Linux: Double-click select-all for Entry widgets ---
if platform.system() == "Linux":
    def select_all(event):
        event.widget.select_range(0, 'end')
        event.widget.icursor('end')
        return 'break'
    for entry in [subs_url_entry, video_url_entry, subs_path_entry, video_path_entry]:
        entry.bind('<Double-Button-1>', select_all)

root.mainloop()

Edit 160625: Made some adjustments to the script. If you get “subs not found” even though you know they should exist (you 1. uploaded the video to your own YT account and 2. published the auto-generated sub for the language), try selecting a different output folder and then switching back to Downloads, if that’s where you want to download to.

Edit 260625: Subtitles now go to the clipboard by default (faster for all of us) unless ‘To File’ is selected.
Other bug and quality fixes have been made.

BTW, I have seen the Web Clipper widget render smaller in some Obsidian themes…?
Change <iframe height="315" to <iframe width="100%" height="315" in your template(s); for notes you already clipped, run the same replacement in bulk with an external text editor (Notepad++ or anything else).
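If you'd rather script the bulk replacement, here is a minimal sketch (the `VAULT` path is an assumption — point it at your own vault, and back it up first):

```python
from pathlib import Path

# Assumed vault location -- change to your actual vault path
VAULT = Path.home() / "ObsidianVault"
OLD = '<iframe height="315"'
NEW = '<iframe width="100%" height="315"'

def widen_iframes(folder: Path) -> int:
    """Replace the narrow iframe tag in every .md file; return files changed."""
    changed = 0
    for md in folder.rglob("*.md"):
        text = md.read_text(encoding="utf-8")
        if OLD in text:
            md.write_text(text.replace(OLD, NEW), encoding="utf-8")
            changed += 1
    return changed
```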

Some other notes and upgrades of interest

You can add text like this to your prompts: “Detect when a speaker shows slides or draws on a board and mark these with ==Add screenshot here==.”
It works pretty well.


I tried the free DeepSeek API from OpenRouter today.
It looked promising, but its maximum context window is about one sixth of Gemini Flash Thinking Exp’s. It may be okay for videos of 1 hour or less.


In this recent writeup, I offered a different YT player to go with your AI summary work:

Useful for adding details the AI may have missed, or for learning about the topics with the video context available, so you can even jot down your own ideas while the video plays. The window can be dragged and pinned.

I just wanted to let you know your solution is completely awesome. It took me a few goes to get it going, and round-tripping via private YouTube and waiting for transcripts seems like a pain, but the results are spectacular. Your Copilot synthesis prompt with the visuals and colour coding is awesome, with a lot of reuse. I ended up paying to de-watermark that YT downloader app, and Firefox is now something I need on my computers.
Thanks for taking the time to put it out there, and to take feedback and refine it :slight_smile:


Another update here from OP.

I have come here to draw attention to some quirks I’m seeing and how to tackle them.

1

Lately, I’ve been seeing cases where Gemini skips a whole hour in its summary output.
It often happens around the [00:08:00] - [00:11:00] mark.
Even when I call attention to this in an updated prompt, it still happens.

Solutions

  • My statistics are hardly anything to go by, but the 1219 model seems less prone to this than the others.
  • You can keep the output and prompt Gemini in the same chat session to rectify the missing-hour problem. Sometimes this works.
  • It seems better, though, to start over in a new chat session.
  • You can try a different model, like DeepSeek, if it can fit the whole text plus your prompt, and then switch back to Google and tell it: “You didn’t give me a comprehensive summary last time. I need a more elaborate version.”

2

I kind of promised, or rather just alluded to, a post-processor search-and-replace solution to a problem that barely existed at the time but now creeps in more often than not.

Mind you, I don’t do Templater scripts anymore, so the following is a TypeScript script.
Save it as Transcript-Summary-Cleanup.ts or any name you can remember later:

import * as obsidian from 'obsidian';

// 250325 Update: Script now can be used for subsequent runs to update content based on newly added replacement rules

// Simple string-based extractors to avoid complex regex
const extractVideoId = (content: string): string | null => {
    // Extract source line
    const lines = content.split('\n');
    let sourceLine = '';
    
    for (const line of lines) {
        if (line.trim().startsWith('source:')) {
            sourceLine = line;
            break;
        }
    }
    
    if (!sourceLine) return null;
    
    // Handle YouTube URLs
    if (sourceLine.includes('youtube.com/watch?v=')) {
        const parts = sourceLine.split('youtube.com/watch?v=');
        if (parts.length > 1) {
            const id = parts[1].split(/[&"\s]/)[0];
            console.log("Found YouTube video ID:", id);
            return id;
        }
    }
    
    if (sourceLine.includes('youtu.be/')) {
        const parts = sourceLine.split('youtu.be/');
        if (parts.length > 1) {
            const id = parts[1].split(/[&"\s]/)[0];
            console.log("Found YouTube video ID:", id);
            return id;
        }
    }
    
    // Handle Rumble URLs
    if (sourceLine.includes('rumble.com/')) {
        const parts = sourceLine.split('rumble.com/');
        if (parts.length > 1) {
            // Extract the full path after rumble.com/
            const path = parts[1].split(/["'\s]/)[0];
            console.log("Found Rumble path:", path);
            return path; // Return full path for Rumble
        }
    }
    
    // Handle Videa URLs
    if (sourceLine.includes('videa.hu/videok/')) {
        const parts = sourceLine.split('videa.hu/videok/');
        if (parts.length > 1) {
            const pathParts = parts[1].split('?')[0].split(/["'\s]/)[0];
            console.log("Found Videa path:", pathParts);
            return pathParts;
        }
    }
    
    return null;
};

// Helper to determine video platform
const getVideoPlatform = (content: string): { platform: 'youtube' | 'rumble' | 'videa' | null, id: string | null } => {
    // Extract source line
    const lines = content.split('\n');
    let sourceLine = '';
    
    for (const line of lines) {
        if (line.trim().startsWith('source:')) {
            sourceLine = line;
            break;
        }
    }
    
    // Check for YouTube in source
    if (sourceLine.includes('youtube.com') || sourceLine.includes('youtu.be')) {
        return { platform: 'youtube', id: extractVideoId(content) };
    }
    
    // Check for Rumble in source
    if (sourceLine.includes('rumble.com')) {
        return { platform: 'rumble', id: extractVideoId(content) };
    }

    // Check for Videa in source
    if (sourceLine.includes('videa.hu')) {
        return { platform: 'videa', id: extractVideoId(content) };
    }
    
    return { platform: null, id: null };
};

// Parse timestamp to seconds
const parseTimestamp = (timestamp: string): number => {
    const parts = timestamp.split(':').map(Number);
    if (parts.length === 2) {
        return parts[0] * 60 + parts[1];
    } else if (parts.length === 3) {
        return parts[0] * 3600 + parts[1] * 60 + parts[2];
    }
    return 0;
};

// Convert seconds to timestamp format
const secondsToTimestamp = (seconds: number): string => {
    const hours = Math.floor(seconds / 3600);
    const minutes = Math.floor((seconds % 3600) / 60);
    const secs = seconds % 60;
    
    if (hours > 0) {
        return `${hours.toString().padStart(2, '0')}:${minutes.toString().padStart(2, '0')}:${secs.toString().padStart(2, '0')}`;
    } else {
        return `${minutes.toString().padStart(2, '0')}:${secs.toString().padStart(2, '0')}`;
    }
};

// NEW FUNCTION: Fix incorrect timestamps based on URL seconds parameter
const fixIncorrectTimestamps = (content: string): string => {
    let result = content;
    
    // Match timestamp links with pattern [HH:MM:SS](URL with t= or start= parameter)
    const timestampLinkRegex = /\[(\d{1,2}:\d{2}(?::\d{2})?)\]\((https?:\/\/[^)]*?[?&](?:t|start)=(\d+)[^)]*?)\)/g;
    
    let match;
    const replacements = [];
    
    while ((match = timestampLinkRegex.exec(result)) !== null) {
        const [fullMatch, displayedTimestamp, url, secondsParam] = match;
        const seconds = parseInt(secondsParam);
        const correctTimestamp = secondsToTimestamp(seconds);
        
        // Only replace if the displayed timestamp doesn't match the URL seconds
        if (displayedTimestamp !== correctTimestamp) {
            console.log(`Fixing timestamp: [${displayedTimestamp}] -> [${correctTimestamp}] (${seconds} seconds)`);
            replacements.push({
                original: fullMatch,
                fixed: `[${correctTimestamp}](${url})`
            });
        }
    }
    
    // Apply all replacements
    for (const replacement of replacements) {
        result = result.replace(replacement.original, replacement.fixed);
    }
    
    return result;
};

// Fix existing timestamp links with simpler string operations
const fixExistingTimestampLinks = (content: string, platform: 'youtube' | 'rumble' | 'videa' | null, id: string | null): string => {
    if (!id) return content;
    
    const lines = content.split('\n');
    
    for (let i = 0; i < lines.length; i++) {
        const line = lines[i];
        
        // Look for timestamp patterns like [01:08](any_url)
        if (line.includes('[') && line.includes('](')) {
            
            // Collect all timestamp link matches in this line
            let modifiedLine = line;
            let bracketIndex = modifiedLine.indexOf('[');
            
            while (bracketIndex !== -1) {
                const closeBracketIndex = modifiedLine.indexOf(']', bracketIndex);
                if (closeBracketIndex === -1) break;
                
                const timestampText = modifiedLine.substring(bracketIndex + 1, closeBracketIndex);
                // Check if this is a timestamp format
                if (/^\d{1,2}:\d{2}(:\d{2})?$/.test(timestampText)) {
                    const linkOpenIndex = modifiedLine.indexOf('(', closeBracketIndex);
                    if (linkOpenIndex === -1) break;
                    
                    const linkCloseIndex = modifiedLine.indexOf(')', linkOpenIndex);
                    if (linkCloseIndex === -1) break;
                    
                    // Extract the old URL and seconds
                    const oldUrl = modifiedLine.substring(linkOpenIndex + 1, linkCloseIndex);
                    let seconds = 0;
                    
                    // Extract seconds from any supported platform URL
                    if (oldUrl.includes('youtube.com') || oldUrl.includes('youtu.be')) {
                        const timeMatches = oldUrl.match(/[?&]t=(\d+)/);
                        if (timeMatches && timeMatches[1]) {
                            seconds = parseInt(timeMatches[1]);
                        } else {
                            seconds = parseTimestamp(timestampText);
                        }
                    } else if (oldUrl.includes('rumble.com')) {
                        const timeMatches = oldUrl.match(/[?&]start=(\d+)/);
                        if (timeMatches && timeMatches[1]) {
                            seconds = parseInt(timeMatches[1]);
                        } else {
                            seconds = parseTimestamp(timestampText);
                        }
                    } else if (oldUrl.includes('videa.hu')) {
                        const timeMatches = oldUrl.match(/[?&]start=(\d+)/);
                        if (timeMatches && timeMatches[1]) {
                            seconds = parseInt(timeMatches[1]);
                        } else {
                            seconds = parseTimestamp(timestampText);
                        }
                    } else {
                        seconds = parseTimestamp(timestampText);
                    }
                    
                    let newUrl = '';
                    
                    // Convert to the target platform format
                    if (platform === 'youtube') {
                        newUrl = `https://www.youtube.com/watch?v=${id}&t=${seconds}`;
                    } else if (platform === 'rumble') {
                        newUrl = `https://rumble.com/${id}${seconds > 0 ? `?start=${seconds}` : ''}`;
                    } else if (platform === 'videa') {
                        newUrl = `https://videa.hu/videok/${id}${seconds > 0 ? `?start=${seconds}` : ''}`;
                    }
                    
                    // Replace the entire link if we have a new URL
                    if (newUrl) {
                        const newLink = `[${timestampText}](${newUrl})`;
                        modifiedLine = modifiedLine.substring(0, bracketIndex) + 
                                      newLink + 
                                      modifiedLine.substring(linkCloseIndex + 1);
                    }
                }
                
                // Find next bracket
                bracketIndex = modifiedLine.indexOf('[', bracketIndex + 1);
            }
            
            lines[i] = modifiedLine;
        }
    }
    
    return lines.join('\n');
};

// Link unlinked timestamps
const linkUnlinkedTimestamps = (content: string, platform: 'youtube' | 'rumble' | 'videa' | null, id: string | null): string => {
    if (!id) return content;
    
    const lines = content.split('\n');
    
    for (let i = 0; i < lines.length; i++) {
        const line = lines[i];
        let modifiedLine = line;
        
        // Match timestamp ranges like [01:08-01:45]
        const rangeMatches = [];
        const rangeRegex = /\[(\d{1,2}:\d{2})-(\d{1,2}:\d{2})\](?!\()/g;
        let match;
        
        while ((match = rangeRegex.exec(modifiedLine)) !== null) {
            rangeMatches.push({
                fullMatch: match[0],
                startTime: match[1],
                endTime: match[2],
                index: match.index
            });
        }
        
        // Process matches in reverse to avoid index shifting
        for (let j = rangeMatches.length - 1; j >= 0; j--) {
            const { fullMatch, startTime, endTime, index } = rangeMatches[j];
            const seconds = parseTimestamp(startTime);
            
            let newUrl = '';
            if (platform === 'youtube') {
                newUrl = `https://www.youtube.com/watch?v=${id}&t=${seconds}`;
            } else if (platform === 'rumble') {
                newUrl = `https://rumble.com/${id}${seconds > 0 ? `?start=${seconds}` : ''}`;
            } else if (platform === 'videa') {
                newUrl = `https://videa.hu/videok/${id}${seconds > 0 ? `?start=${seconds}` : ''}`;
            }
            
            if (newUrl) {
                const replacement = `[${startTime}-${endTime}](${newUrl})`;
                modifiedLine = modifiedLine.substring(0, index) + 
                              replacement + 
                              modifiedLine.substring(index + fullMatch.length);
            }
        }
        
        // Match simple timestamps like [01:08]
        const simpleMatches = [];
        const simpleRegex = /\[(\d{1,2}:\d{2})\](?!\()/g;
        
        while ((match = simpleRegex.exec(modifiedLine)) !== null) {
            simpleMatches.push({
                fullMatch: match[0],
                timestamp: match[1],
                index: match.index
            });
        }
        
        // Process matches in reverse to avoid index shifting
        for (let j = simpleMatches.length - 1; j >= 0; j--) {
            const { fullMatch, timestamp, index } = simpleMatches[j];
            const seconds = parseTimestamp(timestamp);
            
            let newUrl = '';
            if (platform === 'youtube') {
                newUrl = `https://www.youtube.com/watch?v=${id}&t=${seconds}`;
            } else if (platform === 'rumble') {
                newUrl = `https://rumble.com/${id}${seconds > 0 ? `?start=${seconds}` : ''}`;
            } else if (platform === 'videa') {
                newUrl = `https://videa.hu/videok/${id}${seconds > 0 ? `?start=${seconds}` : ''}`;
            }
            
            if (newUrl) {
                const replacement = `[${timestamp}](${newUrl})`;
                modifiedLine = modifiedLine.substring(0, index) + 
                              replacement + 
                              modifiedLine.substring(index + fullMatch.length);
            }
        }
        
        lines[i] = modifiedLine;
    }
    
    return lines.join('\n');
};

const applyStringReplacements = (content: string): string => {
    let updatedContent = content;
    const replacements = [
        // Convert smart quotes to regular quotes
        {
            from: /[„”]/g,
            to: '"'
        },
        {
            from: /<sil>/g,
            to: ''
        },
        // Add more mistakes you find along the way
        {
            from: /placeholder1a/gi,
            to: 'placeholder1a'
        },
        // Exchange faulty ending bracket (sometimes when Google is fast, this error happens)
        {
            from: /(\[(?:\d{1,2}:)?\d{2}:\d{2}\]\(https?:\/\/[^)]*?)\]/g,
            to: '$1)'
        },
        // Add newlines before callouts - simplified
        {
            from: /([^\n])\n(>\s*\[!(?:check|important|fail|note|question|example)\])/g,
            to: '$1\n\n$2'
        },
        // Remove any markdown code blocks and trailing backticks
        {
            from: /```(?:markdown)?\n?([\s\S]*?)(?:```|$)/g,
            to: '$1'
        },
        // Deordinalize numbers 1,2
        {
            from: /^(>\s{1}[0-9]{1,5})([.])/gm,
            to: '$1\\$2'
        },
        {
            from: /^([0-9]{1,5})([.])/gm,
            to: '$1\\$2'
        },
        // Fallback: link bare timestamps using an existing timestamp link as a template
        {
            from: /\[(\d{1,2}(?::\d{2}){1,2})\](?!\()/g,
            to: (match, timestamp) => {
                // Find any existing timestamp link in the content to use as template
                const templateMatch = updatedContent.match(/\[(\d{1,2}(?::\d{2}){1,2})\]\((https?:\/\/[^\s)]+)\)/);
                if (templateMatch) {
                    const [_, templateTime, templateUrl] = templateMatch;
                    // Extract base URL and time parameter
                    const urlBase = templateUrl.split('?')[0];
                    const seconds = parseTimestamp(timestamp);
                    
                    // Determine time parameter format based on URL
                    let timeParam = '';
                    if (templateUrl.includes('youtube.com') || templateUrl.includes('youtu.be')) {
                        timeParam = `?t=${seconds}`;
                    } else if (templateUrl.includes('rumble.com')) {
                        timeParam = `?start=${seconds}`;
                    } else if (templateUrl.includes('videa.hu')) {
                        timeParam = `?start=${seconds}`;
                    }
                    
                    return `[${timestamp}](${urlBase}${timeParam})`;
                }
                return match; // Keep original if no template found
            }
        }
    ];

    for (const {from, to} of replacements) {
        // Convert string or regex to RegExp if needed
        const regex = from instanceof RegExp ? from : new RegExp(from);
        
        // Test if there are any matches before replacing
        if (regex.test(updatedContent)) {
            console.log(`Found matches for pattern: ${regex}`);
            // Reset regex lastIndex
            regex.lastIndex = 0;
            // Apply replacement
            updatedContent = updatedContent.replace(regex, to);
        }
    }
    
    return updatedContent;
};

// Helper to convert SBV time string to seconds
function sbvTimeToSeconds(time: string): number {
    // Handles SS.mmm, M:SS.mmm, MM:SS.mmm, H:MM:SS.mmm
    const parts = time.split(':');
    if (parts.length === 3) {
        return parseInt(parts[0]) * 3600 + parseInt(parts[1]) * 60 + parseFloat(parts[2].replace(',', '.'));
    } else if (parts.length === 2) {
        return parseInt(parts[0]) * 60 + parseFloat(parts[1].replace(',', '.'));
    } else if (parts.length === 1) {
        return parseFloat(parts[0].replace(',', '.'));
    }
    return 0;
}

// Helper to convert SRT time components to seconds (milliseconds are intentionally dropped)
function srtTimeToSeconds(hh: string, mm: string, ss: string, _ms: string): number {
    return parseInt(hh) * 3600 + parseInt(mm) * 60 + parseInt(ss);
}

const transcriptSummaryCleanup = async (app: obsidian.App): Promise<void> => {
    const currentFile = app.workspace.getActiveFile();
    if (!currentFile) return;

    let fileContent = await app.vault.read(currentFile);
    const videoId = extractVideoId(fileContent);

    let transcriptConverted = false;

    // --- SUBTITLE BLOCK CLEANUP (SBV, SRT, VTT) ---
    // SBV handler: [hh:mm:ss](YouTube link) text for each SBV block
    const sbvBlockRegex = /^([\d:]+\.\d{3}),[\d:]+\.\d{3}\r?\n([\s\S]*?)(?=^[\d:]+\.\d{3},[\d:]+\.\d{3}|$)/gm;
    if (sbvBlockRegex.test(fileContent)) {
        transcriptConverted = true;
        fileContent = fileContent.replace(sbvBlockRegex, (match, timestamp, text) => {
            const seconds = Math.floor(sbvTimeToSeconds(timestamp));
            const hhmmss = secondsToTimestamp(seconds);
            const lineText = text.replace(/\r?\n/g, ' ').trim();
            return `[${hhmmss}](https://www.youtube.com/watch?v=${videoId}&t=${seconds}) ${lineText}\n`;
        });
    }

    // SRT handler: [hh:mm:ss](YouTube link) text for each SRT block
    const srtBlockRegex = /^\d+\r?\n(\d{2}:\d{2}:\d{2}),(\d{3}) --> [^\r\n]+\r?\n([\s\S]*?)(?=^\d+\r?\n\d{2}:\d{2}:\d{2},\d{3} --> |$)/gm;
    if (srtBlockRegex.test(fileContent)) {
        transcriptConverted = true;
        fileContent = fileContent.replace(srtBlockRegex, (match, timestamp, ms, text) => {
            const [hh, mm, ss] = timestamp.split(':');
            const seconds = Math.floor(srtTimeToSeconds(hh, mm, ss, ms));
            const hhmmss = secondsToTimestamp(seconds);
            const lineText = text.replace(/\r?\n/g, ' ').trim();
            return `[${hhmmss}](https://www.youtube.com/watch?v=${videoId}&t=${seconds}) ${lineText}\n`;
        });
    }

    // VTT handler: preserve everything above WEBVTT, keep first 3 lines as anchor, process transcript after, then remove anchor lines
    if (/^WEBVTT/m.test(fileContent)) {
        transcriptConverted = true;
        const webvttIdx = fileContent.indexOf('WEBVTT');
        const beforeVtt = fileContent.slice(0, webvttIdx);
        const vttAndAfter = fileContent.slice(webvttIdx);
        const vttLinesArr = vttAndAfter.split(/\r?\n/);
        const transcriptContent = vttLinesArr.slice(3).join('\n');
        const vttBlockRegex = /^(\d{2}:\d{2}:\d{2}\.\d{3}) --> [^\r\n]+\r?\n([\s\S]*?)(?=^\d{2}:\d{2}:\d{2}\.\d{3} --> |$)/gm;
        let vttLines: string[] = [];
        transcriptContent.replace(vttBlockRegex, (match, timestamp, text) => {
            const timeParts = timestamp.split(':');
            const seconds = Math.floor(parseInt(timeParts[0]) * 3600 + parseInt(timeParts[1]) * 60 + parseFloat(timeParts[2]));
            const hhmmss = secondsToTimestamp(seconds);
            let lineText = text.replace(/<[^>]+>/g, '').replace(/\r?\n/g, ' ').trim();
            if (!lineText) return '';
            vttLines.push(`[${hhmmss}](https://www.youtube.com/watch?v=${videoId}&t=${seconds}) ${lineText}`);
            return '';
        });
        vttLines = vttLines.filter(line => !/^[\[]\d{2}:\d{2}(?::\d{2})?\][^)]*\)\s*$/.test(line));
        let deduped: string[] = [];
        let lastText = '';
        for (const line of vttLines) {
            const textPart = line.replace(/^\[\d{2}:\d{2}(?::\d{2})?\]\([^)]*\)\s*/, '');
            if (textPart && textPart !== lastText) {
                deduped.push(line);
                lastText = textPart;
            }
        }
        fileContent = beforeVtt + deduped.join('\n') + '\n';
    }

    // Remove double newlines from the area starting with the first clickable timestamp line to the end
    if (transcriptConverted) {
        const firstClickableIdx = fileContent.search(/^\[/m);
        if (firstClickableIdx !== -1) {
            const before = fileContent.slice(0, firstClickableIdx);
            const after = fileContent.slice(firstClickableIdx).replace(/\n\n+/g, '\n');
            fileContent = before + after;
        }
    }

    // --- AI summary cleaner and string replacements follow as before ---
    let updatedContent = applyStringReplacements(fileContent);
    updatedContent = fixIncorrectTimestamps(updatedContent);
    const { platform, id } = getVideoPlatform(updatedContent);
    console.log("Detected video platform:", platform, "with ID:", id);
    if (platform && id) {
        updatedContent = fixExistingTimestampLinks(updatedContent, platform, id);
        updatedContent = linkUnlinkedTimestamps(updatedContent, platform, id);
    }
    await app.vault.modify(currentFile, updatedContent);
};

export class TranscriptSummaryCleanupPlugin extends obsidian.Plugin {
    async onload() {
        this.addCommand({
            id: 'transcript-summary-cleanup',
            name: 'Transcript Summary Cleanup',
            callback: async () => await transcriptSummaryCleanup(this.app)
        });
    }
}

export async function invoke(app: obsidian.App): Promise<void> {
    return transcriptSummaryCleanup(app);
}

Using your File Manager/Explorer, save it in the folder you specify in the settings of the CodeScript Toolkit plugin, which you need to install and enable.

  • You can find some info on how to do this in the top part of the guide I share here.

That plugin will handle running the script.


Transcript Summary Cleanup Script – Quick Guide

What does this do?
This script fixes up messy transcript summaries in your Obsidian notes, especially those with time-stamped links to YouTube, Rumble or Videa videos.

1. Fixes Broken Timestamp Links

Ever see stuff like this in your notes?

[01:23](]

or

[01:23]

instead of a proper clickable link?

This script:

  • Fixes broken links like [01:23](] and turns them into real links.
  • Finds orphan timestamps like [01:23] and makes them clickable, using the video link from your source: line at the top of the note.

2. Makes All Timestamps Clickable

If you set up Web Clipper as I showed in the first post, you will have a source property with a URL, which is what this script uses to rebuild the orphaned links. No need to re-prompt the AI for this, as it is easy computing. So if your note has:

source: https://www.youtube.com/watch?v=abc123

and a bunch of [01:23] stamps with no actual links, the script will turn them into:

[01:23](https://www.youtube.com/watch?v=abc123&t=83)

It works for YouTube, Rumble and Videa (didn’t test this last one though).
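The rebuild itself is simple arithmetic, the same parseTimestamp logic the script above uses, shown here standalone for the YouTube case:

```typescript
// Standalone version of the script's timestamp arithmetic:
// "MM:SS" or "HH:MM:SS" -> seconds -> clickable YouTube link.
const parseTimestamp = (timestamp: string): number => {
    const parts = timestamp.split(':').map(Number);
    if (parts.length === 2) return parts[0] * 60 + parts[1];
    if (parts.length === 3) return parts[0] * 3600 + parts[1] * 60 + parts[2];
    return 0;
};

const linkTimestamp = (stamp: string, videoId: string): string =>
    `[${stamp}](https://www.youtube.com/watch?v=${videoId}&t=${parseTimestamp(stamp)})`;

console.log(linkTimestamp('01:23', 'abc123'));
// [01:23](https://www.youtube.com/watch?v=abc123&t=83)
```

Rumble and Videa work the same way, only with a ?start= parameter instead of &t=.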


3. Fixes Displayed Timestamps

If you have a link like:

[01:23](https://www.youtube.com/watch?v=abc123&t=99)

but the timestamp and the link don’t match, the script will fix the label to match the actual time in the link.
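The fix only goes in one direction: the seconds parameter in the URL is treated as the ground truth, and the visible label is regenerated from it. A standalone sketch of the same logic the script’s fixIncorrectTimestamps uses:

```typescript
// The URL's t= parameter is the source of truth; the displayed
// label is regenerated from it (mirrors secondsToTimestamp above).
const secondsToTimestamp = (seconds: number): string => {
    const hours = Math.floor(seconds / 3600);
    const minutes = Math.floor((seconds % 3600) / 60);
    const secs = seconds % 60;
    const pad = (n: number) => n.toString().padStart(2, '0');
    return hours > 0
        ? `${pad(hours)}:${pad(minutes)}:${pad(secs)}`
        : `${pad(minutes)}:${pad(secs)}`;
};

// Rewrite the label of one [MM:SS](...t=N...) link to match N.
const fixLabel = (link: string): string =>
    link.replace(
        /\[(\d{1,2}:\d{2}(?::\d{2})?)\]\((https?:\/\/[^)]*?[?&]t=(\d+)[^)]*)\)/,
        (_m, _label, url, secs) => `[${secondsToTimestamp(parseInt(secs))}](${url})`
    );

console.log(fixLabel('[01:23](https://www.youtube.com/watch?v=abc123&t=99)'));
// [01:39](https://www.youtube.com/watch?v=abc123&t=99)
```

Links whose label already matches come out unchanged.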


4. Other Cleanups

  • Converts smart quotes to regular quotes if you can dig that.
  • Removes weird <sil> tags that break formatting and stray code blocks.
  • (If you want more, add your own rules!)

How to Use

  1. Do the steps above (install plugin, save script in folder).
  2. Position your cursor anywhere in the markdown file with your summary.
  3. Run the “Transcript Summary Cleanup” command from the Command Palette.

That’s it.
If you spot a summary with broken or non-clickable timestamps, just run this script and it’ll sort things out.
If you want more fixes you see along the way, add your own rules in the applyStringReplacements section.

        // Add more mistakes you find along the way
        {
            from: /placeholder1a/gi,
            to: 'placeholder1a'
        },

This is handy when you often do summaries from the same performer or channel and the same names or expressions crop up with the same misspellings and you do not want to rectify these one by one.
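To be clear about the shape of such rules, here is a tiny self-contained sketch; the misspellings below are made-up examples, so swap in whatever your transcripts actually get wrong.

```typescript
// Hypothetical examples - the names/terms here are invented.
// Each rule is a regex plus its literal replacement, same shape
// as the entries in applyStringReplacements.
const customReplacements = [
    // A speaker's name the transcriber keeps mangling:
    { from: /Joseph Cambell/gi, to: 'Joseph Campbell' },
    // A term that comes out phonetically:
    { from: /obsidian web clip+er/gi, to: 'Obsidian Web Clipper' },
];

const applyCustom = (text: string): string =>
    customReplacements.reduce((acc, { from, to }) => acc.replace(from, to), text);

console.log(applyCustom('joseph cambell wrote the book'));
// Joseph Campbell wrote the book
```

The /gi flags make the rules catch every occurrence regardless of capitalization, which is what you want for recurring transcription errors.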


UPDATE

Update on ts file!

Sometimes you cannot get the clickable timestamps in any way (???)…
Then you need to go to YouTube and download your subs in one of the formats it allows: srt, sbv or vtt.

Download any one of these and copy its contents into your markdown file.
Make sure your source property has the YouTube link from the Web Clipper.

Now you can run this .ts helper file before any AI summary has even come in.
The script can parse your vtt/sbv/srt file content (make sure you copy the full file contents into your md file) and convert it to the desired format: text with clickable timestamps.
If it’s vtt you downloaded, make sure you have:

WEBVTT
Kind: captions
Language: [whatever]

on top.
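For a feel of what the VTT branch does, here is a stripped-down sketch converting a single cue, assuming a YouTube source (the cue text is a made-up example):

```typescript
// Minimal sketch of the script's VTT handling: one cue in,
// one linked transcript line out.
const cueToLine = (vttCue: string, videoId: string): string | null => {
    // "HH:MM:SS.mmm --> HH:MM:SS.mmm" followed by the cue text.
    const m = vttCue.match(/^(\d{2}):(\d{2}):(\d{2})\.\d{3} --> [^\n]+\n([\s\S]+)/);
    if (!m) return null;
    const seconds = parseInt(m[1]) * 3600 + parseInt(m[2]) * 60 + parseInt(m[3]);
    // Strip inline tags like <c> and collapse line breaks, as the script does.
    const text = m[4].replace(/<[^>]+>/g, ' ').replace(/\s+/g, ' ').trim();
    const mm = String(Math.floor(seconds / 60)).padStart(2, '0');
    const ss = String(seconds % 60).padStart(2, '0');
    return `[${mm}:${ss}](https://www.youtube.com/watch?v=${videoId}&t=${seconds}) ${text}`;
};

const cue = '00:01:23.000 --> 00:01:26.500\nwelcome back to the channel';
console.log(cueToLine(cue, 'abc123'));
// [01:23](https://www.youtube.com/watch?v=abc123&t=83) welcome back to the channel
```

The full script additionally de-duplicates the rolling repeated lines YouTube’s auto-captions produce.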

VTT conversion in action (demo GIF attached in the forum post).

Then, yes, this same script can be used for cleanup once the summary is brought in.

A verified solution was to rewrite the prompt (here I give a rough English version of my native one):

Take your time to thoroughly analyze the content – we have all the time in the world. Summarize {activeNote} in Markdown format. Use appropriate H1 to H4 headers.

### QUALITY ASSURANCE:

#### **CRITICAL: SEQUENTIAL PROCESSING RULE**
**MANDATORY**: Process the transcript in strict chronological order from start to finish. Never skip time periods.
- Start with the earliest timestamp (usually 00:00:xx)
- Process each timestamp in sequence: 00:01:xx, 00:02:xx, etc.
- When reaching 00:59:xx, the next timestamp should be 01:00:xx (not 01:59:xx)
- **NEVER jump from early minutes (00:0x:xx) directly to later hours (01:0x:xx)**
- If you reference a later point, return immediately to where you left off chronologically

**Self-Check**: Before writing each section, verify the timestamp follows logically from the previous one.
**BEFORE processing each timestamp, verify:**
1. Does this timestamp come IMMEDIATELY after the previous one chronologically?
2. If there's a gap longer than 5 minutes, STOP and state "Detected timestamp gap - please verify sequence"
3. Example: 00:08:45 → 00:12:30 ✓ (normal gap) | 00:08:45 → 01:12:30 ✗ (invalid jump)

### Priming

The transcripts come from lectures and presentations done by art historians, historians, archeologists, linguists, astronomers, tech experts and various researchers where presenters may show slides, draw on boards, etc.  
If you detect places where I should manually insert screenshots from the video, add:  
==**SCREENSHOT HERE:**==
to highlight the area for me to add a screenshot to.  

Insert callout boxes where relevant to expand context. Use the following types exactly as described (both type and content of box must be preceded with `> ` as you can see) and **always** keeping empty lines between these elements:

> [!check]
> For key insights or useful information.

> [!important]
> For critical points or must-remember details.

> [!fail]
> For warnings, dangers, or failures.

> [!note]
> For notable quotes or poetic/philosophical reflections.

> [!question]
> For thought-provoking or discussion-worthy questions.

> [!example]
> For extended exploration, implications, and creative ideas.

Distinguish **question** and **example** boxes: Question relates closely to the topic, while Example allows deeper, more philosophical or spiritual extensions.

### Formatting & Processing Rules:
- Ensure logical flow.
- Avoid redundant introductions and outros.
- Ignore irrelevant sections (disclaimers, social media plugs, intro/outro without subject relevance).
- Must preserve clickable timestamps for reference.
- Ensure proper Markdown rendering: Every callout box **must** be separated from other content and other callouts by exactly ONE blank line, both before and after.  
- Summary length should match content depth - expand when meaningful, condense when generic.
	- It is important, though, to draw attention to potential slides or drawings the presenter displays or creates. In addition to flagging screenshots I should make, properly analyze these subjects as well: if the presenter (art historian, archeologist, etc.) found something important enough to show, it is important for us too.
	Take your time to work through the content systematically from beginning to end. Organize headings based on the natural progression of topics as they appear, rather than attempting to reorganize by theme.

### PROCESSING CHECKPOINT:
Before creating final sections, verify:
- [ ] Covered timestamps from 00:00:xx to final timestamp without hour-long gaps
- [ ] No jumps from 00:0x:xx directly to 01:0x:xx or 02:0x:xx
- [ ] All referenced content has been properly processed in sequence

### OUTPUT INSTRUCTIONS:
- Use only Markdown and consistently.
- Ensure proper sentence capitalization (uppercase letters **must** start a sentence!).
- Output only the requested sections - no additional notes or warnings.

- Maintain efficiency and consistency throughout. Use timestamps with exact clickable format as they appear in the text so each notion has its easily clicked reference.  
	- Make sure the stamps are correct: they cannot exceed the length of the video!!!
- Do NOT use escape characters before double quotes. 

### FINAL SECTIONS:
At the end, create these structured bullet-point sections:

**IDEAS** - Minimum 10 distilled key insights.
**QUOTES** - Minimum 10 impactful direct quotes. Here, use capital letters to start sentences and double quotes, not backticks!
Make sure you also fix obvious spelling errors here; we should not copy the transcribed line word for word as it may contain syntax errors.
**TIPS** - Minimum 10 practical takeaways or awareness raisers.
**REFERENCES** - List of relevant books, tools, projects, people and inspirations.
You can use nested lists with bolded Individuals, Books/Works, Locations, Inspirations, Concepts/Symbols, etc., with the examples below, like so:  
- **Individuals**:
	- William Shakespeare (Playwright)
	- Queen Elizabeth I (Monarch)
	- Jane Austen (Novelist)
	- Winston Churchill (Statesman)
- **Books/Works/Lectures**:
	- William Shakespeare: *Hamlet*, *Romeo and Juliet*
	- Jane Austen: *Pride and Prejudice*
	- Charles Dickens: *Great Expectations*
	- J.R.R. Tolkien: *The Hobbit*
	- The Magna Carta (Historic Document)
- **Organizations/Projects**:
	- Royal Shakespeare Company
	- National Trust
	- British Library
- **Locations**:
	- London (Capital City)
	- Stratford-upon-Avon (Shakespeare's Birthplace)
	- Stonehenge (Prehistoric Monument)
	- The White Cliffs of Dover
- **Inspirations**:
	- British Parliamentary Democracy
	- The Elizabethan Era
	- Victorian Industrial Revolution
	- The English Countryside
	- Arthurian Legends
- **Concepts/Symbols**:
	- The Union Jack (Flag)
	- Big Ben (Clock Tower)
	- The Lion and Unicorn (Heraldic Symbols)
	- Common Law

Do NOT hard-code examples from the above list!
Make sure Inspirations doesn't include negative ideas.  
Make sure the Concepts/Symbols list doesn't include everyday notions or specific mundane things; limit it to the dozen or so most important items, in keeping with the messages of the content.  

Copilot 3.0 (out 260825) doesn’t recognize the {activeNote} variable…?
Hold off on updates until it gets sorted.

Also, in a later WebClipper version, as mentioned…

…your space-separated tags values will be deemed illegal.
For now, you need to add commas between tags values in your templates.

If this is intended and doesn’t get reverted, all the tags values you added in the past will need to be replaced using a regex-capable text editor.