Find similar notes (Python script)

Hey all,

So I had a first pass at what the behaviour for such a plugin could be with some user stories.
Let me know what you think.

Obsidian Python Plugin User Stories

MVP

Python script using a single note

As a user
When I’m working on a note
I want to be able to run python scripts that will use that note as input
So that I see and potentially paste the output based on the title, text and metadata of that note

Passing variables to Python script

As a user
I would like the following variables passed to an activated python script while working on a note: filename, filepath, contents of the note, date created, date modified
So that I can easily use these variables in my script
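
One way these variables could reach a script is as a JSON object on standard input. This is only a minimal sketch: the field names (`filename`, `filepath`, `contents`, `created`, `modified`) and the `NoteContext` class are assumptions made for illustration, not an existing plugin API.

```python
import json
from dataclasses import dataclass


@dataclass
class NoteContext:
    """Hypothetical bundle of the variables a plugin could pass to a script."""
    filename: str
    filepath: str
    contents: str
    created: str   # assumed to be an ISO 8601 timestamp
    modified: str  # assumed to be an ISO 8601 timestamp


def parse_note_context(payload: str) -> NoteContext:
    """Build a NoteContext from a JSON string (e.g. read from sys.stdin)."""
    data = json.loads(payload)
    # Keep only the known fields so extra keys do not break the script
    return NoteContext(**{k: data[k] for k in NoteContext.__dataclass_fields__})


def word_count(note: NoteContext) -> str:
    """Example script body: report the number of words in the note."""
    return f"{note.filename}: {len(note.contents.split())} words"
```

A script would then call `parse_note_context(sys.stdin.read())` and print whatever it wants pasted into the note.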

Python script output

As a user
After I run a python script and get an output
I would like the output to be pasted into my active note
So that I can easily see the output or enhance my current active note

Python script management

As a user
I want to be able to add or remove python scripts used in my Obsidian vault
So that I can easily manage which scripts are available to use

Python script activation

As a user
I want to be able to trigger a python script using a keyboard shortcut or button while working on an active note
So that I can easily see or paste the output of the python script without having to leave my note

V1

Select Python Version

As a user
I would like to configure which python version installed on my machine my script will use
So that I can easily experiment in the same environment using terminal or VSCode without having to use Obsidian all the time

Custom variables for Python Script

As a user
I would like to be able to pass custom variables to a script when it is run
So that I can change how I want my script to behave e.g. showDetails=False
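
As a rough illustration of this story, custom variables could be passed as `key=value` strings and converted to Python values inside the script. The function name `parse_custom_vars` and the `key=value` convention are assumptions; nothing in the thread fixes a format.

```python
import ast


def parse_custom_vars(pairs):
    """Turn ["showDetails=False", "limit=5"] into {"showDetails": False, "limit": 5}."""
    result = {}
    for pair in pairs:
        key, sep, raw = pair.partition("=")
        if not sep or not key.strip():
            raise ValueError(f"expected key=value, got {pair!r}")
        try:
            # literal_eval safely parses Python literals such as False, 5 or 'text'
            value = ast.literal_eval(raw.strip())
        except (ValueError, SyntaxError, TypeError):
            # Anything that is not a literal is kept as a plain string
            value = raw.strip()
        result[key.strip()] = value
    return result
```

Using `ast.literal_eval` rather than `eval` means a variable can never execute code, only carry data.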


I think this would be a great place to start.

Here you go!

The “recipes” folder contains all the rule sets that I’ve used so far. Some are from my time with Bear, but may retain some usefulness. Others are more specific and experimental. You can create or delete as many recipes as you want. They’re simply .txt files.

  • Each recipe is a set of “find and replace” pairs that use regex.
  • Inside recipes, each line is a transformation: the “find” expression, then a separator string (-> surrounded by a space on each side), then the “replace” expression.
  • Some expressions are commented (a comment is whatever you write between a # and the end of the line). This feature was a work in progress and may need to be redone or removed.
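
For anyone who wants to see the recipe format in code, here is a minimal sketch of how such a file could be parsed and applied. It is not the actual script: the function names are made up, and it assumes a `#` always starts a comment (so a literal `#` cannot appear in a rule) and that rules with an empty replacement are skipped.

```python
import re

SEPARATOR = " -> "  # "->" surrounded by a space on each side


def parse_recipe(text):
    """Return a list of (compiled_pattern, replacement) pairs from recipe text."""
    rules = []
    for line in text.splitlines():
        # Drop everything from '#' to the end of the line, then trailing padding
        line = line.split("#", 1)[0].rstrip()
        if SEPARATOR not in line:
            continue  # blank line, comment-only line, or malformed rule
        find, replace = line.split(SEPARATOR, 1)
        rules.append((re.compile(find), replace))
    return rules


def apply_recipe(rules, note_text):
    """Apply each find/replace pair in the order it appears in the recipe."""
    for pattern, replacement in rules:
        note_text = pattern.sub(replacement, note_text)
    return note_text
```

Because the rules run in order, a later rule sees the result of the earlier ones, so the order of lines in a recipe matters.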

Let me know if you have any questions or comments.

Cheers!

This looks great!

There are 4 output behaviours that I can think of right now:

  1. Replace the whole text
  2. Replace only the text that was selected upon running the script
  3. Insert the script output at cursor position.
  4. View only, without modifying the note.

Would this be set by the plugin or an option for each script?
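
To make the four behaviours concrete, here is a small sketch of how they might be applied to a note's text. The enum, the function name and the character-offset parameters are all assumptions for illustration; a real plugin would use the editor's own selection API.

```python
from enum import Enum


class OutputMode(Enum):
    REPLACE_ALL = "replace_all"
    REPLACE_SELECTION = "replace_selection"
    INSERT_AT_CURSOR = "insert_at_cursor"
    VIEW_ONLY = "view_only"


def apply_output(note, output, mode, sel_start=0, sel_end=0):
    """Return the new note text for a given output mode.

    sel_start and sel_end are character offsets of the selection.
    For INSERT_AT_CURSOR the cursor is assumed to sit at sel_end.
    """
    if mode is OutputMode.REPLACE_ALL:
        return output
    if mode is OutputMode.REPLACE_SELECTION:
        return note[:sel_start] + output + note[sel_end:]
    if mode is OutputMode.INSERT_AT_CURSOR:
        return note[:sel_end] + output + note[sel_end:]
    return note  # VIEW_ONLY: the note is left untouched
```

Written this way, the mode is just a value, so it could come from a plugin-wide setting or from a per-script option.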

I really like how this automatically pulls an index from a list of scripts which is then easily run.
I can see why something like this for Python scripts could be so useful.

I had not considered these options actually.
The ability to specify in the plugin how you would like the output to be applied could be really interesting.
For the MVP I thought just appending the text at cursor point would be a good start.
But I especially like:

  1. Replace the whole text ← Would like to hear a use case for this one
  2. Replace only the text that was selected upon running the script ← this could be really cool, not sure whether this could be expanded to only sending the selected text to the script as input

You might be able to embed your script into a plugin by using Pyodide (Version 0.17.0dev0). It appears to support nltk out of the box.

I wrote a proof of concept plugin to invoke experiments.

I only implemented one view, called ‘result-list’, for similarity results.

To use it I configure the scripts in a JSON file:

In this case, there are two ‘similarity’ algorithms

  1. dummy python script that returns random notes (obsidian-lab/randomScore.py at master · cristianvasquez/obsidian-lab · GitHub)
  2. dummy javascript script that returns random notes

The results appear in a pane, with a score that appears at hover.

This is just a POC, I’m still trying to figure out how to simplify this.

Thanks for the stories!

What do you think of:

Inputs:
- The script id (mandatory).
- The vault path (mandatory).
- The active note name (optional).
- The current text selection (optional).
- The frontmatter data (optional; contains date modified, tags, etc.).

Output options:
- A list → GUI:[Appears in a pane, as a clickable menu].
- An object → GUI:[Output to be shown in pane].
- A text → GUI:[Output to be pasted into my active note].

And an operation to ‘reindex’.
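
Sketched as code, the proposal above might look roughly like this. The class name, field names and the GUI target strings are placeholders chosen for illustration, not what the plugin actually uses.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class ScriptRequest:
    script_id: str                    # mandatory
    vault_path: str                   # mandatory
    note_name: Optional[str] = None   # optional: the active note
    selection: Optional[str] = None   # optional: the current text selection
    frontmatter: dict = field(default_factory=dict)  # optional: tags, dates, etc.


def gui_target(output: Any) -> str:
    """Pick a (hypothetical) GUI target from the type of the script output."""
    if isinstance(output, list):
        return "pane-menu"     # a list appears in a pane as a clickable menu
    if isinstance(output, dict):
        return "pane"          # an object is shown in a pane
    if isinstance(output, str):
        return "active-note"   # text is pasted into the active note
    raise TypeError(f"unsupported output type: {type(output).__name__}")
```

Dispatching on the output type keeps each script simple: it returns a value and the plugin decides where to show it.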

After experimenting a little with my scripts with the POC, I’m not so happy :/. I think I’ll try an implementation with a little HTTP-service to see how it goes.
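
For readers wondering what such a little HTTP service could look like, here is a rough sketch using only Python's standard library `http.server`. The actual service may be built quite differently; `run_script`, the `note` field and the response shape are invented for the example.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def run_script(payload):
    """Placeholder script: return the note name in upper case (illustrative only)."""
    return {"contents": payload.get("note", "").upper()}


def handle_request(raw: bytes) -> bytes:
    """Decode a JSON request body, run the script, encode the JSON response."""
    payload = json.loads(raw or b"{}")
    return json.dumps(run_script(payload)).encode("utf-8")


class ScriptHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        body = handle_request(self.rfile.read(length))
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def make_server(port=0):
    """Bind to localhost only; port 0 lets the OS pick a free port."""
    return HTTPServer(("127.0.0.1", port), ScriptHandler)
```

Calling `make_server(5000).serve_forever()` would start the service, and the plugin could then POST the note data to `http://127.0.0.1:5000/`. Keeping the server separate from the plugin means scripts can stay loaded between calls.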

Hey @cristian ,

Sorry for the delay.
I downloaded this and played with it a few days ago.

This is quite ingenious.
I had never considered running the Python script behind Flask and invoking it via localhost.
You have certainly given me more to think about.
It’s great to get a proof of concept up and running, but it takes a bit of setup work.
I wonder if Pyodide, as @trashhalo suggested, could ease the process.

Equally, I had not considered the side panel as a source of output - so many more possibilities.

In response to your points:

On the inputs: I agree with all of these.

On the output options: again, I agree with all of these.
The text output is of particular interest to me.


I want to experiment more with this, but a potential source of inspiration could be the Templater plugin, especially if we are planning to use Flask and localhost to invoke scripts.

Really amazing progress.
I’ll continue to think about this.

How do you imagine using a template approach?

Today I updated the plugin, so it works adding scripts in a directory :smiley:

@cristian I can see you have been really busy with this recently!

I downloaded the latest version today and I think separating out the server from the plugin is a great idea.

I got one of my own scripts (TF-IDF similarity) up and running this evening and it didn’t take long at all.

Looking at what you have now I guess the closest thing would be a nice visual interface to set the configurations of new endpoints.
But I would say this is “a nice to have” rather than essential.

I feel like this covers most of the requirements I had in mind.
Working on instructions to make it even easier for people to set up could be handy, but you have already done a pretty great job here.

Some things stood out:

  1. random.py should probably be renamed, because a script named random.py can shadow the standard library module and break ‘import random’.
  2. Some of my scripts can be a bit process heavy and take some time. It could be cool to show some kind of loading indicator within Obsidian.
    Something like:
    As a user
    When I run a script
    While it’s running I would like some indication it’s being processed
    So that I know it’s running in the background

This could be achieved by pasting some text in the note e.g. “Processing” which is then replaced or removed when the task is completed.
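
A minimal sketch of that placeholder idea, working on plain strings. The marker format and function names are made up; a plugin would do the same thing through the editor API.

```python
import uuid


def insert_placeholder(note, cursor):
    """Insert a unique 'Processing' marker at the cursor; return (new_note, marker)."""
    # A short random id keeps the marker distinct from ordinary note text
    marker = f"[Processing {uuid.uuid4().hex[:8]}...]"
    return note[:cursor] + marker + note[cursor:], marker


def resolve_placeholder(note, marker, output):
    """Swap the marker for the script output once the task has completed."""
    return note.replace(marker, output, 1)
```

Replacing by marker rather than by position means the user can keep typing elsewhere in the note while the script runs. Passing an empty `output` simply removes the marker.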

I’m going to play with this more but I’m just so impressed by what you have done so far.
I’m trying to think how we get this plugin out to a bigger audience.

For those who got stuck on the broken link above you can get it here - GitHub - cristianvasquez/obsidian-lab

Ohh I’m happy you tried it :slight_smile:

What it misses for me is the ability to cluster notes (that was my initial motivation), but I still don’t know how to interact with the graph… (probably that’s for another thread)

Thanks! Will look at that.

I bet the scripts spend most of their time reading all the notes. Perhaps having different phases, to ‘index’ (a few times) and ‘execute’ (many times), can reduce the latency greatly.
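
To illustrate the two phases, here is a small pure-Python TF-IDF sketch: `build_index` does the expensive pass over all notes once, and `most_similar` only touches the cached vectors. This is a generic TF-IDF with smoothed IDF, not the code of any script in this thread, and the function names are invented.

```python
import math
import re
from collections import Counter


def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(notes):
    """'Index' phase (run rarely): compute a TF-IDF vector per note.

    notes maps a note name to its text.
    """
    counts = {name: Counter(tokenize(text)) for name, text in notes.items()}
    doc_freq = Counter(term for c in counts.values() for term in c)
    n = len(notes)
    # Smoothed IDF so a term present in every note still gets a positive weight
    idf = {term: math.log((1 + n) / (1 + df)) + 1 for term, df in doc_freq.items()}
    return {name: {t: tf * idf[t] for t, tf in c.items()} for name, c in counts.items()}


def cosine(a, b):
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm = math.sqrt(sum(w * w for w in a.values())) * math.sqrt(sum(w * w for w in b.values()))
    return dot / norm if norm else 0.0


def most_similar(index, name, top=3):
    """'Execute' phase (run often): rank other notes using the cached vectors."""
    scores = [(other, cosine(index[name], vec)) for other, vec in index.items() if other != name]
    return sorted(scores, key=lambda s: s[1], reverse=True)[:top]
```

The index is a plain dictionary, so it could be saved to disk (for example as JSON) and rebuilt only when a ‘reindex’ is requested.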

I think that having a set of exciting scripts could attract people :wink:

I feel topic modelling can help here.
I think it would be interesting to have like 15 main topics for all my notes and create a structured note for each topic with a list of the documents within it

Fair, but even with optimisation it would be nice to have some kind of indicator to show the script is running rather than checking the terminal

Also very true.
So next step?

  • Create a directory of example python scripts people can use with your awesome plugin

Hi @macedotavares, @raudaschl, I’ve published the plugin: New plugin: Obsidian Python lab. I would love to have some feedback, or cool scripts.

Once I learn the ropes, I’m gonna conquer the world with this thing.

Decided to play around with that idea.

Found GitHub - xupit3r/wuzzy: Simularity identification in JS js library. Seems interesting.

Thanks for sharing algorithms and such. Wasn’t aware of them.

I was able to take the basic ocr template and adapt it to run your similarity.py script! I’m integrating it into a starter vault I’m making for my students; details for what I did here

“I did here” link is not working

sorry, haven’t been on the forum for a while. this link:
https://github.com/shawngraham/obsidian-student-starter-vault/blob/main/0%20getting%20started/Configuring%20the%20related%20notes%20template.md
