Hi all,

I had accumulated years of notes, and when I started using Obsidian to organize them I felt overwhelmed. So I wrote a small script that automatically connects ideas that are similar to each other.

Essentially I placed all of my notes into one file, and each line becomes its own note. Then I calculated how similar each note is to the others, took the 3 most similar notes, and added a link to each of them.

Here is the script:

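```python
import os
import random as rand

import textdistance as td


class Note:
    def __init__(self, text, uuid):
        self.text = text
        self.uuid = uuid
        self.links = []        # uuids of the most similar notes
        self.current_dist = 0  # distance to the note currently being processed

    def __lt__(self, other):
        # Lets sorted() order notes by distance, closest first.
        return self.current_dist < other.current_dist


def main():
    # Each non-empty line of notes.md becomes its own note.
    with open("notes.md") as note_file:
        lines = [line.strip() for line in note_file if line.strip()]
    # sample() guarantees that no two notes share an id.
    ids = rand.sample(range(100000, 1000000), len(lines))
    notes = [Note(text, uuid) for text, uuid in zip(lines, ids)]

    for x, current in enumerate(notes, start=1):
        others = [n for n in notes if n is not current]
        for other in others:
            # Compare word lists so Monge-Elkan matches whole words.
            other.current_dist = td.monge_elkan.normalized_distance(
                current.text.split(), other.text.split())
        # Keep the three closest notes.
        for other in sorted(others)[:3]:
            current.links.append(other.uuid)
        print(x)  # progress counter

    os.makedirs("monge", exist_ok=True)
    for n in notes:
        with open(f"monge/m{n.uuid}.md", "w") as f:
            f.write(n.text + "\n")
            f.writelines(f"[[m{link}]]\n" for link in n.links)


main()
```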

Interesting. I have some questions:

• You state “Essentially I placed all of my notes into one file, and each line becomes its own note.” What do you mean by “one file”: one folder, like a vault?
• How does “each line become its own note”?
• You state “Then I calculated how similar each note is to the others”. On what basis is the similarity calculated? I cannot determine that from the script.

Hi Klass.

A) I record all of my important ideas and thoughts in one Dropbox file named notes.md.

This is so I can quickly write notes by appending to that file from my mobile phone instead of creating a new file each time (that is a pain to do in the Dropbox notes app).

I usually think of one-sentence ideas like “The source of my perfectionism is …” and I just add that to the end of the notes.md file.

B) In my mind, each line is a note/idea independent of the lines around it. It does not have to be related to the lines directly above or below it.

I just write the ideas as they come. So if I have 1000 lines in notes.md, I know I have thought of 1000 different ideas at different times.

Now Obsidian recognizes notes.md as one note, even though to me it is 1000 different notes. So the way I get around that is:

1. I iterate through notes.md and take every line
2. create a new .md file for that line,
3. name that .md file according to its line number (line 5 gets the name 5.md)

The 1000 lines of notes in notes.md are now 1000 different .md files (the script above names them with random numbers rather than line numbers, but the idea is the same). And Obsidian now recognizes each .md file as a separate note that I can link to.
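As a rough illustration of steps 1 to 3, here is a minimal sketch that names each file after its line number. The function name `split_notes` and the output folder are made up for this example, and skipping blank lines is an assumption on my part.

```python
from pathlib import Path


def split_notes(source, out_dir):
    """Write each non-empty line of `source` to its own .md file in `out_dir`.

    Files are named after the 1-based line number (line 5 -> 5.md).
    Returns the list of created paths.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    created = []
    with open(source, encoding="utf-8") as f:
        for number, line in enumerate(f, start=1):
            text = line.strip()
            if not text:
                continue  # assumption: blank lines are not notes
            path = out / f"{number}.md"
            path.write_text(text + "\n", encoding="utf-8")
            created.append(path)
    return created
```

Calling `split_notes("notes.md", "vault")` would then produce `vault/1.md`, `vault/2.md`, and so on, one per non-empty line.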

C) There is an algorithm called Monge-Elkan.

It compares text A and text B word by word: for each word in A it finds the closest-matching word in B using a character-level comparison (which is why it tolerates misspellings), and then it averages those best matches. It does not understand meaning; it only measures how closely the words overlap. If the texts are similar, the two notes get a high similarity score on a scale from 0 to 1 (e.g. 0.87).
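To make the scoring concrete, here is a small pure-Python sketch of the Monge-Elkan idea. It is not the textdistance implementation: it uses `difflib.SequenceMatcher` from the standard library as a stand-in for the inner word comparison, and the function names are just for illustration.

```python
from difflib import SequenceMatcher


def word_similarity(a, b):
    # Character-level similarity in [0, 1]; a stand-in for the inner metric.
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def monge_elkan(text_a, text_b):
    """Average, over the words of text_a, of the best match found in text_b."""
    words_a, words_b = text_a.split(), text_b.split()
    if not words_a or not words_b:
        return 0.0
    best = [max(word_similarity(a, b) for b in words_b) for a in words_a]
    return sum(best) / len(best)
```

Two identical sentences score 1.0, and a single misspelled word only lowers the score slightly. Note that the score is not symmetric: comparing A to B can give a different number than comparing B to A.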

Here is how the algorithm works:

1. take note number 1.md
2. compare it to notes 2.md through 1000.md
3. take the three notes with the highest Monge-Elkan scores (let’s say one of them is 5.md)
4. create a link in note 1 by adding [[5.md]] to that note, and do the same for the other two
5. repeat the steps for the next note (2.md)
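The steps above can be sketched roughly like this. The `similarity` function here is only a placeholder (again `difflib`, not Monge-Elkan), and `top_links` and `with_links` are hypothetical helper names; the notes are assumed to be held in a dict of name to text.

```python
import heapq
from difflib import SequenceMatcher


def similarity(a, b):
    # Placeholder score in [0, 1]; swap in a Monge-Elkan score here.
    return SequenceMatcher(None, a, b).ratio()


def top_links(notes, k=3, score=similarity):
    """Map each note name to the names of its k most similar other notes."""
    links = {}
    for name, text in notes.items():
        others = (n for n in notes if n != name)  # never link a note to itself
        links[name] = heapq.nlargest(k, others, key=lambda n: score(text, notes[n]))
    return links


def with_links(text, names):
    # Append one Obsidian-style link per line, e.g. [[5]].
    return text + "\n\n" + "\n".join(f"[[{n}]]" for n in names) + "\n"
```

Every note is compared with every other note, so the work grows with the square of the number of notes. For 1000 notes that is about a million comparisons, which is slow but doable.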

Wow, that’s impressive. It is a very specific, specialized workflow.
Thanks for the explanation.

Can this be used to link other files? For example, instead of having to click “link” in every file when I import my rather large note database, could I just run a script that searches the whole database for references equal to filenames and then automatically creates the links?

For example:
monument.md
(run script to search for “monument” in the entire database)
(replace all “monument” entries with “[[monument]]”)
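
A minimal sketch of what such a script might look like, assuming the note names are known up front and only whole-word mentions should be linked. The function name `autolink` is hypothetical, and it works on one note's text at a time, so it would need to be run over every file in the database.

```python
import re


def autolink(text, names):
    """Wrap whole-word mentions of each note name in [[...]], skipping existing links."""
    # Longest names first so "monument valley" wins over "monument".
    ordered = sorted(names, key=len, reverse=True)
    if not ordered:
        return text
    alternatives = "|".join(re.escape(n) for n in ordered)
    # Match an existing [[...]] link first so its contents are left untouched.
    pattern = re.compile(rf"\[\[.*?\]\]|\b(?:{alternatives})\b", re.IGNORECASE)

    def replace(match):
        found = match.group(0)
        return found if found.startswith("[[") else f"[[{found}]]"

    return pattern.sub(replace, text)
```

For example, `autolink("The monument is old.", ["monument"])` wraps the word in double brackets, while text that is already a link is left alone. It would be wise to try this on a copy of the database first.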