Help needed to switch from Roam

What I’m trying to do

First of all, hello to my fellow Obsidians! I have a specific problem and hope to find some help here:

I’ve been a Roam user for some years. I’ve got quite a huge graph, which contains a lot of literature reviews, written papers, drafts for papers, etc. I’ve been wanting to switch to Obsidian for a while because of its more open concept (.md files) and its more active development. It is important to me to keep all the information, backlinks, references, etc.
I have a specific problem with the migration though: In Roam, I’ve made extensive use of naming page titles (of literature reviews) like this: [[Author 1/Author2: Title of publication (Year of Publication)]].

In Roam this causes no problem, but in Obsidian one cannot use slashes (/) and colons (:) in page names. This causes a problem when importing my Roam graph (exported as a JSON file) into Obsidian using the Obsidian Importer plugin. Every page of my Roam graph with slashes and colons in the title (that is a lot of the pages in my graph) is imported incorrectly: the importer creates a folder with a note inside of it instead of importing the whole page as a note. I’ve already created a bug report regarding this problem, but I’m not sure if it will get fixed.

In summary: I need a way to replace all of the slashes and colons in the page titles of my Roam-Graph.

Things I have tried

I’m not skilled in scripting and programming, but I tried a few things. My idea was to create a Python script with regex to replace all the slashes with commas and all of the colons with a dash. If I understand the structure of the exported Roam graph correctly, I need to catch all of the colons and slashes inside of double brackets ([[ … ]]) AND inside of strings like this: "title":"…"

I came up with the following regular expressions

  • (?<=\[\[.*):(?=.*\]\]) → for colons inside of double brackets
  • (?<=\[\[.*)/(?=.*\]\]) → for slashes inside of double brackets
  • (?<="title":".+?):(?=.*",) → for colons inside of the “title”:“…” string
  • (?<="title":".+?)/(?=.*",) → for slashes inside of the “title”:“…” string

I’ve tried to create a python script to use those in my exported Roam-Graph, but it doesn’t work:

import json
import regex as re

with open("Roam.json", "r") as f:
    data = json.load(f)

replace_regex1 = re.compile(r"(?<=\[\[.*):(?=.*\]\])")
data = replace_regex1.sub("-", data)

replace_regex2 = re.compile(r"(?<=\[\[.*)/(?=.*\]\])")
data = replace_regex2.sub(",", data)

replace_regex3 = re.compile(r'(?<="title":".+?):(?=.*",)')
data = replace_regex3.sub("-", data)

replace_regex4 = re.compile(r'(?<="title":".+?)/(?=.*",)')
data = replace_regex4.sub("-", data)

with open("Roam_updated.json", "w") as f:
    json.dump(data, f)

When running the script, I get this output:

line 12, in
data = replace_regex1.sub("-", data)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: expected string or buffer

Can somebody with some scripting or programming experience help me with this? Has anyone switching from Roam encountered this problem and found a solution?

Thanks in advance!

Setting aside the regular expressions themselves for the moment, you could eliminate one source of complexity by using a text editor that can do regex search-replace across files, like VS Code or BBEdit.

If I’m not mistaken, json.load(f) earlier on returns the parsed contents of that file as a Python object, not as text. So given a simplistic case of { "a": 1, "b": 2 }, your data object would now be a dict with data["a"] == 1 and data["b"] == 2. This is not something the regex can handle. It needs a string or a buffer of text to replace stuff in.
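
A minimal sketch of why the error appears. It uses the standard-library json and re modules rather than the third-party regex module from the script, so the exact error wording may differ, and the sample data is made up for illustration:

```python
import json
import re

# Made-up sample standing in for the contents of the exported file.
raw_text = '{"a": 1, "b": 2}'

# json.loads parses text into Python objects, just as json.load(f) does for a file.
data = json.loads(raw_text)
print(type(data).__name__)   # dict
print(data["a"], data["b"])  # 1 2

pattern = re.compile(":")

# Substitution works on the raw text (and here also damages the JSON syntax).
as_text = pattern.sub("-", raw_text)
print(as_text)

# Substitution on the parsed dict raises TypeError, as in the traceback above.
try:
    pattern.sub("-", data)
    failed = False
except TypeError:
    failed = True
print(failed)  # True
```

Note that the text-based substitution also hits the colons that belong to the JSON syntax itself, which is one way a replace on the whole file can leave it unreadable.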

To handle this, one way would be to read the entire file as text, do your replace operations on that text, and only then parse it as JSON. Alternatively, you could address each specific field of your object structure in need of this transformation. Just guessing, but maybe something like data["filename"] or data["title"] instead of data in the regex functions (assuming it’s only one field you need to do this operation on).

Most likely, though, you need to do the replace on the entire file since I reckon your links could be anywhere within the file…

Thanks, good idea. I’ve just tried it with VS Code, but it doesn’t accept my regex. I think there’s a problem with the lookaheads/lookbehinds in combination with infinite quantifiers… It works with the regex module in Python, though.

Thank you, this was a hint in the right direction. I’ve tried reading the whole JSON as a string, which worked but somehow corrupted the JSON (the Obsidian Importer doesn’t accept it). I guess the regex wasn’t specific enough and caught colons and slashes not only in page titles but elsewhere too.

So in the end the Python script now works, but I still don’t have a file ready to import. I guess I either have to look into the structure of the Roam JSON and adjust the script (to only work on the objects containing page titles), or come up with a more precise regex, or hope the importer plugin gets updated so that it catches invalid page titles on import…
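
One way to only touch page titles and links is to parse the JSON first and then walk the parsed structure, instead of running a regex over the raw file. The following is a rough sketch, not a tested migration tool. It assumes that page names live under a "title" key and block text under a "string" key in the export, which should be checked against the actual file. The helper names (clean, clean_links, walk, convert) are made up for this example, and nested links such as [[a [[b]] c]] are not handled.

```python
import json
import re

# Matches innermost [[...]] links only; nested links are not handled here.
LINK = re.compile(r"\[\[([^\[\]]+)\]\]")

def clean(name):
    """Replace slashes with commas and colons with dashes in a page name."""
    return name.replace("/", ",").replace(":", "-")

def clean_links(text):
    """Clean only the text inside [[...]]; everything else stays as it is."""
    return LINK.sub(lambda m: "[[" + clean(m.group(1)) + "]]", text)

def walk(node):
    """Return a copy of the parsed JSON with titles and links cleaned."""
    if isinstance(node, list):
        return [walk(item) for item in node]
    if isinstance(node, dict):
        out = {}
        for key, value in node.items():
            if key == "title" and isinstance(value, str):
                out[key] = clean(value)        # assumed key for page names
            elif key == "string" and isinstance(value, str):
                out[key] = clean_links(value)  # assumed key for block text
            else:
                out[key] = walk(value)
        return out
    return node  # numbers, booleans, None and other strings are left alone

def convert(src, dst):
    """Read src, clean it, and write the result to dst as valid JSON."""
    with open(src, "r", encoding="utf-8") as f:
        data = json.load(f)
    with open(dst, "w", encoding="utf-8") as f:
        json.dump(walk(data), f, ensure_ascii=False)

# Made-up sample in the assumed shape of the export.
sample = [{
    "title": "Author 1/Author2: Some title (2020)",
    "children": [{
        "string": "See [[Author 1/Author2: Some title (2020)]] and https://example.com/a",
        "uid": "abc123",
    }],
}]
cleaned = walk(sample)
print(cleaned[0]["title"])
print(cleaned[0]["children"][0]["string"])
```

Because the file is written back with json.dump, the output stays valid JSON, and URLs or other colons outside of links and titles are left untouched. Calling convert("Roam.json", "Roam_updated.json") would apply this to the file names used in the original script.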

If someone has another suggestion or had the same problem and came up with a solution I’m thankful for input!

VS Code probably uses JavaScript regexes, which may differ. Rather than using lookahead and such you could try replacing “(double brackets followed by any number of not double brackets) followed by (slash or colon)”

(\[\[[^(\]\])]+)(\/|:)

with “the first capture group (that is, everything but the slash) followed by a dash”

$1-
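
For reference, a small sketch of the same search and replace in Python, where the capture group is written \1 instead of $1. Note that the character class in this pattern excludes the single characters (, ) and ] rather than the two-character sequence ]], and that each pass replaces only one slash or colon per link, so the replacement has to be repeated until nothing changes. The function name replace_until_stable is made up for this example.

```python
import re

# The pattern from the reply above, written as a Python raw string.
PATTERN = re.compile(r"(\[\[[^(\]\])]+)(\/|:)")

def replace_until_stable(text):
    """Repeat the substitution until a pass no longer changes the text."""
    while True:
        new_text = PATTERN.sub(r"\1-", text)
        if new_text == text:
            return new_text
        text = new_text

link = "[[Author 1/Author2: Title (2020)]]"

one_pass = PATTERN.sub(r"\1-", link)
print(one_pass)  # [[Author 1/Author2- Title (2020)]]

final = replace_until_stable(link)
print(final)     # [[Author 1-Author2- Title (2020)]]
```

In an editor, the equivalent is to run "replace all" repeatedly until no more matches are reported.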

Small update: I’ve now fixed the page titles using the find&replace plugin for Roam, which was quicker than getting a script to work (for me as a person not too experienced with regex and Python).
For future Roam-users wanting to switch it would be great if the importer plugin would automatically replace invalid characters in titles though…

You can post a feature request for that in the Issues section of the Importer plugin’s GitHub.
