What I’m trying to do
I began writing in Scrivener on a licence bought from Apple’s App Store back in the Mac OS X era (before the “macOS” rebrand), under a now-defunct Apple ID. Hence, no more export option. Hindsight, and all that, but live and learn, I guess.
I’m really not trying to reinvent the wheel, I just thought I’d send up a smoke signal to see if anyone’s faced a similar predicament: trying to move an old project into Obsidian without the in-house export tools of the software it’s leaving. I’ve never really had to consider the quickest, cleanest way of sifting through a text editor’s project folder structure before. And I’ve no pressing desire to rely on a canned third-party converter, whether free or paid.
So, I would be willing to engage with most any brainstorming you feel fit to contribute! It seemed like a unique enough issue to warrant a thread, and I’m using this as a sounding board for my own “solution”-shooting as much as anything else.
Things I have tried
Good to know about Scrivener:
Scrivener projects are stored in a single folder, internally called a binder and externally denoted with a filename.scriv extension, but this is just for identifying it; it’s merely a folder.
In any given binder, as best as I can tell, there is:
- Files, the folder with the actual meat and potatoes of the text written by the user
- QuickLook, the folder that has, seemingly, visual data formatted in XML/mini-screen caps of a binder’s subfolders that Scrivener’s interface organizes in a vertical ribbon on the left side of its active window as a preview. Likely unnecessary to fiddle with or otherwise include.
- Settings, the folder containing a series of .xml files that house the Scrivener editor’s last used/saved global preferences while in use, as well as preferences unique to the project. This includes a recent edits table, also in XML. As with QuickLook, probably unneeded.
- filename.scrivx, a file with the same name as the binder, where the .scrivx extension denotes the “master” XML file that houses every last bit of metadata you could hope to wonder about the project, including timestamped, nested edit history, in all its less-than/greater-than tag-filled glory.
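Since the .scrivx file holds the project’s metadata, it is also the likeliest place to recover document titles for naming the exported notes. Here is a rough sketch of that lookup. It assumes, without having been checked against this particular project, that each document appears as a BinderItem element carrying a UUID attribute and a Title child; if your file uses different tag names, adjust accordingly. The function name binder_titles is made up for illustration.

```python
import xml.etree.ElementTree as ET


def binder_titles(scrivx_path):
    """Map each item's UUID to its title.

    ASSUMPTION: items look like <BinderItem UUID="..."><Title>...</Title>,
    nested to any depth. Verify against your own .scrivx before relying on it.
    """
    titles = {}
    # iter() walks the whole tree, so nested children are included.
    for item in ET.parse(scrivx_path).iter("BinderItem"):
        item_id = item.get("UUID")
        if item_id:
            title = (item.findtext("Title") or "").strip()
            titles[item_id] = title or "Untitled"
    return titles
```

Opening the .scrivx in any plain text editor first and searching for a title you recognise is a quick way to confirm or refute the assumed tag names.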
So all that really needs dealing with is Files. Within Files, there are a couple files for search history, backups, font/style settings, and the main folder of contention: Data
Within Data, there are about 60 folders of varying sizes, and best I can tell, they are a combination of Scrivener templates/on-hand “demos” of various formattings, exporting options, manuscript mode, etc., and the actual text contributed to the binder by the user. So, the number of folders would vary with the size of any given binder.
All folders in Data have 36-character names: 32 hexadecimal digits split by hyphens into groups of 8, 4, 4, 4, and 12. That’s the standard layout of a UUID (a 128-bit identifier), so the groups probably don’t mean anything special to Scrivener beyond uniquely naming each item, like so: 0F9BE578-43CC-495A-94B4-05F65CBA8222
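As a quick sanity check on that naming scheme, Python’s standard uuid module can parse the example name. This is only a small sketch; parse_folder_uuid is a made-up helper name, and all it confirms is that a name is a well-formed UUID, not what Scrivener does with it.

```python
import uuid


def parse_folder_uuid(name):
    """Return a uuid.UUID if `name` is a canonical hyphenated UUID, else None."""
    try:
        parsed = uuid.UUID(name)
    except ValueError:
        return None
    # uuid.UUID also accepts braces and unhyphenated hex,
    # so insist on the canonical 8-4-4-4-12 form.
    return parsed if str(parsed) == name.lower() else None


example = parse_folder_uuid("0F9BE578-43CC-495A-94B4-05F65CBA8222")
print(len(str(example)), example.version)  # prints: 36 4
```

Version 4 means the identifier is randomly generated, so the individual groups carry no meaning of their own.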
There’s also a docs.checksum file in Data, presumably used during opening/saving through the cloud, if you were using a different device than whatever was last synced with Literature & Latte’s server(s) (Scrivener’s developer).
In each hex folder, there is never more than a single file, either content.rtf, or content.pdf
The PDFs are the templatized text files Scrivener offers for quick formatting, and the RTFs are the content contributed by the user.
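Before converting anything, it may be worth confirming the “one file per folder, either content.rtf or content.pdf” observation across the whole Data folder instead of by eye. A minimal sketch follows; inventory is a hypothetical helper, and the path in the comment is a placeholder.

```python
from collections import Counter
from pathlib import Path


def inventory(data_dir):
    """Count Data subfolders by the file name(s) each one contains."""
    counts = Counter()
    for folder in Path(data_dir).iterdir():
        if folder.is_dir():
            names = sorted(p.name for p in folder.iterdir() if p.is_file())
            counts[", ".join(names) or "(empty)"] += 1
    return counts


# Example call (placeholder path, substitute your own project):
# print(inventory("MyProject.scriv/Files/Data"))
```

Any key other than content.rtf or content.pdf in the result points at a folder that breaks the pattern and deserves a manual look.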
So! Short story long: I’m thinking some kind of PowerShell or Python script that could take the whole Data folder as an argument and recursively go through it, remove all RTF formatting, escape characters, font info, line returns, etc., change the extension to .md, and be done with it.
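To make that idea concrete, here is a minimal, standard-library-only Python sketch of such a script. It tokenizes each content.rtf with a regular expression, drops control words and metadata groups, and writes the remaining text to a .md file named after the source folder. Several things are assumptions: the list of groups to skip is deliberately short, \'xx bytes are decoded as Windows-1252, and the names rtf_to_text and convert_data_folder are made up. It produces plain text only (bold, italics, and other styling are discarded, not turned into Markdown), and it has not been run against real Scrivener output.

```python
import re
import sys
from pathlib import Path

# Control words that open a group holding metadata rather than body text.
# ASSUMPTION: non-exhaustive; extend it if font or colour names leak through.
SKIP_GROUPS = frozenset({
    "fonttbl", "colortbl", "expandedcolortbl", "stylesheet", "info",
    "pict", "listtable", "listoverridetable", "header", "footer",
})
SPECIAL = {
    "par": "\n\n", "line": "\n", "tab": "\t", "bullet": "\u2022",
    "emdash": "\u2014", "endash": "\u2013", "lquote": "\u2018",
    "rquote": "\u2019", "ldblquote": "\u201c", "rdblquote": "\u201d",
}
TOKEN = re.compile(
    r"\\([a-z]{1,32})(-?\d{1,10})?[ ]?"  # control word, optional number
    r"|\\'([0-9a-f]{2})"                 # hex-escaped byte
    r"|\\([^a-z])"                       # control symbol
    r"|([{}])"                           # group delimiters
    r"|[\r\n]+"                          # raw line breaks mean nothing in RTF
    r"|(.)",                             # ordinary character
    re.IGNORECASE,
)


def rtf_to_text(rtf):
    """Strip RTF markup and return plain text with blank-line paragraphs."""
    stack = []         # saved (ucskip, ignoring) state per open group
    ignoring = False   # True while inside a metadata group
    ucskip = 1         # fallback characters to skip after a \uN escape
    pending = 0
    out = []
    for match in TOKEN.finditer(rtf):
        word, arg, hexbyte, symbol, brace, char = match.groups()
        if brace:
            pending = 0
            if brace == "{":
                stack.append((ucskip, ignoring))
            elif stack:
                ucskip, ignoring = stack.pop()
        elif symbol:
            pending = 0
            if symbol == "*":            # \* marks an ignorable destination
                ignoring = True
            elif ignoring:
                pass
            elif symbol == "~":
                out.append("\u00a0")
            elif symbol in "{}\\":
                out.append(symbol)
            elif symbol in "\r\n":       # backslash-newline acts like \par
                out.append("\n\n")
        elif word:
            pending = 0
            if word in SKIP_GROUPS:
                ignoring = True
            elif ignoring:
                pass
            elif word in SPECIAL:
                out.append(SPECIAL[word])
            elif word == "uc" and arg:
                ucskip = int(arg)
            elif word == "u" and arg:    # Unicode escape, may be negative
                code = int(arg)
                out.append(chr(code + 0x10000 if code < 0 else code))
                pending = ucskip
        elif hexbyte:
            if pending:
                pending -= 1
            elif not ignoring:           # ASSUMPTION: Windows-1252 code page
                out.append(bytes([int(hexbyte, 16)]).decode("cp1252", "replace"))
        elif char:
            if pending:
                pending -= 1
            elif not ignoring:
                out.append(char)
    return re.sub(r"\n{3,}", "\n\n", "".join(out)).strip()


def convert_data_folder(data_dir, out_dir):
    """Write <folder name>.md into out_dir for every content.rtf found."""
    data_dir, out_dir = Path(data_dir), Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    converted = 0
    for rtf_path in sorted(data_dir.rglob("content.rtf")):
        raw = rtf_path.read_text(encoding="latin-1")
        target = out_dir / f"{rtf_path.parent.name}.md"
        target.write_text(rtf_to_text(raw) + "\n", encoding="utf-8", errors="replace")
        converted += 1
    return converted


if __name__ == "__main__" and len(sys.argv) == 3:
    print(convert_data_folder(sys.argv[1], sys.argv[2]), "files converted")
```

Saved under a name of your choosing, it takes the Data folder and an output folder as its two arguments. The originals are only read, never modified, and the output goes to a separate folder, so a bad run costs nothing. Paragraph breaks are kept as blank lines because that is how Markdown separates paragraphs.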
I wouldn’t say I’m a good programmer, per se, but I’d rather spend time (even a long time) trying to do this efficiently than just retyping the text verbatim into new markdown files, copy-pasting, or any otherwise far more “manual” text editing task, especially when it’s just to smooth the process of importing into, well, another text editor.
Again, I know there are myriad converters online, web-based or otherwise, I’d just rather not use them and do it locally. It’s a lot of text. That, and I’m a sook when it comes to privacy, though I’m well aware the best solution to keeping your data private is to never connect to the internet at all. I digress.
But thanks in advance! If any discussion in this vein might help anyone else trying to implement or otherwise puzzle out a code-based solution on their own, all the better.
PS
No, I don’t particularly feel like querying or prompting AI a hundred times, out of principle, more than anything.