Six months ago, I knew next to nothing myself.
I had 14,000 pages of Word docs (on Windows) that needed to be converted to Markdown. They had various formatting, mainly bold text, which came in handy: the bold phrases were basically prospective note titles for Obsidian, so they could be changed to Wikilinks.
For me, Pandoc didn’t work out. My Word files were simply too badly formatted, with lots of colours, bold and italics, indentations (sometimes 3-4 deep), etc. What’s worse, I was using Pages on iOS as well, so everything was in shambles. Mostly the start and end points of bold text were shot to hell. Also, Pandoc didn’t extract my images properly (the program was overwriting my image1, 2, etc. files, and I couldn’t find a proper batch/PowerShell script to help me).
So I went this way (my case is special because of the large volume of text):
I was looking around on various forums and getting help from admins (mostly retired Aussies) on how to make my VBA macros.
Mostly I figured everything out for myself as I went along.
I used macros to change my makeshift titles to Heading 1, and then inserted a section break before each one so that every title started on a new page. Then I saved each section as a separate file, using its Heading 1 text as the filename.
I needed to batch convert my docx files to txt files (to be easily renamed to md files later).
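The renaming half of that step is easy to script outside Word. This is only a rough Python sketch, not the macro I actually used; the function name `rename_txt_to_md` is made up for illustration, and it assumes all the txt files sit directly in one folder:

```python
from pathlib import Path


def rename_txt_to_md(folder: str) -> list[Path]:
    """Rename every .txt file directly inside `folder` to .md and return the new paths."""
    renamed = []
    for txt_path in sorted(Path(folder).glob("*.txt")):
        md_path = txt_path.with_suffix(".md")
        if md_path.exists():
            # Skip rather than overwrite a note that already exists.
            continue
        txt_path.rename(md_path)
        renamed.append(md_path)
    return renamed
```

Try it on a copy of the folder first. Files whose .md twin already exists are left alone, so nothing gets overwritten.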
On one forum I received and tweaked a macro that identifies inline shapes (embedded pics) and changes the inner reference to Obsidian references (e.g. C:/...User/...Obsidian/Vaultname/A-Z folders/assets/activedocumentname_image1, 2, 3, etc.).
I was using another macro that extracted from each docx file all uncompressed pics (with the HTML conversion, only compressed pics are extracted).
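For anyone who prefers a script to a macro: a docx file is really a ZIP archive, and the embedded pictures are stored under `word/media/` inside it. The Python sketch below is not what I used; it is just one way to pull the pictures out with the document name as a prefix, so that image1 from one file cannot overwrite image1 from another. The function name and the `assets_dir` parameter are my own inventions for the example:

```python
import zipfile
from pathlib import Path, PurePosixPath


def extract_docx_images(docx_path: str, assets_dir: str) -> list[Path]:
    """Copy every file under word/media/ out of a .docx, prefixed with the document name."""
    docx = Path(docx_path)
    out_dir = Path(assets_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    with zipfile.ZipFile(docx) as archive:
        for name in archive.namelist():
            # Only the embedded media files are of interest; skip directory entries.
            if not name.startswith("word/media/") or name.endswith("/"):
                continue
            # e.g. "MyDoc.docx" + "word/media/image1.png" -> "MyDoc_image1.png"
            target = out_dir / f"{docx.stem}_{PurePosixPath(name).name}"
            target.write_bytes(archive.read(name))
            written.append(target)
    return written
```

Loop over your docx folder and call it once per file. The bytes are copied exactly as they are stored in the archive.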
In the main overhaul macro I changed bold text to Wikilinks, orange-coloured text to highlights, footnotes to endnotes, etc.
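I did the bold-to-Wikilink swap inside Word with VBA, but the same idea can be sketched in Python on text that is already Markdown. This assumes bold is marked as `**text**` and never spans a line break, which my messy files would not have guaranteed. The name `bold_to_wikilinks` is just for the example:

```python
import re

# Matches **bold** where the bold text does not start or end with whitespace.
BOLD = re.compile(r"\*\*(?!\s)(.+?)(?<!\s)\*\*")


def bold_to_wikilinks(text: str) -> str:
    """Replace each **bold phrase** with an Obsidian-style [[bold phrase]] link."""
    return BOLD.sub(lambda match: f"[[{match.group(1)}]]", text)


print(bold_to_wikilinks("See **Stoic ethics** and **Virtue**."))
# prints: See [[Stoic ethics]] and [[Virtue]].
```

Stray asterisks with spaces around them are left untouched, which helps a little when the start and end points of the bold text are unreliable.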
After doing all that I started learning some regexes and cleaning up my documents.
In Obsidian I used this plugin to find all broken links. More manual work…
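If you would rather check links outside Obsidian, the rough idea can be sketched in Python. This is not the plugin, and it only approximates how Obsidian resolves links: it assumes a `[[Target]]` is fine if any note has that name (ignoring case and folders) or if a file with that exact name exists somewhere in the vault. The function name is hypothetical:

```python
import re
from pathlib import Path

# Captures the link target, stopping at "]", "|" (alias) or "#" (heading).
WIKILINK = re.compile(r"\[\[([^\]\|#]+)")


def find_broken_links(vault_dir: str) -> dict[str, list[str]]:
    """Map each note's filename to the link targets that match no file in the vault."""
    vault = Path(vault_dir)
    notes = sorted(vault.rglob("*.md"))
    note_names = {note.stem.lower() for note in notes}
    file_names = {p.name.lower() for p in vault.rglob("*") if p.is_file()}
    broken = {}
    for note in notes:
        text = note.read_text(encoding="utf-8")
        missing = set()
        for target in WIKILINK.findall(text):
            # Keep only the last path segment, e.g. "folder/Note" -> "note".
            name = target.strip().split("/")[-1].lower()
            if name not in note_names and name not in file_names:
                missing.add(target.strip())
        if missing:
            broken[note.name] = sorted(missing)
    return broken
```

It returns something like `{"Some note.md": ["Missing title"]}`, which you can then work through by hand.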
If you ask the right questions on forums, you’ll get there one task at a time. If you only have 500-1000 pages’ worth of docs, I’d probably advise you to get down to it and do most things manually.
I remember one forum thread where I was asking about this. If you follow my nick on that forum and other forums, you might be able to track down some of the steps.
This is no easy way. It was my way. I learnt the hard way. But as somebody else pointed out or implied, if you are hopeless with a PC, you might need to pay someone local to help you.