Combining vaults and avoiding duplicates

Things I have tried

I have used Windows PowerRename to add a vault-specific unique suffix to the name of each file within each of the vaults. I also did a search and replace to add that vault-specific suffix before each occurrence of ]]. If I combine the vaults now, I can be sure that there won’t be any file name conflicts. But the situation is not ideal.

What I’m trying to do

If there is a way to combine the notes from multiple vaults into a single folder, without having to individually deal with each duplicate and without having negative consequences and conflicts, I’d be very interested in hearing about it. I only say that it is important to use a single folder because I want to avoid having duplicate note names identified by their path. I have tried using that workflow but have run into strange results when changing note names.

I was going to make a feature request but figured that there is probably a separate tool or workflow that can make this process less painful and risky.

Edit: I just realized that my method causes problems for non-.md files whose embeds show their extensions in Obsidian notes. It also causes issues where the link is not simply to a note, like when linking to headings. Anyway, this realization only makes my help request more urgent. Fortunately I made backups!

Thanks for any suggestions.
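
The two problems in the edit above (embeds that show an extension, and links to headings) both come from inserting the suffix directly before ]]. Below is a rough Python sketch of a more careful rewrite. It assumes Obsidian-style wikilinks of the form [[target#heading|alias]]. The names KNOWN_EXTENSIONS, add_suffix, and rewrite_links, and the extension list itself, are made up for illustration. It has not been run against real vaults, so treat it as a starting point and try it on copies only.

```python
import re

# Hypothetical list of extensions to treat as "real" file extensions.
# Extend it to match the attachment types in your own vaults.
KNOWN_EXTENSIONS = {".md", ".png", ".jpg", ".jpeg", ".gif", ".pdf", ".svg", ".mp3", ".mp4"}

# Group 1: the link target, up to the first '#', '|' or '\' (tables escape the pipe as '\|').
# Group 2: whatever follows (heading, block reference, alias), left untouched.
WIKILINK = re.compile(r"\[\[([^\[\]|#\\]*)([^\[\]]*)\]\]")


def add_suffix(target: str, suffix: str) -> str:
    """Append suffix to a link target, placing it before a known extension if present."""
    if not target.strip():
        # Links such as [[#Heading]] point at the current note; nothing to rename.
        return target
    dot = target.rfind(".")
    if dot != -1 and target[dot:].lower() in KNOWN_EXTENSIONS:
        return target[:dot] + suffix + target[dot:]
    return target + suffix


def rewrite_links(text: str, suffix: str) -> str:
    """Add suffix to the target of every wikilink or embed found in text."""
    return WIKILINK.sub(
        lambda m: "[[" + add_suffix(m.group(1), suffix) + m.group(2) + "]]",
        text,
    )
```

For example, rewrite_links(note_text, "_A") should turn [[Note#Heading|Alias]] into [[Note_A#Heading|Alias]] and ![[image.png]] into ![[image_A.png]]. One known gap: it also rewrites anything that looks like [[...]] inside code blocks, so notes containing shell snippets would need a manual check.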

You could use Obsidian Sync to sync them to each other and rely on its automatic merging. But definitely test on copies of your vaults first (or copies of a few files from them) in case it doesn’t do what you want. Ideally, 2 exact duplicate files will become a single file. Partial duplicates will be merged as well as Sync can manage, but it might not arrange the text the way you would. If 2 completely different files have the same name, it might append 1 to the other or, worse, replace 1 with the other (data loss).

Another option is to use a dedicated app for merging text files. The only one I know offhand is Meld for Linux (and I’m not even sure it does batch processing like you want), but I’m sure there are others. Git can merge files automatically, tho that’s not its main purpose. A GUI front-end for Git would make the process friendlier. Again, test on copies of your vaults first and check exact duplicates, partial duplicates, and totally different files with the same name.


@CawlinTeffid Thanks for the thoughtful response! I just looked into using git offline and am happy to see it is possible, although not as straightforward as I might hope. But that is definitely something I hope to loop back to eventually.

I looked into text merging applications for Windows and found many, such as WinMerge. I am definitely going to be careful and read up a bit more before proceeding.

As I look at my files, they are actually not as discombobulated as I pictured. But still it is going to be lots of work to do manually. I am seeing some functionality in certain applications that simply assist in the comparison as opposed to the automated way I was hoping for. It is definitely too much to ask to fully automate it unless it was built into Obsidian as a plugin or something, since the links have to be updated. Perhaps I will create a feature request after I see how things go with this process.

Again, I appreciate the response!

One thing that would be helpful that I am having trouble finding any info about is the ability to find identically named files within different sub folders.

For example, I’d like to be able to copy like 50 different vault folders into a single folder and have a tool be able to locate all the files with duplicates, without requiring me to point to them. I am sure I will find something. If anyone has any ideas, I would definitely very much appreciate it.

Edit: I am finding all sorts of options. I had no idea there was so much software available to do this type of thing. Still, I am a bit lost, but am happy to not have my search come up empty.

Thanks!

It may help to include “recursive” in your search terms, because you want it to search recursively (descending into subfolders). I can’t vouch for WinMerge, but it can apparently do this; see “Comparing and merging folders” in the WinMerge 2.16 Manual (it compares file contents, but will note if a file only exists on 1 or the other side of the comparison).

It’s probably possible to do it with a PowerShell script, too, but I’ve never used PowerShell myself and a brief search suggests it might get a little complicated.
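
For what it’s worth, the “find identically named files in different subfolders” part can be sketched in a few lines of Python rather than PowerShell. This is only a sketch under assumptions: the function name find_same_named_files is made up, names are compared case-insensitively (as Windows does), and each vault’s .obsidian settings folder is skipped so that its config files don’t show up as duplicates.

```python
import os
from collections import defaultdict


def find_same_named_files(root):
    """Map each file name that occurs more than once under root to all of its paths."""
    by_name = defaultdict(list)
    for folder, subfolders, files in os.walk(root):
        # Prune Obsidian's per-vault settings folder so os.walk does not descend into it.
        subfolders[:] = [d for d in subfolders if d != ".obsidian"]
        for name in files:
            # Lower-case the key so "Note.md" and "note.md" count as the same name.
            by_name[name.lower()].append(os.path.join(folder, name))
    # Keep only names that appear in more than one place.
    return {name: sorted(paths) for name, paths in by_name.items() if len(paths) > 1}
```

Calling find_same_named_files on the folder that holds all the copied vaults returns a dictionary you can loop over to print each clashing name with its locations. It only reports; it does not rename, merge, or delete anything.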


@CawlinTeffid Thanks again! I was about to back down and try something different, but this looks very promising. Good stuff!

Although I responded above as if my problems were solved, after looking further into this and experimenting with it, I’ve unfortunately found that Recursive Mode simply means that WinMerge also compares all subfolders. It doesn’t, however, recognize identical files as duplicates unless they are in the same relative location within the compared folders.

I just wanted to add this so that my post above doesn’t mislead anyone. I am still looking for a solution to my issue.

Thanks.

You’ve probably already tried this, but have you searched something like “windows find duplicate files”? I seem to recall hearing about a Mac app that does that (and some other file-maintenance stuff), and there must be something like it for Windows. I didn’t explore results for that search but it looks like it’s a thing. Here’s a review of a few: https://www.howtogeek.com/200962/how-to-find-and-remove-duplicate-files-on-windows/. I don’t know if any of them let you match on just filename or if they only search for files that have the same content.
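
If no tool does both, the filename-versus-content distinction is simple enough to script. Here is a hedged Python sketch that groups files by name anywhere under a folder, then hashes their contents to separate exact duplicates from same-named files that differ. The names file_digest and classify_same_named are hypothetical, and skipping .obsidian folders is an assumption about what you’d want.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def classify_same_named(root):
    """Return (exact, conflicts): two dicts mapping a repeated file name to its paths.

    exact holds names whose copies are all byte-identical;
    conflicts holds names where at least one copy differs.
    """
    by_name = defaultdict(list)
    for folder, subfolders, files in os.walk(root):
        # Skip Obsidian's per-vault settings folders.
        subfolders[:] = [d for d in subfolders if d != ".obsidian"]
        for name in files:
            by_name[name.lower()].append(os.path.join(folder, name))

    exact, conflicts = {}, {}
    for name, paths in by_name.items():
        if len(paths) < 2:
            continue  # name is unique, nothing to resolve
        digests = {file_digest(p) for p in paths}
        target = exact if len(digests) == 1 else conflicts
        target[name] = sorted(paths)
    return exact, conflicts
```

Names in exact could have all but one copy dropped without touching any links, while names in conflicts are the ones that still need a manual comparison in something like WinMerge. The sketch only reports and never modifies files, so the deciding is still up to you.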


Thanks. But, yes, I have tried. There is something called CCleaner which can find all duplicates across many folders. Unfortunately, it displays the duplicates one case at a time. So, unlike WinMerge, where you have the tree view and can select ranges of duplicates and deal with them in one go, you’d have to handle them duplicate by duplicate. It also doesn’t have the text-comparison features that were helpful in this process.

In the end, I understand that this isn’t really an Obsidian-specific issue. However, I felt it was important to open this topic for comment before creating a feature request, because the ideal solution would be one that manages this problem and also correctly auto-updates links during the process.

Thanks again for your help!