Large publish vault speed and indexing

Hi! I maintain a large-ish published vault for academic purposes, https://tl2.io (10K pages, 200 MB, content is 99.9% text), and I’m seeing a slow first load and problems with Google indexing. I submitted the sitemap in Google Search Console, but most of the pages, including the homepage, are still flagged as soft 404 after a couple of months. I do not know whether the speed and indexing issues are connected, and would be happy to hear any insight on how to improve things.
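In case it helps anyone comparing numbers: a minimal sketch for counting the URLs a sitemap actually lists, so the total can be compared with what Search Console reports. It assumes a plain `urlset` sitemap in the standard sitemaps.org namespace (not a sitemap index), and the function name `sitemap_urls` is just made up for illustration.

```python
import xml.etree.ElementTree as ET

# Standard sitemaps.org namespace; assumed to be the one the sitemap uses.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(xml_bytes: bytes) -> list[str]:
    """Return every <loc> value found under <url> entries of a urlset sitemap."""
    root = ET.fromstring(xml_bytes)
    return [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]


# Usage (hypothetical local copy of the sitemap):
#   with open("sitemap.xml", "rb") as f:
#       urls = sitemap_urls(f.read())
#   print(len(urls), "URLs listed,", len(set(urls)), "unique")
```

If the count is far off from the ~10K pages in the vault, the sitemap itself may be part of the problem; if it matches, the issue is more likely in how the pages are rendered or classified.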

Google Search Console indexing:

Cloudflare speed test:

I don’t notice anything out of the ordinary on your website.
The first visit to a page may be slower if it involves a CDN (Content Delivery Network) cache miss.

Regarding Google, I don’t know.
To properly crawl an Obsidian Publish website, the crawler needs to “run” the website, that is, execute its JavaScript. Google’s crawlers are able to do so.
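A rough way to see how much a crawler gets without “running” the site is to fetch the raw HTML and check whether a phrase from the page shows up in its visible text. This is only a sketch: it says nothing about what Google sees after rendering, and the helper names (`visible_text`, `phrase_in_raw_html`, `fetch_raw_html`) and the User-Agent string are made up for illustration.

```python
import re
import urllib.request
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects visible text, skipping the bodies of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def visible_text(html: str) -> str:
    """Return the text a non-rendering client would see, whitespace-normalized."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return re.sub(r"\s+", " ", " ".join(parser.chunks)).strip()


def phrase_in_raw_html(html: str, phrase: str) -> bool:
    """True if the phrase appears in the visible text of the unrendered HTML."""
    return phrase.lower() in visible_text(html).lower()


def fetch_raw_html(url: str, timeout: float = 30.0) -> str:
    """Download a page without executing any JavaScript."""
    req = urllib.request.Request(url, headers={"User-Agent": "raw-html-check/0.1"})
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        charset = resp.headers.get_content_charset() or "utf-8"
        return resp.read().decode(charset, errors="replace")


# Usage: pick a sentence you know is on the page.
#   html = fetch_raw_html("https://tl2.io")
#   print(phrase_in_raw_html(html, "a sentence from the page"))
```

If this prints `False` for text that is clearly visible in the browser, the content only exists after JavaScript runs, and indexing depends entirely on the crawler rendering the page.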

Well, I can certainly live with the current slow first load, but the lack of indexing is a bit worrying: this text is not available in clean form anywhere else on the internet, so it would be nice if search engines could bring it to users. I wonder if anyone else is experiencing problems with soft 404s?

Yes! I have the same issue with slow first page load on Publish (but we’ll survive), and soft 404s in Google Search Console. My website is 35k files, mainly text. The slow first page load is due to Obsidian serving a complete index file of all your pages, including their abstracts (descriptions). That currently sets us back ~11 MB, and often ~10 seconds of load time or more. I am trying to speed things up with Argo on Cloudflare, but I think Argo mostly helps with the CDN content I have put in R2, which I don’t need to speed up anyway. I’ll investigate this before the bills run too high. Please take a look at Prabhupada.io to compare speeds, and if you like some of the functionality, feel free to steal it. The AI coder won’t mind.
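For a sense of scale, here is the back-of-the-envelope arithmetic behind those numbers. It is a simplification: it assumes 1 MB = 10^6 bytes and ignores compression, latency, and the time the browser spends parsing the index, so treat the results as rough bounds rather than measurements.

```python
def transfer_seconds(size_mb: float, mbit_per_s: float) -> float:
    """Seconds needed to move size_mb megabytes over a link of mbit_per_s megabits/s."""
    return size_mb * 8 / mbit_per_s


def effective_mbit_per_s(size_mb: float, seconds: float) -> float:
    """Throughput implied by moving size_mb megabytes in the given number of seconds."""
    return size_mb * 8 / seconds


# ~11 MB arriving in ~10 s corresponds to roughly 8.8 Mbit/s end to end.
print(round(effective_mbit_per_s(11, 10), 1))

# The same 11 MB over a few assumed connection speeds (Mbit/s).
for speed in (5, 20, 100):
    print(speed, "Mbit/s:", round(transfer_seconds(11, speed), 1), "s")
```

So on a fast connection the index alone should take about a second, while on a slow mobile link it can easily dominate the first load, which matches the ~10 seconds seen in practice.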

If the Google search indexing issue is not fixed (and Obsidian doesn’t have good SEO with custom titles anyway), I will look at creating a static version with Quartz, publishing it to Cloudflare Pages, and finding a workaround for search. My site will only continue to expand, so it is not like I can trim it down. Either Obsidian needs to support large sites, or we will go static!

Hope it works out for you (and me as well). We need our pages indexed. I also filed a support ticket with the Google Search Console “people,” but who knows when, or if at all, they will reply. And that is assuming that anyone actually reads those support tickets, and it doesn’t all just go into a large burning coal chamber with the evil bots laughing at our enslavement… Well.