I think I understand a little of what you are concerned about.
As for keywords: in Obsidian, you can use [[xxx]] to "create" the keyword "xxx", and every occurrence of that keyword can be tracked in the Backlinks panel. You can find more information about Backlinks in Obsidian's Help vault.
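To make the [[xxx]] idea concrete, here is a minimal sketch of how such keywords could be pulled out of a note's text with a regex. This is only an illustration, not how Obsidian itself indexes links; the names `WIKILINK` and `find_keywords` are my own, and the pattern assumes the simple forms [[target]], [[target|alias]] and [[target#heading]].

```python
import re

# Hypothetical pattern (not Obsidian's own parser): captures the link target and
# skips an optional "#heading" part and an optional "|alias" part.
WIKILINK = re.compile(r"\[\[([^\]|#]+)(?:#[^\]|]*)?(?:\|[^\]]*)?\]\]")

def find_keywords(note_text: str) -> list[str]:
    """Return the targets of all [[...]] links found in note_text, in order."""
    return [m.group(1).strip() for m in WIKILINK.finditer(note_text)]

sample = "Today I read about [[tokenization]] and [[regex|regular expressions]]."
print(find_keywords(sample))  # ['tokenization', 'regex']
```

Collecting these targets across all notes would give roughly the information the Backlinks panel shows, although the real feature handles more cases than this sketch.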
Regarding the alphabet issue you mentioned, I think that is not the whole picture. For a language such as Chinese, the finest granularity (the basic unit) is the character; words and sentences are composed of one or more characters, and there are no spaces between them. To support searching text written in English as well as in other languages, regex is one of the universal solutions.
For example, the sentence "今天天很蓝" has 5 characters but 4 words: "今天 天 很 蓝" ==> today, the sky, (is) very, blue. As a matter of fact, tokenization of Chinese is a complicated task in NLP.
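A small sketch of the difference, using the same sentence. It assumes nothing beyond Python's standard `re` module; the helper name `regex_find` is mine and only there for illustration.

```python
import re

sentence = "今天天很蓝"

# Splitting on whitespace works for English, but Chinese has no spaces,
# so the whole sentence comes back as a single "word".
english_tokens = "Today the sky is very blue".split()
chinese_tokens = sentence.split()

def regex_find(pattern: str, text: str) -> list[tuple[int, str]]:
    """Return (start index, matched text) for every match of pattern in text."""
    return [(m.start(), m.group()) for m in re.finditer(pattern, text)]

print(len(sentence))                  # 5
print(len(english_tokens))            # 6
print(chinese_tokens)                 # ['今天天很蓝']
print(regex_find("今天", sentence))    # [(0, '今天')]
print(regex_find("天", sentence))      # [(1, '天'), (2, '天')]
```

Note the last line: searching for "天" (sky) also hits the "天" inside "今天" (today). Regex matching works at the character level, so it finds the text in any language, but it cannot tell which hit is a separate word without real tokenization.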
Word-level search is meaningful, but it is not simple to implement.