Ctrl + Left / Right / Backspace should recognize non-space word breaks

Use case or problem

(1) breaking between phrases:

So, here’s a sample Chinese sentence: “这是一句没有任何空格的中文句子示例”

If you paste this sentence into Chrome's search bar and press Ctrl + Backspace, only “示例” at the end of the sentence is deleted, which is a single phrase. This is consistent with how Ctrl + Backspace behaves in English sentences.

If you paste the same sentence into Obsidian (v0.12.15) and press Ctrl + Backspace, the entire sentence is deleted instead of a single phrase.

The same applies to Ctrl + Left / Right.

This problem occurs in Obsidian with any language that does not separate phrases with spaces, e.g. Chinese or Japanese.

(2) breaking between languages:

Obsidian currently does not treat a switch between characters of different languages as a phrase boundary.

Pressing Ctrl + Backspace at the end of a sentence like “这是一句没有任何空格的中文句子Sample” in Obsidian deletes the entire sentence, whereas only the word “Sample” should be deleted.
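To illustrate the expected behaviour for case (2), here is a minimal Python sketch that stops deletion where the character class changes. The names `char_class` and `ctrl_backspace` are made up for this example, the code point ranges are a simplification (only the main CJK and kana blocks), and this is not how Obsidian or its editor actually implements word movement.

```python
def char_class(ch: str) -> str:
    """Classify a character for word-boundary purposes (simplified ranges)."""
    cp = ord(ch)
    if ch.isspace():
        return "space"
    # CJK Unified Ideographs and Extension A only (assumption: other
    # extensions are left out to keep the sketch short).
    if 0x4E00 <= cp <= 0x9FFF or 0x3400 <= cp <= 0x4DBF:
        return "han"
    # Hiragana and Katakana blocks.
    if 0x3040 <= cp <= 0x30FF:
        return "kana"
    if ch.isalnum() or ch == "_":
        return "word"
    return "other"


def ctrl_backspace(text: str, cursor: int) -> str:
    """Delete backwards from cursor to the start of the same-class run."""
    start = cursor
    # Skip whitespace directly before the cursor, as most editors do.
    while start > 0 and char_class(text[start - 1]) == "space":
        start -= 1
    if start > 0:
        cls = char_class(text[start - 1])
        # Walk back while the preceding character has the same class.
        while start > 0 and char_class(text[start - 1]) == cls:
            start -= 1
    return text[:start] + text[cursor:]
```

With the cursor at the end of “这是一句没有任何空格的中文句子Sample”, this removes only “Sample”. A second press would still remove the whole Chinese run, which is case (1) and needs real tokenization.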

Proposed solution

Add Chinese / Japanese tokenization support.

Pressing Ctrl + Backspace at the end of a Chinese / Japanese sentence should delete only the last phrase, not the entire sentence.

Libraries such as jieba (for Chinese) and Konoha (for Japanese) might be useful.
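As a rough idea of what finding "the last phrase" involves, here is a small Python sketch using backward maximum matching against a dictionary. The dictionary `TOY_DICT` and the function names are hypothetical and only cover the sample sentence. Real tokenizers such as jieba rely on much larger dictionaries and more sophisticated methods, so this shows the idea rather than what those libraries do.

```python
# Toy dictionary for illustration only (assumption: a real one is far larger).
TOY_DICT = {"这是", "一句", "没有", "任何", "空格", "的", "中文", "句子", "示例"}


def last_phrase_start(text: str, cursor: int, dictionary=TOY_DICT) -> int:
    """Return the index where the phrase ending at cursor begins.

    Tries the longest candidate first (backward maximum matching) and
    falls back to a single character when no dictionary entry matches.
    """
    if cursor <= 0:
        return 0
    max_len = max(map(len, dictionary), default=1)
    # Try candidates from the longest possible length down to 2 characters.
    for length in range(min(max_len, cursor), 1, -1):
        if text[cursor - length:cursor] in dictionary:
            return cursor - length
    # No multi-character match: treat the last character as its own phrase.
    return cursor - 1


def delete_last_phrase(text: str, cursor: int) -> str:
    """Remove the phrase that ends at cursor, as Ctrl + Backspace would."""
    start = last_phrase_start(text, cursor)
    return text[:start] + text[cursor:]
```

Called at the end of the sample sentence, `delete_last_phrase` removes “示例”, and a second call removes “句子”, which matches the behaviour described for Chrome.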

Current workaround (optional)

None. Press Backspace multiple times to delete a phrase manually.

Related feature requests (optional)

https://forum.obsidian.md/t/chinese-word-segmentation-should-also-be-supported-in-the-editor/13388

https://forum.obsidian.md/t/support-tibetan-language-dont-segment-using-whitespaces-behave-like-cjk/15145/15

Yep, no one’s interested, just like the other feature request I mentioned. I guess switching to something else is easier than implementing this.
