Better word boundary implementation (better double clicking)

Hi, I am a Japanese user of Obsidian.
This is a proposal to improve Obsidian’s double-click behavior, i.e., its word boundary detection.

Use case or problem

The premise is that Japanese, unlike English, does not use spaces to mark word boundaries (the same is true of Chinese and some other languages). In addition, English words are commonly interspersed in Japanese text, again with no spaces at the boundaries between the English and Japanese words.


(He told the BBC that if he had succeeded in landing on the moon, it would have been a “sea change” in commercial involvement in space exploration.)
from 日本企業の月着陸船、月面に衝突か ispace - BBCニュース

In Obsidian’s current implementation, the word boundary detection used for double-click word selection seems to look for the nearest space or punctuation mark on either side of the click. Unfortunately, this is hard to use for Japanese users: the selected range covers almost the entire sentence, and the selection is still wrong when English words are mixed in.

Proposed solution

In contrast, some well-known browsers seem to have excellent word boundary detection. Fortunately, the algorithm is available as a standalone library, ICU, and I believe it can be used in Obsidian. I am not familiar with Obsidian’s implementation language, but if it is C, C++, or Java, you can use the ICU library directly. If it is JavaScript, you can use ICU indirectly through Intl.Segmenter.
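
To illustrate the JavaScript route, here is a minimal sketch of how a double-click handler might ask Intl.Segmenter for the word under the cursor. The function name wordRangeAt and its parameters are hypothetical, not part of Obsidian; it assumes the editor can supply the line text and the click position as a UTF-16 offset, and that the runtime ships Intl.Segmenter with ICU data.

```javascript
// Hypothetical helper (not an Obsidian API): returns the [start, end) range
// of the segment that contains `offset`, or null if `offset` is out of range.
// `locale` defaults to Japanese here only as an example; an editor would
// presumably pass the user's locale instead.
function wordRangeAt(text, offset, locale = "ja") {
  // granularity "word" asks for word-level segmentation, which is
  // dictionary-based for scripts written without spaces.
  const segmenter = new Intl.Segmenter(locale, { granularity: "word" });

  // containing() returns the segment covering the given code unit index,
  // or undefined when the index is outside the string.
  const segment = segmenter.segment(text).containing(offset);
  if (segment === undefined) {
    return null;
  }

  return {
    start: segment.index,
    end: segment.index + segment.segment.length,
    // isWordLike is false for spaces and punctuation, so a caller could
    // decide not to treat such a segment as a word.
    isWordLike: segment.isWordLike,
  };
}

// Example: select the word under offset 7 in an English string.
const range = wordRangeAt("hello world", 7, "en");
console.log(range); // { start: 6, end: 11, isWordLike: true }
```

The same call works on Japanese or mixed Japanese and English text; the exact boundaries then depend on the ICU dictionary bundled with the runtime, so they should match what the browser's own double-click selects rather than any fixed rule.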

Current workaround (optional)

There is no workaround for this.

Related feature requests (optional)
