Ignore accents/diacritics in search

+1

This would be very useful indeed.

3 Likes

Steps to reproduce

  1. Create and save a note with the text café наёмник (note the letters with diacritics). The second word means “mercenary” in Russian, if anyone is interested.
  2. Do a search for cafe.
  3. Do a search for наемник.

Expected result

Search returns the text in the note for steps 2 and 3.

Actual result

Search returns nothing for steps 2 and 3, since it doesn’t consider that the letters in question can be interchanged in certain cases. For Russian there’s only one such case, where the letter е can be used instead of ё, but I’m not sure what should be done for languages like French, with tons of diacritics, or for the occasional French-derived words in English like café, naïve etc.
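For what it's worth, here is a minimal sketch of how both cases above could be folded before comparing. It assumes nothing about how Obsidian's search actually works; `foldDiacritics` is a hypothetical helper name. It relies only on the standard `String.prototype.normalize`: decompose to NFD, then drop the combining marks.

```typescript
// Hypothetical helper, not Obsidian's API: folds a string to a
// diacritic-free, lower-case form for comparison purposes only.
function foldDiacritics(text: string): string {
  return text
    .normalize("NFD")                 // "é" becomes "e" + U+0301, "ё" becomes "е" + U+0308
    .replace(/[\u0300-\u036f]/g, "")  // drop the combining diacritical marks
    .toLowerCase();
}

// Compare folded forms instead of the raw strings.
function noteContains(noteText: string, query: string): boolean {
  return foldDiacritics(noteText).indexOf(foldDiacritics(query)) !== -1;
}

const note = "café наёмник";
console.log(noteContains(note, "cafe"));
console.log(noteContains(note, "наемник"));
```

One caveat with this approach: it folds every decomposable letter, so the Russian й is also reduced to и, which may or may not be what people want.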

Environment

  • Operating system: Windows 10
  • Obsidian version: 0.8.1

Additional information

There’s a feature request concerning this exact behavior but I don’t think it’s a feature, rather it’s a bug.

1 Like

I think it’s a feature request. In some searches you may want to ignore diacritics; other times you may want them.

Up to you, of course, but from what I can tell in the above-mentioned cases any English-speaking person would expect to have his “cafe” recognized and found by default. Same applies with a Russian searching for his “наемник”, this should be the default behavior.

They might want to ignore diacritics in certain cases, e.g. to find instances of undiacriticized (eh) words. Then it becomes an additional feature / option.

3 Likes

I agree that this is something important and we should do it. I agree that we may even change the default behaviour to ignore diacritics. I don’t consider it a bug.

8 Likes

I incorrectly filed a bug report about Different Unicode code points some weeks ago. It’s not Obsidian’s fault, but given the variety of reasons that may lead to these inconsistencies creeping in, I’d prefer search to be more permissive and match characters with and without diacritics.
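To illustrate the code point issue (a sketch, not a description of Obsidian's internals): the same visible letter can be stored either as one precomposed code point or as a base letter plus a combining mark, and the two only compare equal after normalization. `sameText` is a hypothetical name.

```typescript
// Two encodings of the same visible character "é".
const precomposed: string = "\u00e9";  // single code point U+00E9
const decomposed: string = "e\u0301";  // "e" followed by combining acute U+0301

// Hypothetical helper: compares two strings after bringing both to NFC.
function sameText(a: string, b: string): boolean {
  return a.normalize("NFC") === b.normalize("NFC");
}

console.log(precomposed === decomposed);        // raw comparison of code units
console.log(sameText(precomposed, decomposed)); // comparison after normalization
```

This only makes the two encodings of é equivalent to each other; matching é against plain e is a separate folding step.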

2 Likes

Relatedly, I think it makes sense for the fuzzy search in the link suggestions to ignore diacritics.

5 Likes

+1
For those who mostly use non-English languages, this is an essential feature.

12 Likes

We need this.

Yep, here’s a case where I wish the order in autocomplete was better:

[Image: Screen Shot 2021-04-22 at 8.49.17 AM]

4 Likes

Any chance that this might be implemented soon (viz. the default becoming diacritics ignored)?

Hi
In the search tool, can I disable sensitivity to the kind of symbols that go over letters in some languages (^ ¨ ~ ´ `)?

I searched the help docs.

Thank you

+1 Would be great for Vietnamese users :blush:

1 Like

+1. Would help working with texts in Pali (language used in the Buddhist scriptures, like the Pāli Canon)

+1 I need this desperately.

This would be extremely useful for Spanish speakers, not only for Search, but also for Quick Switcher and Internal Links :pray:

Does anyone know why, despite being so important for many non-English-speaking users, this is so difficult to accomplish, or is being ignored?

3 Likes

+1

Other non-English speakers and I would love this feature. I think there may be reasons for some people to keep diacritics. Why not include a toggle in settings for this option?

1 Like

+1! I need this behaviour as well.

1 Like

Use case: Searching for a note with “à” in its name, can’t use “a” in the query

If I have a note that uses special characters such as accents, Obsidian treats these as entirely different characters. This makes it hard to find such notes with the search function if you don’t know the precise spelling. For example:

If you have a note titled “ànima”, you can’t use the query “anima” to find it. This can become even more complex if there are multiple such characters in the name.

Proposed solution

To have a toggle switch next to the search bar (maybe it can be en/disabled in preferences for those who do/n’t need it) that lets you treat special characters the same as their “base” version. So typing a in search would also match à, á, â, ä, æ, ã, å, ā, etc.
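A rough sketch of what such a toggle could look like in code. `SearchOptions`, `ignoreDiacritics` and `matchesQuery` are made-up names for illustration, not Obsidian settings or API; the folding uses the standard NFD normalization.

```typescript
// Hypothetical option bag; the toggle next to the search bar would set this flag.
interface SearchOptions {
  ignoreDiacritics: boolean;
}

// Lower-cases the input and, if the toggle is on, strips combining marks.
function prepare(text: string, options: SearchOptions): string {
  const lower = text.toLowerCase();
  if (!options.ignoreDiacritics) {
    return lower;
  }
  return lower.normalize("NFD").replace(/[\u0300-\u036f]/g, "");
}

// True when the query occurs in the title under the chosen options.
function matchesQuery(title: string, query: string, options: SearchOptions): boolean {
  return prepare(title, options).indexOf(prepare(query, options)) !== -1;
}

console.log(matchesQuery("ànima", "anima", { ignoreDiacritics: true }));
console.log(matchesQuery("ànima", "anima", { ignoreDiacritics: false }));
```

Note that plain normalization covers à, á, â, ä, ã, å and ā, but not æ, which has no decomposition into a plus a mark and would need an explicit mapping.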

One simple way to get a partial list of such variations on macOS is to hold down the key and wait for the popup to show up with the possible variations. For example if you hold the a key for a second, a popup shows up with 8 alternate versions of it.

I suppose in the preferences there could be a list of all the characters that you want to associate with the base character. Because this problem can multiply in complexity as you add languages and character sets, making it user customisable might alleviate the implementation burden. So for a you’d have a text-field in which you can add other variations.
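The user-customisable list could be sketched roughly like this, assuming a simple table from each base character to a string of its variants (the table shape and the names `userVariants`, `buildLookup` and `foldWithTable` are all hypothetical):

```typescript
// Hypothetical preference: base character -> the variants typed into its text field.
const userVariants: Record<string, string> = {
  a: "àáâäæãåā",
};

// Inverts the table into a variant -> base lookup.
// Variants are brought to NFC so each one is a single precomposed character.
function buildLookup(table: Record<string, string>): Map<string, string> {
  const lookup = new Map<string, string>();
  for (const base of Object.keys(table)) {
    const variants = (table[base] ?? "").normalize("NFC");
    for (const variant of variants.split("")) {
      lookup.set(variant, base);
    }
  }
  return lookup;
}

// Replaces every listed variant with its base character; other characters pass through.
function foldWithTable(text: string, lookup: Map<string, string>): string {
  return text
    .normalize("NFC")
    .split("")
    .map((ch) => lookup.get(ch) ?? ch)
    .join("");
}

const lookup = buildLookup(userVariants);
console.log(foldWithTable("ànima", lookup));
```

Unlike pure normalization, a table like this can also cover characters such as æ, at the cost of the user having to maintain it. The sketch splits by UTF-16 code unit, so it only handles variants within the Basic Multilingual Plane.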

Current workaround (optional)

All I can think of now is to not use such characters in the filenames, but this is far from useful.

7 Likes