Ignore accents/diacritics in search

My native language includes diacritics / accents (see https://en.wikipedia.org/wiki/Diacritic). Would it be possible to make it so that search treats characters with diacritics as if they didn’t have them?

For example, the word “maximum” is written “máximo”. When I’m careful I write it correctly; when I’m in a rush I may write “maximo”. Either way, it is reasonable to expect that searching for either spelling returns both. Other applications I use (e.g. recoll) already behave this way.
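To make the request concrete, here is a minimal sketch of diacritic-insensitive matching. It is not how Obsidian's search works; `fold_diacritics` and `matches` are hypothetical names, and the approach (decompose to NFD, drop combining marks, compare case-insensitively) is just one common way to get this behaviour.

```python
import unicodedata

def fold_diacritics(text: str) -> str:
    """Hypothetical helper: decompose to NFD and drop combining marks."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

def matches(query: str, note: str) -> bool:
    """True if the query occurs in the note, ignoring diacritics and case."""
    return fold_diacritics(query).casefold() in fold_diacritics(note).casefold()

# Both spellings find both notes.
for query in ("maximo", "máximo"):
    for note in ("El valor máximo", "El valor maximo"):
        print(query, "|", note, "|", matches(query, note))
```

Folding both the query and the note is what makes the match symmetric: it does not matter which side was typed carelessly.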

Thank you for this great software.

103 Likes

+1

This would be very useful indeed.

3 Likes

Steps to reproduce

  1. Create and save a note with the text café наёмник (note the letters with diacritics). The second word means “mercenary” in Russian, if anyone is interested.
  2. Do a search for cafe.
  3. Do a search for наемник.

Expected result

Search returns the text in the note for steps 2 and 3.

Actual result

Search returns nothing for steps 2 and 3, since it doesn’t consider that the letters in question can be interchanged in certain cases. For Russian there’s only one such case, where the letter е can be used instead of ё, but I’m not sure what should be done for languages like French with tons of diacritics, or for the occasional French-derived words in English like café, naïve etc.
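The steps above can be sketched outside Obsidian as follows. This only illustrates the comparison, under the assumption that search is a plain substring test; `strip_marks` and `RUSSIAN_FOLD` are hypothetical names. It also shows why the French and Russian cases may need different handling: blanket mark stripping changes й as well, while a per-language table touches only the ё/е pair.

```python
import unicodedata

# The note from step 1, written with escapes so the code points are unambiguous:
# \u00e9 is "é" and \u0451 is "ё".
NOTE = "caf\u00e9 на\u0451мник"

def strip_marks(text: str) -> str:
    """Hypothetical helper: decompose to NFD and drop combining marks."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

for query in ("cafe", "наемник"):
    exact = query in NOTE                 # plain substring test, as in the report
    folded = query in strip_marks(NOTE)   # the behaviour the expected result asks for
    print(query, "exact:", exact, "folded:", folded)

# Caveat: blanket stripping also rewrites letters that are distinct in Russian.
# \u0439 is "й", which decomposes to "и" plus a combining breve.
print(strip_marks("\u0439\u043e\u0434"))

# A narrower alternative (an assumption, not an existing setting):
# fold only the pairs a language treats as interchangeable, here ё/е and Ё/Е.
RUSSIAN_FOLD = str.maketrans({"\u0451": "\u0435", "\u0401": "\u0415"})
print("наемник" in NOTE.translate(RUSSIAN_FOLD))
```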

Environment

  • Operating system: Windows 10
  • Obsidian version: 0.8.1

Additional information

There’s a feature request concerning this exact behavior but I don’t think it’s a feature, rather it’s a bug.

1 Like

I think it’s a feature request. In some searches you may want to ignore diacritics; other times you may want them.

Up to you, of course, but from what I can tell in the above-mentioned cases any English-speaking person would expect to have his “cafe” recognized and found by default. Same applies with a Russian searching for his “наемник”, this should be the default behavior.

Someone might want a diacritic-sensitive search in certain cases, e.g. to find only the undiacriticized (eh) spellings of a word. That then becomes an additional feature / option.
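A rough sketch of how the default-plus-option idea could look, assuming a simple substring search over a list of notes. The `ignore_diacritics` parameter and the `search` function are hypothetical, not an existing Obsidian setting or API.

```python
import unicodedata

def fold(text: str) -> str:
    """Hypothetical helper: decompose to NFD and drop combining marks."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

def search(notes, query, ignore_diacritics=True):
    """Substring search; diacritics are ignored by default, strict on request."""
    key = fold if ignore_diacritics else (lambda s: s)
    needle = key(query)
    return [note for note in notes if needle in key(note)]

notes = ["caf\u00e9 au lait", "cafe racer"]   # \u00e9 is "é"

print(search(notes, "cafe"))                            # default: both notes
print(search(notes, "cafe", ignore_diacritics=False))   # strict: only the plain spelling
print(search(notes, "caf\u00e9", ignore_diacritics=False))
```

With the permissive mode as the default, the strict mode covers the case of hunting down words that were typed without their accents.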

3 Likes

I agree that this is something important and we should do it. I agree that we may even change the default behaviour to ignore diacritics. I don’t consider it a bug.

8 Likes

Some weeks ago I incorrectly filed a bug report about “Different Unicode code points”. It’s not Obsidian’s fault, but given the variety of ways these inconsistencies can creep in, I’d prefer search to be more permissive and match characters with and without diacritics.
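For anyone wondering how the same visible text can end up as different code points, here is a small illustration. It is an example of the general Unicode issue, not a reproduction of that particular report: “é” can be stored as one precomposed code point or as “e” followed by a combining accent, and the two only compare equal after normalization.

```python
import unicodedata

composed = "caf\u00e9"      # "é" as a single code point (U+00E9)
decomposed = "cafe\u0301"   # "e" followed by a combining acute accent (U+0301)

# They render the same but are different strings of different lengths.
print(composed == decomposed, len(composed), len(decomposed))

def nfc(text: str) -> str:
    """Normalize to the composed form (NFC)."""
    return unicodedata.normalize("NFC", text)

# After normalizing both sides, the comparison succeeds.
print(nfc(composed) == nfc(decomposed))
```

Text pasted from different sources or typed on different systems can arrive in either form, which is one way such mismatches creep in without the user doing anything wrong.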

2 Likes

Relatedly, I think it makes sense for the fuzzy search in the link suggestions to ignore diacritics.
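As a sketch of what that could mean for link suggestions: the example below uses a simple subsequence match as a stand-in for fuzzy matching, since the actual algorithm Obsidian uses is not described in this thread. `fold`, `is_subsequence` and `suggest` are hypothetical names.

```python
import unicodedata

def fold(text: str) -> str:
    """Hypothetical helper: drop combining marks and ignore case."""
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()

def is_subsequence(query: str, candidate: str) -> bool:
    """True if the characters of query appear in candidate in the same order."""
    remaining = iter(candidate)
    return all(ch in remaining for ch in query)

def suggest(query: str, titles):
    """Return the titles that fuzzily match the query, ignoring diacritics."""
    folded_query = fold(query)
    return [title for title in titles if is_subsequence(folded_query, fold(title))]

titles = ["Pāli Canon", "Máximo común divisor", "Plain notes"]
print(suggest("pali", titles))   # matches only the first title
print(suggest("mxm", titles))    # matches only the second title
```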

5 Likes

+1
For those who mostly use non-English languages, this is an essential feature.

12 Likes

We need this.

Yep, here’s a case where I wish the order in autocomplete was better:

[Image: Screen Shot 2021-04-22 at 8.49.17 AM]

4 Likes

Any chance that this might be implemented soon (viz. the default becoming diacritics ignored)?

Hi
In the search tool, can I disable sensitivity to the kind of symbols that go over letters in some languages: ^¨~´` ?

I searched the help docs.

Thank you

+1 Would be great for Vietnamese users :blush:

1 Like

+1. It would help when working with texts in Pali (the language used in Buddhist scriptures such as the Pāli Canon).

+1 I need this desperately.

This would be extremely useful for Spanish speakers, not only for Search, but also for Quick Switcher and Internal Links :pray:

Does anyone know why, despite being so important for many non-English-speaking users, this is so difficult to accomplish, or why it is being ignored?

3 Likes

+1

Other non-English speakers and I would love this feature. I think some people may have reasons to keep diacritics significant. Why not include a toggle in the settings for this option?

1 Like

+1! I need this behaviour as well.

1 Like