Search feature doesn't distinguish between white space and non-breakable white space

Steps to reproduce

  • In a note, write a few instances of " ?" (standard white space / sp) and " ?" (non-breakable white space / nbsp)
  • Open the Search+Replace tool (or press Ctrl+H) and type " ?" (sp)
  • Notice that all instances of " ?" (sp) and " ?" (nbsp) are flagged

Did you follow the troubleshooting guide? [Y/N]

Y

Expected result

Searching for a white space should not flag non-breakable white space and vice versa.

Actual result

When searching for a white space, the engine also flags NBSP, and vice versa.

Environment

SYSTEM INFO:
Obsidian version: v1.5.11
Installer version: v1.4.16
Operating system: Windows 10 Pro 10.0.19045
Login status: logged in
Catalyst license: none
Insider build toggle: off
Live preview: on
Base theme: adapt to system
Community theme: none
Snippets enabled: 0
Restricted mode: off
Plugins installed: 2
Plugins enabled: 2
1: Kanban v1.5.3
2: Dictionary v2.22.0

RECOMMENDATIONS:
Community plugins: for bugs, please first try updating all your plugins to latest. If still not fixed, please try to make the issue happen in the Sandbox Vault or disable community plugins.


Additional information

This is important for formatting text properly in French, especially when dealing with large amounts of text. Currently, Obsidian offers no way to check whether every punctuation mark (? ! : ;) and number is preceded by an NBSP.

Edit: unintentional emoji instead of punctuation
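
As a stopgap outside Obsidian, the check described above can be sketched in a few lines of Python. This is only an illustration run on a note's raw text, not anything Obsidian exposes; the function names are made up for the example, and it only covers the four punctuation marks listed (not numbers).

```python
import re

NBSP = "\u00a0"  # NO-BREAK SPACE

# A plain space (U+0020) immediately followed by ? ! : or ;
# The lookahead keeps the punctuation mark out of the match itself.
PLAIN_SPACE_BEFORE_PUNCT = re.compile(r"\u0020(?=[?!:;])")


def find_plain_spaces_before_punctuation(text: str) -> list[int]:
    """Return the index of every plain space that precedes ? ! : or ;."""
    return [m.start() for m in PLAIN_SPACE_BEFORE_PUNCT.finditer(text)]


def replace_with_nbsp(text: str) -> str:
    """Swap each of those plain spaces for a no-break space."""
    return PLAIN_SPACE_BEFORE_PUNCT.sub(NBSP, text)


if __name__ == "__main__":
    sample = "Quoi ? Oui\u00a0!"
    print(find_plain_spaces_before_punctuation(sample))
    print(repr(replace_with_nbsp(sample)))
```

The key point is that the pattern names U+0020 explicitly instead of using `\s`, since `\s` would also match the NBSP and hide the very difference being checked.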

I’d like an option or setting for searches to either abstract over different kinds/encodings of whitespace, or not. I’m in the opposite boat to you, @Paranofrecks, since I want to be able to search for “my string” without caring what kind of whitespace separates “my” and “string”. A lot of the text I copy from the web uses non-breaking white space (U+00A0), and I found today that a global search for “my string” typed with a normal space (U+0020) doesn’t return the instances that contain a non-breaking white space.

Also worth noting there are two kinds of search involved here. One is the in-note ctrl+f, which seems to abstract (correctly, for my use case) over different kinds of whitespace. The other is the global Search core plugin that searches in all files, which doesn’t abstract over different kinds of whitespace. So it’d be nice if both of these types of searches had settings to turn on/off the abstraction.
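
To make the two behaviours concrete, here is a rough Python sketch of what such a toggle could look like. It is not how Obsidian implements either search; `build_pattern` and `fold_whitespace` are hypothetical names, and only U+0020 and U+00A0 are treated as interchangeable.

```python
import re

NBSP = "\u00a0"  # NO-BREAK SPACE


def build_pattern(query: str, fold_whitespace: bool) -> re.Pattern:
    """Compile a literal search for `query`.

    fold_whitespace=False: every character must match exactly, so a plain
    space never matches a no-break space (and vice versa).
    fold_whitespace=True: any space in the query matches either kind.
    """
    if not fold_whitespace:
        return re.compile(re.escape(query))
    # Treat both kinds of space in the query as the same separator.
    parts = query.replace(NBSP, " ").split(" ")
    either_space = r"[\u0020\u00a0]"
    return re.compile(either_space.join(re.escape(p) for p in parts))


if __name__ == "__main__":
    text = "my string, my\u00a0string"
    print(len(build_pattern("my string", fold_whitespace=False).findall(text)))
    print(len(build_pattern("my string", fold_whitespace=True).findall(text)))
```

With the flag off, the search behaves the way the original report asks for; with it on, it behaves the way I’d want for pasted web text.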

I don’t see a bug here. A couple of remarks:

Regarding single-file search, there is no concept of whole-word or even exact-match search. There is an open FR for that: https://forum.obsidian.md/t/improve-single-file-find-search-by-adding-selection-casing-whole-word-regexp/29660

Regarding global search, "this text" is used for exact matching.

@WhiteNoise, so if I global search for “my string”, and there are instances of “my string” in my vault that use different kinds of spaces, and as a result I don’t see all instances of “my string”, that’s not a bug? I would expect Obsidian to abstract over all kinds of spaces, regardless of their encoding (which, again, is how the within-note ctrl+f works currently), or to have an option/setting somewhere for whether to care about the specific encoding.

In my case, it seems Obsidian is adding the nbsp when pasting certain things from the web that get formatted, like hyperlinks, but other things as well. I’ve tried various community plugins in an effort to prevent this pasting behavior, but no luck.

For now I’m just using a plugin that makes ctrl+v always behave like ctrl+shift+v (no formatting), but it would be nice to have some formatting preserved when pasting, without it adding extra exotic whitespace like nbsp.
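
For text that has already been pasted, a cleanup pass outside Obsidian is another workaround. A minimal Python sketch, assuming the only unwanted characters are U+00A0 and the narrow no-break space U+202F (the function name is just for illustration, and this would undo deliberate French NBSPs too):

```python
# Map the no-break variants to a plain space (U+0020).
# Assumption: only these two code points need replacing.
_SPACE_TABLE = str.maketrans({"\u00a0": " ", "\u202f": " "})


def normalize_spaces(text: str) -> str:
    """Return `text` with no-break spaces replaced by plain spaces."""
    return text.translate(_SPACE_TABLE)


if __name__ == "__main__":
    pasted = "my\u00a0string from\u202fthe web"
    print(repr(normalize_spaces(pasted)))
```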