When I search my vault for software should be, I get a lot of results for “best,” “been,” “be,” “better,” etc. This is obviously nothing like what I was looking for. I have a note with that exact text in it, but I can’t get it to come up unless I put quotes around the search term (or scroll through who knows how many results, I guess).
Proposed solution
Default AND search. Google was the first to do this in Internet searches, leading to it replacing all previous search engines. If you want OR searches too, fine, I guess, but those should appear below exact results—and even then, shouldn’t pages with more hits appear first?
It already does (unless something is broken). But the AND applies to whole notes if not otherwise specified, and by default the results show a preview of each match in the note. Unless there’s a bug, I’d guess that the notes with “been”, etc. in them also contain “software” and “should” (search results are grouped by file, so it should be easy to check). There are ways to force the words to all be in the same section or block, but they require more typing than just adding quotes. You can also wrap “be” in quotes to force an exact match.
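To make the distinction concrete, here is a minimal sketch of the difference between a substring AND match and a whole-word AND match. It is an illustration only, not Obsidian's actual implementation; the function names `substring_and` and `whole_word_and` are made up for this example.

```python
import re


def substring_and(note: str, query: str) -> bool:
    """True if every query word occurs anywhere in the note,
    even inside a longer word (so "be" matches "been")."""
    text = note.lower()
    return all(word in text for word in query.lower().split())


def whole_word_and(note: str, query: str) -> bool:
    """True only if every query word occurs as a whole word."""
    text = note.lower()
    return all(
        re.search(rf"\b{re.escape(word)}\b", text) is not None
        for word in query.lower().split()
    )


# Hypothetical note text, chosen to contain "been" and "better" but not "be".
note = "This software should have been better."
print(substring_and(note, "software should be"))
print(whole_word_and(note, "software should be"))
```

Under the substring rule the note matches because “be” is found inside “been”; under the whole-word rule it does not, which is roughly the behavior the original post was expecting.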
You’re saying this would be your preference as default, which is fair. But you can already do all these things with the syntax provided. For now I’m moving this to Help. Changing the defaults of how search works isn’t really going to be considered.
There is another feature request somewhere about better priority of results, which is a different story. I suggest searching for it and adding your input there, because yeah, there could be room for improvement.
Otherwise, this thread is open for suggestions for how to better format your search.
It’s about the priority and purpose of the space in search criteria. In natural language it serves as a word adjacency operator (as can be seen in this sentence), whereas starting with Alta Vista and continuing through Yahoo!, Ask Jeeves, and now either Google or DuckDuckGo, it has been downgraded to an implicit OR operator. For those of us who work(ed) in text retrieval and good search systems, the natural-language version is preferable and the best choice.
That request was a little buried — I overlooked it myself (the sentence primarily asks for it as default). It might be worth making a separate request for it, as this one has gotten a bit muddled.
I’m embarrassed. You’re right. It was just a coincidence that the patterns I tested matched my most recent files, and I sort by modified time. Sorry folks!
EDITED MY POST: I think I might be wrong. After more testing, I think I get what you were originally saying, @Calion. Apologies for not getting it right away. I was too focused on the syntax.
But now I see that it IS nearly impossible to search for a note that contains 3 separate words, and to get it only if it contains all 3 of those words. Quotes only work if the phrase is contiguous. That isn’t about relevancy. It’s difficult just to filter things out.
I’ve been testing Omnisearch to see if it helps, but for example, if I search “rare” and other terms, I get a lot of results for “are”. But the description says it has smarter weighting. So I’m going to spend some time experimenting with the plugin’s settings. It has a lot of options. There may be other search plugins too.
It baffled me before Google why this wasn’t the default, and it baffles me now that it is no longer the default. I understand some fuzziness for misspellings, different word orders, and even the possibility of not having every word in the results, but I completely fail to understand the thinking behind using space as an implicit OR. Does anyone like the results that gives?
@Calion Good idea! So, to be clear, are you suggesting that the results appear in the following order?
1. Notes that contain exact matches (if they exist)
2. Notes that contain each and every word in the search phrase (using the implicit AND)
3. Notes that contain some or even just one of the words in the search phrase (using the implicit OR)
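As a rough sketch of what that ordering could look like, assuming whole-word matching and a simple in-memory dict of notes (the names `tier` and `rank` and the sample notes are hypothetical, and none of this reflects how Obsidian actually works):

```python
import re


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into whole words."""
    return re.findall(r"\w+", text.lower())


def tier(note: str, query: str) -> int:
    """1 = exact phrase, 2 = all words present, 3 = some words present, 4 = no match."""
    words = tokenize(query)
    if not words:
        return 4
    note_words = tokenize(note)
    n = len(words)
    # Exact phrase: the query words appear contiguously and in order.
    if any(note_words[i:i + n] == words for i in range(len(note_words) - n + 1)):
        return 1
    wanted = set(words)
    hits = len(wanted & set(note_words))
    if hits == len(wanted):
        return 2
    if hits > 0:
        return 3
    return 4


def rank(notes: dict[str, str], query: str) -> list[str]:
    """Return matching note names, best tier first (ties broken by name)."""
    scored = sorted((tier(body, query), name) for name, body in notes.items())
    return [name for t, name in scored if t < 4]


# Hypothetical vault contents for illustration.
notes = {
    "a.md": "Good software should be simple.",
    "b.md": "It should be noted that software is hard.",
    "c.md": "Software has been better.",
    "d.md": "Nothing relevant here.",
}
print(rank(notes, "software should be"))
```

Here a.md lands in tier 1 (exact phrase), b.md in tier 2 (all words, not adjacent), c.md in tier 3 (only “software” as a whole word), and d.md is dropped.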
Additionally, I was wondering if it might be helpful to add another set between 1 and 2. These could be notes that contain all search words (using the implicit AND) that also contain partial exact phrase matches.
Of course, once we start imagining the perfect search algorithm, many rabbit holes open up. But in software such as Obsidian, which doesn’t impose the need for strict organization via folders, search should be our best friend. I would find the partial phrase match, in conjunction with the suggested order above, especially useful at times when I am querying some phrase I remember writing but cannot recall exactly.
Come to think of it, maybe another set between 2 and 3 could use the implicit OR, but weighted to match those with more of the search words and also weighted to favor those with the largest partial exact phrase matches.
Anyways, I really appreciate the request and wish it good luck!
So this is my current understanding (and apologies if I’m repeating what you all were saying the whole time): search results are determined only by whether or not the query is true for a note, and are then sorted by your sort settings. (Using the block:(), line:(), or section:() operators can help narrow things down by proximity, but it still doesn’t change the sorting.)
For example, if I search “anything can be everything” (no quotes), I have a book with that exact phrase. My results are all notes that contain ALL those words, which is logically correct. But they do not get sorted by whether they are contiguous or not, or how close in proximity they are. They only get sorted by the sorting option in the search. So that book does not show at the top of my results.
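For what it’s worth, here is a minimal sketch of the kind of proximity sorting that is missing: among notes that contain all the words, put the ones where the words sit closest together first. This is only an assumption about how it could be done, not a description of Obsidian or Omnisearch; `smallest_span`, `sort_by_proximity`, and the sample notes are made up for the example.

```python
import re


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into whole words."""
    return re.findall(r"\w+", text.lower())


def smallest_span(note: str, query: str):
    """Length (in words) of the shortest window of the note that contains
    every query word, or None if some word is missing. Word order is ignored."""
    terms = set(tokenize(query))
    if not terms:
        return None
    words = tokenize(note)
    counts: dict[str, int] = {}  # query words inside the current window
    best = None
    left = 0
    for right, word in enumerate(words):
        if word in terms:
            counts[word] = counts.get(word, 0) + 1
        # Shrink the window from the left while it still covers all terms.
        while len(counts) == len(terms):
            width = right - left + 1
            if best is None or width < best:
                best = width
            leaving = words[left]
            if leaving in counts:
                counts[leaving] -= 1
                if counts[leaving] == 0:
                    del counts[leaving]
            left += 1
    return best


def sort_by_proximity(notes: dict[str, str], query: str) -> list[str]:
    """Names of notes containing all query words, tightest match first."""
    spans = {name: smallest_span(body, query) for name, body in notes.items()}
    matched = [name for name, span in spans.items() if span is not None]
    return sorted(matched, key=lambda name: spans[name])


# Hypothetical notes for illustration.
notes = {
    "scattered.md": "Everything can change, and anything you plan may be lost.",
    "book.md": "Anything can be everything, she wrote.",
    "partial.md": "Anything can happen.",
}
print(sort_by_proximity(notes, "anything can be everything"))
```

With this ordering the note holding the contiguous phrase comes first, the note with the same words scattered around comes second, and the note missing a word is filtered out by the AND logic as before.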
I’m still not sure about Omnisearch. It has some relevancy sorting, but I can still get results that don’t weight by order/proximity inside content. It seems more weighted by title and header.
As someone who used to work in text retrieval, I disagree, because as I understand it this is a request to treat the space as a word adjacency operator, not as an implicit OR. The other suggestions (that word-adjacency hits are sorted before AND-in-the-same-document hits, which come before hits where either but not both operands are found, which come before hits where only one term is found) are not also a request for relevance ranking. Unless one knows and understands the algorithm being used, relevance ranking is rank.