Inconsistent/not-very-useful tag parsing

TehShrike · July 22, 2020, 9:01pm

This issue has two parts.

First: the tag parser is too aggressive about what counts as a boundary character. #web_app and #web/viral both get parsed as the #web tag, which doesn’t feel reasonable. Why not keep reading until you hit whitespace, plus maybe a couple of other explicit special characters if necessary?
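
To make the suggestion concrete, here is a rough sketch of the "read until whitespace" rule in Python. This is not how the app actually parses tags; the `parse_tags` name and the exact set of terminator characters are my own assumptions, picked only for illustration.

```python
import re
from typing import List

# Assumed terminator set: characters that end a tag besides whitespace.
# '#' is included so that runs like "##" are not read as tags.
TERMINATORS = ",;!?()[]{}#"

# A tag starts at '#' (at the start of the text or after whitespace) and
# runs until whitespace or one of the assumed terminators.
TAG_RE = re.compile(r"(?<!\S)#([^\s" + re.escape(TERMINATORS) + r"]+)")


def parse_tags(text: str) -> List[str]:
    """Return the tags found in text, in order, without the leading '#'."""
    return TAG_RE.findall(text)
```

With a rule like this, `parse_tags("notes on #web_app and #web/viral, plus #web")` gives three distinct tags, `web_app`, `web/viral` and `web`, instead of three copies of `web`.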

Second: the tag search is pretty useless right now if you have a lot of tags. Clicking on a #web tag also brings up files tagged with #webapp, #web_app, etc.
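
What I would expect instead is an exact match on the tag. A minimal sketch, assuming the parsed tags are available as a mapping from file path to a set of tag names (the `tags_by_file` structure and the `files_with_tag` helper are hypothetical, not the app's API):

```python
from typing import Dict, List, Set


def files_with_tag(tags_by_file: Dict[str, Set[str]], tag: str) -> List[str]:
    """Return the files whose tag set contains `tag` exactly.

    Set membership compares whole tag names, so "web" does not match
    "webapp" or "web_app".
    """
    return sorted(path for path, tags in tags_by_file.items() if tag in tags)
```

For example, with `{"a.md": {"web"}, "b.md": {"webapp"}, "c.md": {"web", "web_app"}}`, searching for `web` would return only `a.md` and `c.md`.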

Related: Tag search is inconsistently fuzzy

Bonus request: since searching takes a while and tags are already cached, it’s pretty frustrating that a tag search requires a full-text search of every page instead of a constant-time lookup against the previously parsed tags.
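
Roughly what I have in mind is an inverted index from tag to files, built from the tags that are already cached. The `TagIndex` class below is a hypothetical sketch, not the app's actual cache; it only shows that finding the files for a tag can be a dictionary lookup (constant time on average) rather than a scan of every page.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set


class TagIndex:
    """Hypothetical inverted index: tag name -> set of file paths."""

    def __init__(self) -> None:
        self._files_by_tag: Dict[str, Set[str]] = defaultdict(set)

    def add_file(self, path: str, tags: Iterable[str]) -> None:
        """Record the already-parsed tags for one file."""
        for tag in tags:
            self._files_by_tag[tag].add(path)

    def files_for(self, tag: str) -> Set[str]:
        """Look up files by exact tag without reading any page text."""
        # .get avoids creating empty entries for unknown tags; the copy
        # keeps callers from mutating the index.
        return set(self._files_by_tag.get(tag, ()))
```

The index would be updated whenever a file's tags are re-parsed, so a click on a tag never needs to touch the page contents.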