Find highlighted text in my vault

Hello,

I can’t manage to write a query that extracts my highlighted text,
e.g. the text between <span style="background:#fff88f"> and </span>.
Ideally I want to create a dataviewjs block with my search results and some extracted data for each note…

I edited your post to make your example code visible. You need to mark things like that as code so that we can read them.

This post provides a starting point you could modify to get what you want:

Thanks @CawlinTeffid, I tested your proposal but my query returns no results:
/<span style="background:#fff88f">[^(<span style="background:#fff88f">)]*</span>

I think this should work: <span style="background:#fff88f">[^(</span>)]*</span> (but I haven’t tested it).

The middle part is meant to be “zero or more characters that aren’t the end tag”; you had the start tag there instead of the end.
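
A note on that middle part, as a minimal sketch in plain JavaScript (the regular expression flavor dataviewjs runs on): square brackets define a character class, so `[^(</span>)]` excludes each listed character on its own rather than the sequence `</span>`. The two sample lines are invented for the illustration, and the pattern is built with `new RegExp` from a string so no slash needs escaping here.

```javascript
// The pattern suggested above, built from a string so the slashes need no escaping.
const pattern = new RegExp('<span style="background:#fff88f">[^(</span>)]*</span>');

// Hypothetical sample lines, invented for this illustration.
const matchesAsHoped = '<span style="background:#fff88f">we did it</span>';
const failsUnexpectedly = '<span style="background:#fff88f">an important idea</span>';

// [^(</span>)] is a negated character class: it rejects each of the characters
// ( < / s p a n > ) individually, not the sequence "</span>".
const firstResult = pattern.test(matchesAsHoped); // highlighted text has none of those characters
const secondResult = pattern.test(failsUnexpectedly); // highlighted text starts with "a", which is excluded

console.log(firstResult, secondResult);
```

If this reading is right, the pattern only matches highlights whose text happens to contain none of the letters s, p, a, n (nor the punctuation in the class), which would explain a search that comes back empty on ordinary text.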

Nope, same result… :thinking:

Maybe some of the punctuation characters need to be escaped? I don’t remember if any of those do anything in regular expressions, but in the Help site’s page about Search there’s a link to a page that describes the regular expression flavor it uses.

Also make sure you don’t accidentally have curly quotes in the search instead of straight ones. (I don’t see any here.)

I tried several variations without success… I’m changing my strategy and will use an inline field instead. It’s less practical, but more likely to work.
Thanks for your help :wink:

I found something we overlooked: you need to escape the slashes. I realized it after I noticed I hadn’t included the slashes at the ends that mark regular expressions for Search. So the actual thing to put in the search box (if I haven’t missed something else) is this:

/<span style="background:#fff88f">[^(<\/span>)]*<\/span>/

Here is a screenshot in which I try to search a regular expression containing a slash without escaping it. Notice how the text under the search box describes 2 search patterns instead of 1. That’s because the regular expression ends at the second slash, and everything after is considered a new search pattern.
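
To make the effect described above concrete, here is a small illustration in JavaScript. `splitAtClosingSlash` is a hypothetical helper written for this post, not Obsidian’s actual search parser; it only mimics the described behavior, where a `/regex/` term ends at the first slash that is not preceded by a backslash.

```javascript
// Illustration only: a hypothetical helper, not Obsidian's real parser.
// It finds where a /regex/ search term ends and what is left over after it.
function splitAtClosingSlash(input) {
  if (!input.startsWith("/")) return null;
  for (let i = 1; i < input.length; i++) {
    if (input[i] === "\\") {
      i++; // skip the character that the backslash escapes
      continue;
    }
    if (input[i] === "/") {
      return { regexBody: input.slice(1, i), rest: input.slice(i + 1) };
    }
  }
  return null; // no closing slash found
}

// Without escaping, the slash inside </span> ends the regex early.
const unescaped = splitAtClosingSlash('/<span style="background:#fff88f">.*</span>/');
// With \/ the regex runs to the final slash and nothing is left over.
const escaped = splitAtClosingSlash(String.raw`/<span style="background:#fff88f">.*<\/span>/`);

console.log(unescaped.rest); // leftover text that would be read as a second pattern
console.log(escaped.rest); // empty string
```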

Why do you feel inline fields are less practical compared to a <span> tag? When looking at this example:

You could either <span style="background: #fff88f">span html
tags</span>, or [highlight:: inline fields], or ==markdown 
highlight== to highlight bits and pieces of text.

I feel that, next to using == around the text, inline fields are the neater option. In the answer linked below I showcase how this could be done for multiple variants of highlights related to ideas, questions and explanations, with the CSS snippet to make it work.

And here are some pros and cons for the various options:

  • Using span html tags:
    • Span allows for arbitrarily styling within the note itself
    • Some don’t like the excessive markup needed to style spans
    • You need to use regex searches, or javascript content reading scripts to extract the highlights
  • Using ==highlight==:
    • Included by default
    • Can be styled, but there is just “one” type of styling across the vault by default
    • Very unobtrusive in the text, but is still visible enough in source mode
  • Using inline fields:
    • Not very obtrusive in the text
    • Allows for easy querying across files
    • With multiple fields, you can have several groups of highlights, if that’s a desire
    • Requires a one time setup of the CSS for each group of highlight fields
  • Using tasks (as discussed in other text):
    • Is the only option that links back from the query result to the source (apart from searching for span html tags)
    • Requires the highlighted text to be on a line of its own
    • Easy to query
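
As a rough comparison of what a “content reading script” has to do for the first three notations, here is a minimal JavaScript sketch. The sample text is modelled on the example above, the field name `highlight` is simply the one used there, and the function name `extractHighlights` is made up for this post. It is plain JavaScript with no Obsidian or Dataview calls, so it only shows the pattern matching part.

```javascript
// Hypothetical sample note text, modelled on the three notations in this thread.
const note = [
  'You could either <span style="background: #fff88f">span html',
  'tags</span>, or [highlight:: inline fields], or ==markdown',
  'highlight== to highlight bits and pieces of text.',
].join("\n");

// One pattern per notation; [\s\S] lets a highlight continue across a line break.
const patterns = {
  span: /<span style="background:\s*#fff88f">([\s\S]*?)<\/span>/g,
  mark: /==([\s\S]+?)==/g,
  field: /\[highlight::\s*([^\]]+)\]/g, // "highlight" is just the field name from the example
};

// Collect the captured text for each notation, collapsing line breaks into spaces.
function extractHighlights(text) {
  const result = {};
  for (const [kind, pattern] of Object.entries(patterns)) {
    result[kind] = [...text.matchAll(pattern)].map((m) =>
      m[1].replace(/\s+/g, " ").trim()
    );
  }
  return result;
}

const found = extractHighlights(note);
console.log(found);
```

With Dataview the inline field variant normally needs no regex at all, since the field is already indexed; the pattern is included here only to put the three notations side by side.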

Hello @CawlinTeffid, I have already tried escaping the slashes, with no success.
If I try with just the first part of the regex, the query returns results (though not very clean ones…).
If I try to extract just the text between the opening “span” and the closing one, the regex does not work.

Hello @holroy,

You’re right. Ideally I want to extract from my notes the text that is important to me:
quotes, controversies, ideas…
That’s possible with the classic highlight (i.e. ==), but if I want to select the type of highlight by color, I need a regex query (one per color) with the span sequence, and that is clearly difficult to write…
So the last option is inline fields… I use emojis for that, and querying with dataview is really easy compared to a regex query! But you’re right, for reading it’s better to use highlights…
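
For what it’s worth, one regex per color may not be necessary if the color itself is captured. Below is a minimal JavaScript sketch under that assumption: `groupHighlightsByColor` is a hypothetical helper, the second color is invented for the example, and the pattern assumes the exact `style="background:#xxxxxx"` form used in this thread. The dataviewjs usage is left as a comment because it is untested.

```javascript
// Captures the color in group 1 and the highlighted text in group 2.
// Assumes the exact style="background:#xxxxxx" form used in this thread.
const spanPattern = /<span style="background:\s*(#[0-9a-fA-F]{3,8})">([\s\S]*?)<\/span>/g;

// Returns an object mapping each color to the list of texts highlighted in it.
function groupHighlightsByColor(text) {
  const groups = {};
  for (const [, color, content] of text.matchAll(spanPattern)) {
    const key = color.toLowerCase();
    if (!groups[key]) groups[key] = [];
    groups[key].push(content.trim());
  }
  return groups;
}

// Hypothetical note content; the second color is invented for the example.
const sample =
  'A <span style="background:#fff88f">key idea</span>, then ' +
  'a <span style="background:#ff5582">controversy</span> and ' +
  'one more <span style="background:#fff88f">idea</span>.';

const byColor = groupHighlightsByColor(sample);
console.log(byColor);

// Inside a dataviewjs block, the same function could be fed with note contents,
// for example (untested sketch relying on Dataview's dv.pages and dv.io.load):
//   for (const page of dv.pages()) {
//     const text = await dv.io.load(page.file.path);
//     const yellow = groupHighlightsByColor(text)["#fff88f"] ?? [];
//     if (yellow.length) { dv.header(4, page.file.name); dv.list(yellow); }
//   }
```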

Are both ends of the span on the same line? That is, with no returns between the opening and closing tag?

I got this simplified version to work:

/<span style="background:#fff88f">.*<\/span>/

It has this flaw: if you have more than one of these spans on the same line, the search will match from the opening of the first one to the end of the last one, including everything between (that’s what the original middle part was trying to prevent). I believe it’s possible to avoid that problem, but maybe this is good enough.
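
One common way to avoid that flaw is a lazy quantifier: `.*?` instead of `.*` stops at the first closing tag rather than the last. The sketch below shows the difference in plain JavaScript on an invented sample line. In the search box the equivalent would presumably be `/<span style="background:#fff88f">.*?<\/span>/`, but I haven’t verified how Search displays the results.

```javascript
// Hypothetical line containing two highlights.
const line =
  'A <span style="background:#fff88f">first</span> and ' +
  'a <span style="background:#fff88f">second</span> highlight.';

// Greedy: .* runs to the LAST </span> on the line.
const greedy = /<span style="background:#fff88f">.*<\/span>/;
// Lazy: .*? stops at the FIRST </span> after each opening tag.
const lazy = /<span style="background:#fff88f">(.*?)<\/span>/g;

const greedyMatch = line.match(greedy)[0];
const lazyMatches = [...line.matchAll(lazy)].map((m) => m[1]);

console.log(greedyMatch); // one long match covering both spans and the text between them
console.log(lazyMatches); // one entry per highlight
```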

You’re right, the regex works! And you’re right again: if I have two highlights in a row, they are merged into a single match.

But wouldn’t the best solution then be to use inline fields with emojis? As shown in that other thread it’s easy to make these fields appear as highlights (with or without the emojis showing).

Hello @holroy, I’m so dumb… I hadn’t read your proposal… Yes, the thread offers some ideas to improve my process, thanks!

Many thanks @holroy, the combination of your CSS, inline fields and Templater is very useful for managing my highlights.
