Extract highlights from the text and put them at the beginning of the same note

I love the app. As a lazy cat, I like to automate the dull part of my work as much as possible. I use the app for my notes but even more for tons of articles I read and highlight. While it was great to make profiles of people or issues and dataview quotes from across the whole base I collected and processed, I’m eager to solve the following issue.

I highlight ideas in a note with Highlightr (like <mark style="background: #FBC119;"> </mark>). I don’t mind using the native syntax (== ==) if that makes it easier to get what I want. What I want is a dataview/dataviewjs/js snippet that extracts the highlights from a note and puts them at the beginning of the same note as a sort of summary/key takeaways.

I tried to use regexmatch(file.text, "<mark style="background: #FBC119;">(.*?)") but without success.

The Highlightr version is preferable, because it allows different colours for different kinds of ideas (like yellow for ideas, red for questionable thoughts, and green for explanations). There’s an Extract Highlights plugin, but it involves some manual work, whereas I want code in a template that runs without needing my attention.

Where did you try to use that regexmatch() of yours? And does it really have the full content of the file available in file.text?

I’ve got two variants which I would consider in a similar use case: decorated tasks, or inline fields. With dataview, both of these are easy to gather into a list at the top of the document without disturbing/rewriting the file itself.

Using tasks

In the Minimal theme, and quite a few others, you can insert various status characters and have them styled for various purposes. This could be used to highlight stuff, but the text needs to be moved onto its own line, like in:

- [I] Use a bucket when collecting water
- [?] Can you boil water in a paper bag?
- [!] You can boil water in a paper bag since the water will gather the heat faster than the paper...

Which will display as:

These could be gathered in a query like (where you could group and sort according to your requirements):

```dataview
TASK
WHERE file = this.file
  AND contains(list("?", "!", "I"), status)
GROUP BY status
```

A benefit of this variant is that, if your document is long, the queried task list provides a link back to the exact place where each task was defined, so you can quickly see the context of that particular highlight.
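To make the idea a bit more concrete outside of Obsidian, here is a rough sketch in plain JavaScript of what the query does conceptually: pick out task lines with the wanted status characters and remember where they came from. The function name `groupTasksByStatus` and the sample note are made up for illustration; this is not how dataview does it internally (it uses cached metadata).

```javascript
// Hypothetical helper (not part of Dataview): group "- [x] text" lines
// by their status character and keep the 1-based line number of each,
// which plays the role of the "link back" to the definition.
function groupTasksByStatus(noteText, wanted = ["?", "!", "I"]) {
  const groups = {};
  noteText.split("\n").forEach((line, index) => {
    const match = line.match(/^\s*[-*+] \[(.)\] (.*)$/);
    if (!match || !wanted.includes(match[1])) return; // not a wanted task
    const status = match[1];
    if (!groups[status]) groups[status] = [];
    groups[status].push({ line: index + 1, text: match[2] });
  });
  return groups;
}

const note = [
  "Some intro paragraph.",
  "- [I] Use a bucket when collecting water",
  "- [ ] An ordinary open task",
  "- [?] Can you boil water in a paper bag?",
].join("\n");

console.log(groupTasksByStatus(note));
```

The ordinary `[ ]` task is skipped because its status is not in the wanted list, just like the `contains(list(...), status)` condition in the query.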

Using inline fields

Another option I would consider would be to use inline fields with custom styling. This could look something like:

Someone said that [idea:: collecting water in a bucket] would be the best way forward. I argued that a paper bag could also work. Someone then asked [question:: can you boil water in a paper bag], to which I responded that [explanation:: you can boil water in a paper bag since the water will gather the heat faster than the paper... ]. Interesting they said.

With the CSS below, and a switch to light mode (since it looked terrible in dark mode :smiley: ), this displays as:

This used this CSS:

body .dataview .inline-field-standalone-value, 
body .dataview .inline-field-value {
  background: var(--color-base-00); /* */
  /* backdrop-filter: brightness(10%); /* */
  font-size: var(--font-adaptive-normal);
  color: var(--text-normal);
  
} /* */

.inline-field-key[data-dv-key="idea"],
.inline-field-key[data-dv-key="explanation"],
.inline-field-key[data-dv-key="question"] {
 display: none;
}

.inline-field-key[data-dv-key="idea"] + .inline-field-value {
  background-color: var(--color-yellow);
  font-weight: bold;
}

.inline-field-key[data-dv-key="question"] + .inline-field-value {
  background-color: var(--color-red);
  font-style: italic;
}
.inline-field-key[data-dv-key="explanation"] + .inline-field-value {
  background-color: var(--color-green);
  font-style: italic;
}

You could (and should :smiley: ) change the styling to match your preferences.

The rude and crude version to get all of these highlights:

```dataview
LIST WITHOUT ID anElement
WHERE file = this.file
FLATTEN flat(list(idea, question, explanation)) as anElement
```

Which would just lump everything together. With a little more care, you could either make separate lists for each of the variants, or manipulate the field collation using something like the following:

```dataview
LIST WITHOUT ID anElement
WHERE file = this.file
FLATTEN flat(list(
  map(flat(list(idea)), (m) => "Idea: " + m), 
  map(flat(list(question)), (m) => "Question: " + m),
  map(flat(list(explanation)), (m) => "Explanation: " + m)
  )) as anElement
SORT anElement
```

( If you need, I could try to explain this monstrosity a little in another post. The main trick is that doing flat(list( ... )) on something ensures that even if it’s just one element, we work with it as a single-level list )
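As a rough analogy for that trick in plain JavaScript (an illustration only, not how dataview implements `flat` or `list`; the helper names `asFlatList` and `label` are made up): wrapping a value in an array and flattening one level gives a single-level list whether the field held one value or several.

```javascript
// Analogy only: wrap a field value in a list, then flatten one level.
// A single value becomes a one-element list; a list stays a flat list.
function asFlatList(value) {
  return [value].flat();
}

// Prefix every element with a label, mirroring the map(...) calls above.
function label(prefix, value) {
  return asFlatList(value).map((m) => prefix + ": " + m);
}

const collated = [
  label("Idea", "collecting water in a bucket"),          // single value
  label("Question", ["can you boil water?", "how long?"]), // several values
].flat().sort();

console.log(asFlatList("one idea"));
console.log(asFlatList(["idea A", "idea B"]));
console.log(collated);
```

Without the wrapping step, the `map` would fail on a field that occurs only once in the note, since a single occurrence is a plain value rather than a list.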


In summary I think my preferred method would be to use tasks since they provide links back to their definition, but they do require splitting out their text on a separate line. Alternatively use the inline fields (with some better styling :slight_smile: ).

I wouldn’t use Highlightr, since a query then would require the reading of the full file, instead of using cached data like the solutions above, which would be slightly more expensive. Not to mention the hassle with getting the regex correct.
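For what it’s worth, if you do go the regex route, here is a minimal sketch in plain JavaScript of what the matching could look like. It assumes the highlights look exactly like the example in the first post (a `<mark style="background: #...;">` tag with a hex colour) or the native `==...==` form, and that a highlight never spans several lines. The function name `extractHighlights` is made up, and you would still need some way to read the raw note text, which is the expensive part.

```javascript
// Hypothetical helper (not part of Dataview or Obsidian): collect
// highlighted spans from raw note text. Assumes Highlightr-style
// <mark style="background: #RRGGBB;">...</mark> or native ==...==.
function extractHighlights(text) {
  const pattern =
    /<mark\s+style="background:\s*(#[0-9A-Fa-f]{3,8});?">(.*?)<\/mark>|==(.+?)==/g;
  const highlights = [];
  for (const match of text.matchAll(pattern)) {
    if (match[2] !== undefined) {
      // <mark> variant: keep the colour so ideas/questions can be told apart
      highlights.push({ color: match[1].toUpperCase(), text: match[2].trim() });
    } else {
      // native ==...== variant has no colour
      highlights.push({ color: null, text: match[3].trim() });
    }
  }
  return highlights;
}

const sample = [
  'Use <mark style="background: #FBC119;">a bucket</mark> or <mark style="background: #FF0000;">a paper bag</mark>.',
  "A native ==highlight== works too.",
].join("\n");

console.log(extractHighlights(sample));
```

Note the non-greedy `(.*?)`: with a greedy `(.*)`, two highlights on the same line would be swallowed into one match.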


You could also use a search query like this, if you don’t mind not using dataview:

```query
/(?<=<mark\sstyle="background:\s#FBC119;">)(.*?)(?=<\/mark>)/ file:"17-01-2024 me"
```

Thank you for your reply (time and labour). The task variant seems more suitable. If I find no dataview or JS, I’ll stay with your proposal.

The variant with inline fields (idea:: ) is one that I considered. But I already use it for collecting quotes from across the whole base into profiles of people and organizations (see my template below). In my case, the problem is that Obsidian doesn’t seem to allow multiple tags on the same paragraph.

BTW, is there a way (function) to get the surname from the YAML, add “Q_” before it, and feed it to the dataview query in the “Quotes” section? It would automate the only step I still do manually.

Once again, many thanks for your help!


Type: Persona
Name:
Surname:
Middle_Name:
Job_Title:
Department:
Organisation:
DOB:
BirthPlace:
Type_of_contact:
Phone(main):
Phone(work):
Messenger:
e-mail(private):
e-mail(office):
URL:
Twitter:
Address(work):
Address(home):
Connected_people:
Connected_organisation:
Connected_places:
tags:
Note:

Biography

| 300 | |
| :--- | ---- |

Log

TABLE WITHOUT ID file.link as "Meeting/Talk", dateformat(file.day, "MMM d, yyyy") as "Date"
FROM [[#]]
WHERE EventOwner=[[me]]
SORT Date Desc

Mentions

TABLE WITHOUT ID file.link as "Article", dateformat(file.day, "MMM d, yyyy") as "Date"
FROM [[#]]
WHERE file.name != this.file.name
SORT Date Desc

Quotes

TABLE WITHOUT ID Q, link(file.link) as "Note", dateformat(file.day, "MMM d, yyyy") as "Date"
FROM [[#]]
WHERE Q and file.name != this.file.name
SORT Date Desc

Thank you for your idea. It works well, but I’m looking for a more elegant solution. The query shows not only the text I want but the `<mark>` markup as well. It doesn’t look good on display, and even less so in a PDF.
Anyway, thank you very much! I greatly appreciate your help.