Basic OCR in Obsidian

pmbauer · May 12, 2021, 4:10pm

UPDATE: 2022-08-02

The instructions no longer work out-of-the-box with the current version of Templater. Templater 1.7.x seems to work, but the latest 1.12.x appears to have broken the line containing tp.user.ocr. I don’t have time to troubleshoot at the moment. Sorry.

OCR Templater Script for Obsidian

Optical Character Recognition (OCR) Templater Script for Obsidian.

Video Demonstration

Installation

Pre-requisites

Obsidian
Templater plugin.

Install the Tesseract OCR engine.

Prebuilt Binaries for Linux and Windows are available. On Mac: brew install tesseract.

Create Templater System Command for Tesseract

Open Settings > Templater
Turn on Enable System Commands
Add a User Function named ocr

/usr/local/bin/tesseract "$ocr_input" -

NOTE: I recommend increasing the Templater timeout to 10s or more, as the OCR may take a moment.

Add OCR.md template

Add the OCR.md template to your Templater templates folder.

---
creation date: <% tp.file.creation_date() %>
tags: [OCR]
---

<%*
const supportedFileTypes = ["jpeg", "jpg", "png"];
const images = this.app.vault.getFiles().filter((item) => supportedFileTypes.indexOf(item.extension) >= 0)
const target = await tp.system.suggester((item) => item.path, images, true);
const out = await tp.user.ocr({ocr_input: target.path});
%><%* tR += out %>
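
The image-filtering step of the template can be tried outside Obsidian. This is a minimal sketch, assuming plain objects with path and extension fields stand in for the vault's file objects (the sample files are made up). It also lowercases the extension before comparing, which the template above does not do.

```javascript
// Extensions the template hands to Tesseract.
const supportedFileTypes = ["jpeg", "jpg", "png"];

// Keep only files whose extension is in the supported list.
// Lowercasing is an addition for illustration; the original template compares as-is.
function filterImages(files) {
  return files.filter((item) =>
    supportedFileTypes.includes(item.extension.toLowerCase())
  );
}

// Hypothetical stand-ins for the objects returned by this.app.vault.getFiles().
const sampleFiles = [
  { path: "notes/todo.md", extension: "md" },
  { path: "scans/receipt.png", extension: "png" },
  { path: "scans/page1.JPG", extension: "JPG" },
];

console.log(filterImages(sampleFiles).map((f) => f.path));
```

Whether the case-insensitive comparison is wanted depends on how the images in your vault are named.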

Usage

Invoke the "Templater: Create new note from template" command from the Command Palette
Choose OCR
Choose an image file in your vault. This may take a few seconds.

GreenChocho · May 12, 2021, 4:37pm

Could you please guide me on what Templater command I should use on Windows?
I tried: 'C:\Program Files\Tesseract-OCR\tesseract.exe' "$ocr_input" - but the console says the syntax is incorrect.

pmbauer · May 12, 2021, 4:40pm

Hey, sorry. I don’t have Windows, so I’m not entirely sure what to put there. If you have access to Cygwin or the Windows Subsystem for Linux, you might be able to get that working by installing tesseract in Cygwin or WSL and setting your Templater system shell to bash, but I’m not able to test that out.

pmbauer · May 12, 2021, 4:42pm

My guess is that the default shell on Windows is cmd.exe, so you might try 'C:\Program Files\Tesseract-OCR\tesseract.exe' "%ocr_input%" … You will likely have to play with tesseract at the command line to get the right command.

GreenChocho · May 12, 2021, 4:44pm

Not correct yet, but I think googling cmd commands will help. Thank you so much!

pmbauer · May 12, 2021, 4:45pm

sweet; if you get it working, let me know and I’ll update the OP. Again, so sorry

GreenChocho · May 12, 2021, 4:49pm

Will do.
No apology needed at all. Thank you for your help!

GreenChocho · May 12, 2021, 5:57pm

OK, I was able to make some progress with: powershell 'C:\Program Files\Tesseract-OCR\tesseract.exe' '%ocr_input%'
I think this should work for the user function.

The current problem, I think, is that ocr_input gives the Obsidian path instead of the system path.

pmbauer · May 12, 2021, 8:13pm

Thanks. That’s interesting. The relative Obsidian path works okay on my Mac, but it must not work on Windows. You could change your system command to include the path to your Obsidian vault, e.g. powershell 'C:\Program Files\Tesseract-OCR\tesseract.exe' 'C:\Users\...\???\Documents\notes\%ocr_input%' … (guessing at a workaround here)

GreenChocho · May 13, 2021, 8:21am

Yep, it does work!
The console shows the correct path.
However, Obsidian separates folders in a path with / while Windows separates them with \.
Is there a way to replace the forward slashes with backslashes in ocr_input?

pmbauer · May 13, 2021, 4:33pm

You could try modifying the OCR template to look like this (notice the replaceAll addition):

---
creation date: <% tp.file.creation_date() %>
tags: [OCR]
---

<%*
const supportedFileTypes = ["jpeg", "jpg", "png"];
const images = this.app.vault.getFiles().filter((item) => supportedFileTypes.indexOf(item.extension) >= 0)
const target = await tp.system.suggester((item) => item.path, images, true);
const out = await tp.user.ocr({ocr_input: target.path.replaceAll("\/", "\\")});
%><%* tR += out %>
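
The two workarounds discussed so far (prefixing the vault folder and swapping the slashes) can also be combined in one place. This is a sketch only: toWindowsPath is an illustrative name, not part of Templater, and the vault root used here is a made-up example.

```javascript
// Join a vault root and a vault-relative path into a Windows-style path.
// Both arguments are plain strings; nothing here touches the file system.
function toWindowsPath(vaultRoot, vaultRelativePath) {
  // Drop any trailing slash or backslash from the root.
  const root = vaultRoot.replace(/[\\/]+$/, "");
  // Obsidian uses "/" inside the vault; Windows expects "\".
  const rel = vaultRelativePath.replaceAll("/", "\\");
  return root + "\\" + rel;
}

// Hypothetical vault location and image path.
console.log(toWindowsPath("C:\\Vault\\", "scans/2021/receipt.png"));
// prints C:\Vault\scans\2021\receipt.png
```

With a helper like this in the template, the system command would only need the bare %ocr_input% rather than a hard-coded folder prefix.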

vvk · May 15, 2021, 2:00pm

Thank you so much for the template but I can’t seem to get it working. I’m getting the following error:

Template parsing error, aborting. Eta Error: Bad template syntax

Invalid or unexpected token
===========================
var tR='',__l,__lP,include=E.include.bind(E),includeFile=E.includeFile.bind(E)
function layout(p,d){__l=p;__lP=d}
let _prs = [];
_prs.push(tp.file.creation_date());
_prs.push(\*
const supportedFileTypes = ["jpeg", "jpg", "png"];
const images = this.app.vault.getFiles().filter((item) => supportedFileTypes.indexOf(item.extension) >= 0)
const target = await tp.system.suggester((item) => item.path, images, true);
const out = await tp.user.ocr({ocr\_input: target.path}););
_prs.push(\* tR += out);
let _rst = await Promise.all(_prs);
tR+='--- \ncreation date: '
tR+=_rst[0]
tR+=' \ntags: [OCR] \n--- \n'
tR+=_rst[1]
tR+=_rst[2]
if(__l)tR=await includeFile(__l,Object.assign(tp,{body:tR},__lP))
if(cb){cb(null,tR)} return tR

    at EtaErr (eval at <anonymous> (app://obsidian.md/app.js:1:1), <anonymous>:288:15)
    at compile (eval at <anonymous> (app://obsidian.md/app.js:1:1), <anonymous>:851:19)
    at handleCache (eval at <anonymous> (app://obsidian.md/app.js:1:1), <anonymous>:1094:68)
    at eval (eval at <anonymous> (app://obsidian.md/app.js:1:1), <anonymous>:1140:33)
    at new Promise (<anonymous>)
    at render (eval at <anonymous> (app://obsidian.md/app.js:1:1), <anonymous>:1138:24)
    at renderAsync (eval at <anonymous> (app://obsidian.md/app.js:1:1), <anonymous>:1172:12)
    at TemplateParser.eval (eval at <anonymous> (app://obsidian.md/app.js:1:1), <anonymous>:1787:34)
    at Generator.next (<anonymous>)
    at eval (eval at <anonymous> (app://obsidian.md/app.js:1:1), <anonymous>:52:71)

Any idea how I can fix it?

pmbauer · May 15, 2021, 8:15pm

Not really. There is a lot of missing context: what platform you are on, whether you have verified there aren’t any copy-paste errors, etc. You can also try commenting out parts of the template to see where the error comes from on your setup.
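
One possible copy-paste problem, judging only by the \* and ocr\_input visible in the dump above, is Markdown backslash escapes ending up inside the template. That is a guess, not a confirmed diagnosis. A small sketch that flags such lines in pasted template text (findEscapedChars is a hypothetical helper, and the pasted sample is reconstructed from the dump for illustration):

```javascript
// Report the 1-based line numbers that contain a backslash-escaped * or _,
// which should not normally appear in this template.
function findEscapedChars(templateText) {
  return templateText
    .split("\n")
    .map((text, index) => ({ line: index + 1, text }))
    .filter(({ text }) => /\\[*_]/.test(text))
    .map(({ line }) => line);
}

// Sample resembling what the error dump suggests was pasted.
const pasted = [
  "<%\\*",
  'const supportedFileTypes = ["jpeg", "jpg", "png"];',
  "const out = await tp.user.ocr({ocr\\_input: target.path});",
  "%><%\\* tR += out %>",
].join("\n");

console.log(findEscapedChars(pasted));
```

If any lines are reported, removing the stray backslashes so the tags read <%* again would be the first thing to try.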

drgraham · June 19, 2021, 1:25am

I’ve managed to get this working on a Windows 10 machine.

Added tesseract to the Path environment variable and confirmed tesseract works at the command prompt.
Created a user function in Templater called ocr_win: powershell tesseract "%ocr_input%" -
Used a template like this:

---
creation date: <% tp.file.creation_date() %>
tags: [OCR]
---

<%*
const supportedFileTypes = ["jpeg", "jpg", "png"];
const images = this.app.vault.getFiles().filter((item) => supportedFileTypes.indexOf(item.extension) >= 0)
const target = await tp.system.suggester((item) => item.path, images, true);
const out = await tp.user.ocr_win({ocr_input: target.path.replace('/', '\\')});
%><%* tR += out %>
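
One caveat on the slash conversion, offered as a general JavaScript point rather than anything reported in this thread: replace with a string pattern only swaps the first match, so an image in a nested folder would keep its later forward slashes. A short comparison using a made-up path:

```javascript
// Hypothetical vault-relative path with two folder levels.
const nested = "scans/2021/receipt.png";

// A string pattern with replace() changes only the first "/".
const firstOnly = nested.replace("/", "\\");

// replaceAll() changes every "/".
const everySlash = nested.replaceAll("/", "\\");

console.log(firstOnly);  // prints scans\2021/receipt.png
console.log(everySlash); // prints scans\2021\receipt.png
```

For images directly inside a single top-level folder the two calls give the same result, which may be why the single replace worked here.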

fred1357944 · June 25, 2021, 2:22am

Can it recognize Chinese?
I tried, and I don’t think it can.

Or does it need other settings?

pmbauer · June 25, 2021, 3:50am

For that you should check the Tesseract User Manual (tessdoc).

fred1357944 · June 25, 2021, 3:50am

Thanks man

pmbauer · June 25, 2021, 3:57am

Supposedly there is support for it, but if it’s not working out of the box, I’m not sure what the next step is. I believe there is a way to use custom training data or data files, going by this SO post, but it looks a little dated. @fred1357944
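
For reference, Tesseract picks its recognition language with the -l option, and tesseract --list-langs prints the languages that are installed. The sketch below shows how the user function's command string could carry a language. It assumes the Simplified Chinese data (chi_sim) is installed, and buildOcrCommand is a hypothetical helper for illustration, not something Templater provides.

```javascript
// Build the command string for a Templater user function.
// executable: path to tesseract; inputVariable: the shell variable Templater fills in;
// lang: optional Tesseract language code such as "chi_sim" or "chi_tra".
function buildOcrCommand(executable, inputVariable, lang) {
  // "-" as the output base tells tesseract to write the text to stdout.
  const parts = [`"${executable}"`, `"${inputVariable}"`, "-"];
  if (lang) {
    parts.push("-l", lang);
  }
  return parts.join(" ");
}

console.log(buildOcrCommand("/usr/local/bin/tesseract", "$ocr_input", "chi_sim"));
// prints "/usr/local/bin/tesseract" "$ocr_input" - -l chi_sim
```

The printed string is what would go into the user function field in place of the original command. I have not tested it with Chinese text.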

fred1357944 · July 5, 2021, 6:39am

Thank you so much.

joschmit · July 8, 2021, 11:28am

I have been trying hard to get this running, but the problem is that it somehow fails to recognize tesseract as a command in PowerShell. If I run it manually, it works.

I added

C:\Program Files\Tesseract-OCR\

to the Path variable.

I copied your template, but I still get an error message. I’m getting frustrated right now; would you be able to expand a bit on what exactly you did?
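
One thing that may be worth ruling out, as a general point about environment variables rather than a confirmed diagnosis for this case: a process only sees the Path it inherited when it was launched, so a change made while Obsidian is running may not be visible to it until Obsidian is restarted. Below is a sketch for checking whether a folder appears in a Windows-style Path string. pathContainsDir is a hypothetical helper and the sample value is made up.

```javascript
// Return true if dir is one of the ";"-separated entries in pathValue.
// Comparison ignores case and trailing separators, as Windows does.
function pathContainsDir(pathValue, dir) {
  const normalize = (p) => p.trim().replace(/[\\/]+$/, "").toLowerCase();
  const wanted = normalize(dir);
  return pathValue
    .split(";")
    .filter((entry) => entry.trim() !== "")
    .some((entry) => normalize(entry) === wanted);
}

// Made-up Path value for illustration.
const samplePath = "C:\\Windows\\system32;C:\\Program Files\\Tesseract-OCR;";

console.log(pathContainsDir(samplePath, "C:\\Program Files\\Tesseract-OCR\\"));
```

Run from Obsidian's developer console with process.env.PATH in place of samplePath, this would show what the running app actually sees, assuming process is available there.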