sweet; if you get it working, let me know and I’ll update the OP. Again, so sorry :bowing_man:

Will do.
No apology needed at all. Thank you for your help!

OK, I was able to make some progress with: powershell 'C:\Program Files\Tesseract-OCR\tesseract.exe' '%ocr_input%'
I think this should work for the user function.

The problem now, I think, is that ocr_input gives the Obsidian vault-relative path instead of the full system path.

Thanks. That’s interesting. The relative Obsidian path works okay on my Mac, but apparently it doesn’t on Windows. You could change your system command to include the path to your Obsidian vault, e.g. powershell 'C:\Program Files\Tesseract-OCR\tesseract.exe' 'C:\Users\...\???\Documents\notes\%ocr_input%' … (guessing at a workaround here)
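
If you would rather not hand-edit slashes, Node's `path.win32.join` joins the pieces and normalizes the separators to backslashes. A minimal sketch in plain Node, assuming a made-up vault location (`C:\Users\me\Documents\notes` is hypothetical; substitute your own):

```javascript
const path = require("path");

// Hypothetical vault root, for illustration only.
const VAULT_ROOT = "C:\\Users\\me\\Documents\\notes";

// Turn a vault-relative Obsidian path (forward slashes) into an
// absolute Windows path (backslashes).
function toWindowsPath(vaultRoot, vaultRelativePath) {
  return path.win32.join(vaultRoot, vaultRelativePath);
}

console.log(toWindowsPath(VAULT_ROOT, "attachments/scans/page1.png"));
```

The `win32` variant is used on purpose so the result is the same no matter which OS runs the script.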


Yep, it does work!
The console shows the correct path.
However, Obsidian links separate folders with / while Windows separates folders with \.
Is there a way to replace the forward slashes with backslashes in ocr_input?

You could try modifying the OCR template to look like this (notice the replaceAll addition):

---
creation date: <% tp.file.creation_date() %>
tags: [OCR]
---

<%*
const supportedFileTypes = ["jpeg", "jpg", "png"];
const images = this.app.vault.getFiles().filter((item) => supportedFileTypes.indexOf(item.extension) >= 0)
const target = await tp.system.suggester((item) => item.path, images, true);
const out = await tp.user.ocr({ocr_input: target.path.replaceAll("\/", "\\")});
%><%* tR += out %>

Thank you so much for the template but I can’t seem to get it working. I’m getting the following error:

Template parsing error, aborting. Eta Error: Bad template syntax

Invalid or unexpected token
===========================
var tR='',__l,__lP,include=E.include.bind(E),includeFile=E.includeFile.bind(E)
function layout(p,d){__l=p;__lP=d}
let _prs = [];
_prs.push(tp.file.creation_date());
_prs.push(\*
const supportedFileTypes = ["jpeg", "jpg", "png"];
const images = this.app.vault.getFiles().filter((item) => supportedFileTypes.indexOf(item.extension) >= 0)
const target = await tp.system.suggester((item) => item.path, images, true);
const out = await tp.user.ocr({ocr\_input: target.path}););
_prs.push(\* tR += out);
let _rst = await Promise.all(_prs);
tR+='--- \ncreation date: '
tR+=_rst[0]
tR+=' \ntags: [OCR] \n--- \n'
tR+=_rst[1]
tR+=_rst[2]
if(__l)tR=await includeFile(__l,Object.assign(tp,{body:tR},__lP))
if(cb){cb(null,tR)} return tR

    at EtaErr (eval at <anonymous> (app://obsidian.md/app.js:1:1), <anonymous>:288:15)
    at compile (eval at <anonymous> (app://obsidian.md/app.js:1:1), <anonymous>:851:19)
    at handleCache (eval at <anonymous> (app://obsidian.md/app.js:1:1), <anonymous>:1094:68)
    at eval (eval at <anonymous> (app://obsidian.md/app.js:1:1), <anonymous>:1140:33)
    at new Promise (<anonymous>)
    at render (eval at <anonymous> (app://obsidian.md/app.js:1:1), <anonymous>:1138:24)
    at renderAsync (eval at <anonymous> (app://obsidian.md/app.js:1:1), <anonymous>:1172:12)
    at TemplateParser.eval (eval at <anonymous> (app://obsidian.md/app.js:1:1), <anonymous>:1787:34)
    at Generator.next (<anonymous>)
    at eval (eval at <anonymous> (app://obsidian.md/app.js:1:1), <anonymous>:52:71)

Any idea how I can fix it?

Not really. There is a lot of missing context: what platform you are on, whether you have verified there aren’t any copy-paste errors, etc. You can also try commenting out parts of the template to see where the error comes from on your setup.
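
On the copy-paste point: the compiled output in the error shows `\*` and `ocr\_input`, which may mean backslash escapes from the forum's Markdown came along with the paste. A rough sketch for spotting them in a template's text (the function name is made up for illustration, and it only looks for a backslash directly before `*` or `_`):

```javascript
// Hypothetical helper: report every "\*" or "\_" in a template, with its line number.
function findStrayEscapes(templateText) {
  const hits = [];
  templateText.split("\n").forEach((line, index) => {
    const pattern = /\\[*_]/g; // a backslash followed by * or _
    let match;
    while ((match = pattern.exec(line)) !== null) {
      hits.push({ line: index + 1, text: match[0] });
    }
  });
  return hits;
}

// Example input mimicking the pasted template from the error message.
const pasted = "<%\\*\nconst out = await tp.user.ocr({ocr\\_input: target.path});";
console.log(findStrayEscapes(pasted));
```

If it reports anything, deleting those backslashes by hand is probably the first thing to try.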

I’ve managed to get this working on a Windows 10 machine.

  1. Added tesseract to the Path environment variable and confirmed tesseract works at the command prompt.
  2. Created a user function in Templater called ocr_win : powershell tesseract "%ocr_input%" -
  3. Used a template like this:
---
creation date: <% tp.file.creation_date() %>
tags: [OCR]
---

<%*
const supportedFileTypes = ["jpeg", "jpg", "png"];
const images = this.app.vault.getFiles().filter((item) => supportedFileTypes.indexOf(item.extension) >= 0)
const target = await tp.system.suggester((item) => item.path, images, true);
const out = await tp.user.ocr_win({ocr_input: target.path.replace(/\//g, '\\')});
%><%* tR += out %>
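
Worth knowing for the slash conversion: in JavaScript, `replace` with a plain string pattern changes only the first match, so images nested more than one folder deep need a global regex (or `replaceAll`). A small sketch with a made-up path:

```javascript
// Hypothetical vault-relative path with more than one folder level.
const vaultPath = "attachments/scans/page1.png";

// String pattern: only the first "/" is replaced.
const firstOnly = vaultPath.replace("/", "\\");

// Global regex: every "/" is replaced.
const allSlashes = vaultPath.replace(/\//g, "\\");

console.log(firstOnly);
console.log(allSlashes);
```

For an image sitting directly in a top-level folder the two give the same result, which is probably why the difference is easy to miss.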

Can it recognize Chinese?
I tried, and I don’t think it can.

Or does it need other settings?

For that, you should check the Tesseract User Manual (tessdoc).

Thanks man

Supposedly there is support for it, but if it’s not working out of the box, I’m not sure what the next step is. I believe there is a way to use custom training data or data files, per this SO post, but it looks a little dated. @fred1357944
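
For what it's worth, Tesseract picks its language with the `-l` option, and `chi_sim` / `chi_tra` are the codes for Simplified / Traditional Chinese; `tesseract --list-langs` shows what is installed. A sketch of how the user function command might be assembled, assuming the matching traineddata file is already installed (the helper name is made up for illustration):

```javascript
// Hypothetical helper: builds the command string for a Templater user function.
// inputPlaceholder is the variable reference, e.g. "%ocr_input%" on Windows
// or "$ocr_input" on macOS/Linux.
function buildOcrCommand(inputPlaceholder, lang) {
  const base = `tesseract "${inputPlaceholder}" -`;
  // Append the language option only when a language code is given.
  return lang ? `${base} -l ${lang}` : base;
}

console.log(buildOcrCommand("%ocr_input%", "chi_sim"));
console.log(buildOcrCommand("$ocr_input"));
```

The resulting string is what would go into the user function's command field, with whatever prefix (such as powershell) your setup needs.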

Thank you so much.

I have been trying hard to get this running, but the problem is that it somehow fails to recognize tesseract as a command when called through powershell. If I run it manually, I can get it to work.

I added

C:\Program Files\Tesseract-OCR\

to the Path variable.

I copied your template and still get an error message. I’m getting frustrated right now; would you be able to expand a bit on what exactly you did?

What exact error does it give?

The environment variables are a pain in the butt, too. On the Windows machine I tested with, there were two Path variables. I added it to both…

(I had trouble initially because of the way Windows forms paths, with backslashes instead of forward slashes. Do you have the target.path.replace pattern correct in the template?)

Hiya, I’m not a techie but I’ve been trying to get this to work on Windows 10. When I tried to run the term tesseract in powershell, it gave me an error that looked something like this (btw I’m aware of the misspelling - it’s just a representation).

Seemingly powershell had a similar error to Obsidian(?) So I tried to figure out how to run it there first.

Afterwards, I found this Medium article which explains how to add System Variables. It looks like it’s working in powershell for me now! But not in Obsidian (the red image is the error I am still getting).

Anyway, not sure if this was the obvious thing to do, just thought I’d throw in what little I found in case it helps someone :woman_shrugging:. Would be really cool to get it to work in the future.

@joschmit, seems like it could be what you are having issue with as well.


This is weird. I successfully installed & used this on one machine (running Xubuntu 18.04). I synced everything (including plugins & settings) to my laptop running Mint 20, & when I try to create a new note from the OCR template I get the following error in the Obsidian console:


main.ts:213 Error with User Template ocr Error: Command failed: /usr/bin/tesseract “$ocr_input” -
/bin/sh: line 1: /usr/bin/tesseract: No such file or directory

at ChildProcess.exithandler (child_process.js:317)
at ChildProcess.emit (events.js:315)
at maybeClose (internal/child_process.js:1048)
at Socket.<anonymous> (internal/child_process.js:439)
at Socket.emit (events.js:315)
at Pipe.<anonymous> (net.js:673)

I’ve verified, by copying the tesseract path & pasting it into a terminal, that /usr/bin/tesseract is correct. Any suggestions about why it can’t be found?

For anyone here, I would love to get a real plugin created that’s more usable: Searchable OCR

I ended up needing to set the User Function path to

/opt/local/bin/tesseract "$ocr_input" -

If you’re not sure where your tesseract got installed, you can run

type -a tesseract

and the terminal will return the path.
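
If you want the same lookup from a script, here is a rough Node sketch that walks the PATH directories and returns every match, similar in spirit to `type -a` (the function name is made up, and it only checks files on disk, not shell aliases or functions):

```javascript
const fs = require("fs");
const path = require("path");

// Hypothetical helper: list every executable file called `name`
// found in the directories of a PATH-style string.
function findOnPath(name, pathVar = process.env.PATH || "") {
  return pathVar
    .split(path.delimiter)
    .filter(Boolean) // skip empty PATH entries
    .map((dir) => path.join(dir, name))
    .filter((candidate) => {
      try {
        fs.accessSync(candidate, fs.constants.X_OK);
        return fs.statSync(candidate).isFile();
      } catch {
        return false; // missing, or not executable
      }
    });
}

console.log(findOnPath("tesseract"));
```

Keep in mind that the PATH a script sees can differ from the one in your terminal, which is one reason a full path in the user function tends to be more reliable.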

The OCR works pretty well with printed material but is completely unusable for handwritten notes, IMO.