Pandoc Lua Filter to pass image size


I would like to share this Lua filter I made to carry the Obsidian image resize syntax (size after a `|` in the image caption) over to Pandoc.
It should be robust in all(?) cases:

  • space and no space between separator and size
  • space and no space between width and height
  • if only width is passed
    Of course, the size values themselves have to be valid.
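
To make the cases above concrete, here is a minimal string-level sketch of how a size specification could be parsed. It is not the filter itself: `parse_size` is a hypothetical helper, and it assumes sizes are plain integers (pixels), as in Obsidian.

```lua
-- Hypothetical helper, not part of the filter: parse a size specification
-- such as "100", "100x200" or " 100 x 200".
-- Assumption: sizes are plain integers (pixels).
local function parse_size(spec)
  -- Case 1: width and height, with optional spaces around the "x".
  local width, height = string.match(spec, "^%s*(%d+)%s*x%s*(%d+)%s*$")
  if width then
    return width, height
  end
  -- Case 2: width only.
  width = string.match(spec, "^%s*(%d+)%s*$")
  return width -- nil when the specification is not a valid size
end

print(parse_size("100x200"))    -- width "100", height "200"
print(parse_size(" 100 x 200")) -- width "100", height "200"
print(parse_size("100"))        -- width "100", no height
```

Trying the stricter two-value pattern first keeps the width-only case from swallowing a malformed height.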

I am not an expert in Lua, so there might be room for improvement. If you see any holes in my code, suggestions and ideas are welcome!

Enjoy :)

pandoc.utils = require 'pandoc.utils'

function Image (img)
  local size_sep = {} -- Init size table (for width x height)
  local index, last_word -- Position of the separator and the text before it
  for i, v in ipairs(img.caption) do -- Loop on the caption table
    local caption_string = pandoc.utils.stringify(v) -- stringify every table item
    if string.find(caption_string, "|") then
      index = i
      last_word = string.match(caption_string, "(.*)|") -- Last word before | if no space
      local size = string.match(caption_string, "|(.*)") -- We store the size
      if size == '' then -- There is a space between | and size, we look further ahead
        local next_item = img.caption[index + 2] -- Skip the Space element after |
        if next_item then
          for w in string.gmatch(pandoc.utils.stringify(next_item), "([^x]+)") do
            table.insert(size_sep, w)
          end
        end
        if size_sep[2] == nil then -- There might be a space between width and height
          size_sep[2] = pandoc.utils.stringify(img.caption[#img.caption])
          -- If height is not specified then size_sep[2] = size_sep[1]
        end
      else -- There is no space, we can split directly
        for w in string.gmatch(size, "([^x]+)") do
          table.insert(size_sep, w)
        end
      end
      break -- Stop at the first separator
    end
  end
  if index == nil then -- No separator in the caption: leave the image untouched
    return nil
  end
  for i = #img.caption, index, -1 do -- Iterate from end to avoid out of bounds error by successive deletes
    img.caption:remove(i) -- Remove all the caption at and after the separator
  end
  img.caption[index] = pandoc.Str(last_word) -- Put back the last word that was removed if there was no space
  img.attributes.width = size_sep[1]
  -- Set the height only when it was given and differs from the width fallback
  if size_sep[2] ~= nil and size_sep[2] ~= size_sep[1] then
    img.attributes.height = size_sep[2]
  end
  return img
end
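
For comparison, the caption/size split that the filter above performs element by element can also be sketched on a plain string. This is only an illustration under assumptions: `split_alt` is a hypothetical helper, and in a filter its input would have to come from something like `pandoc.utils.stringify(img.caption)`, which would lose inline formatting (emphasis, links) in the caption.

```lua
-- Hypothetical helper: split alt text into caption and size specification
-- at the last "|". Returns the caption and the size text (or nil).
local function split_alt(alt)
  local caption, spec = string.match(alt, "^(.*)|(.*)$")
  if not caption then
    return alt, nil -- no separator: nothing to strip
  end
  -- Trim surrounding whitespace from both parts.
  caption = string.match(caption, "^%s*(.-)%s*$")
  spec = string.match(spec, "^%s*(.-)%s*$")
  return caption, spec
end

print(split_alt("My caption | 100 x 200")) -- caption "My caption", size "100 x 200"
print(split_alt("My caption|100"))         -- caption "My caption", size "100"
```

Because the first capture is greedy, the split happens at the last `|`, so a caption that itself contains a `|` keeps everything before the final one.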

Hi, thanks for this, it looks like exactly what I need!
Everything is working fine except that the image caption keeps the "| size" part.
I have done tons of tests without any success. Is your filter still working with the latest Pandoc 3.1.1? I am starting to wonder whether there is a bug with image captions.