Converting HTML tables to Obsidian Table with Pandoc?

Hello,

I am trying to convert HTM files with tables into markdown documents that work in Obsidian using Pandoc.

I am using the following Pandoc command:

pandoc -f html -t markdown testfile.html -o testfile.md

I get a converted table, but its markdown format does not seem to be the one that works in Obsidian. The table from pandoc looks like this:

+-----------------------+-----------------------+-----------------------+
| ## Balance Sheet 202  |                       |                       |
| 1-12-31..2022-12-31,  |                       |                       |
| valued at period ends |                       |                       |
+=======================+=======================+=======================+
|                       | 2021-12-31            | 2022-12-31            |
+-----------------------+-----------------------+-----------------------+
|                       |                       |                       |
+-----------------------+-----------------------+-----------------------+

and it does not work in Obsidian.

Any clue why?

We can’t repro since you didn’t share the input file.

Obsidian uses CommonMark and supports GitHub Flavored Markdown (GFM) style tables. The CommonMark spec doesn't define a table syntax, but many implementations, including GFM and Obsidian, support an extension for it.
In pandoc, the markdown format means pandoc's own version of Markdown, which is a different dialect. Try either of the following and you should get tables that Obsidian can consume:

pandoc -f html -t gfm testfile.html -o testfile.md
pandoc -f html -t markdown_mmd testfile.html -o testfile.md

See the Options section of the pandoc man page for a full list of output targets.

Very interesting. Does this make a difference in other conversions as well? I am trying to convert my EPUB books into Obsidian vaults and want to get the best results.

I would expect so, yes.

I tried your examples, and also most of the Pandoc output formats vaguely related to markdown, and none seems to produce an Obsidian-compatible table format.
I did not know before that markdown was so poorly standardized :face_with_raised_eyebrow:

I did try my example on some sample html containing a table and it did work on that input. You’ll have to share your sample input (as indicated above) if you want further help troubleshooting why you aren’t getting the expected result. Cheers.

As @pmbauer said, you should provide your sample input. I tried to convert an example HTML table with an online version of pandoc, and both the gfm and markdown_mmd output formats worked. See Try pandoc! to view the example.

The content of the input file is sensitive. I can’t share it.

But I found another method: exporting to CSV and using Pandoc to convert the CSV to markdown_mmd tables. It works.

Thanks for the help!
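
For anyone landing here later, this is roughly what the CSV route ends up producing. It is only a minimal Python sketch of a CSV to pipe-table conversion, not what pandoc does internally; the function name csv_to_pipe_table and the sample row are made up for illustration.

```python
import csv
import io


def csv_to_pipe_table(csv_text: str) -> str:
    """Convert CSV text to a GFM-style pipe table; the first row is the header."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return ""
    width = max(len(row) for row in rows)

    def fmt(row):
        # Escape literal pipes so they don't split cells, then pad short rows.
        cells = [cell.strip().replace("|", "\\|") for cell in row]
        cells += [""] * (width - len(cells))
        return "| " + " | ".join(cells) + " |"

    header, *body = rows
    lines = [fmt(header), "| " + " | ".join(["---"] * width) + " |"]
    lines += [fmt(row) for row in body]
    return "\n".join(lines)


# Made-up sample data, only to show the shape of the output.
sample = "Item,2021-12-31,2022-12-31\nAssets,100,120\n"
print(csv_to_pipe_table(sample))
```

The header row followed by a row of dashes is the part Obsidian needs in order to recognise the block as a table.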

@Yannick Pandoc supports several table syntaxes. Unless you specify which syntax to use (or not to use) via extensions (e.g. -t markdown-grid_tables), pandoc will choose the syntax based on the table content. Since some table cells in your example contain block-level elements (a heading), pandoc uses grid table syntax in this particular case.
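
If the file can't be shared, one way to check for this locally is to scan the HTML for block-level tags inside table cells. Below is a rough Python sketch using only the standard library. The BLOCK_TAGS list is my assumption and is not pandoc's actual decision logic, the names are hypothetical, and it assumes every td/th is explicitly closed.

```python
from html.parser import HTMLParser

# Assumption: tags treated here as block-level content inside a cell.
BLOCK_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6", "ul", "ol", "blockquote", "pre"}


class CellBlockFinder(HTMLParser):
    """Collect block-level tags that open inside <td> or <th> elements."""

    def __init__(self):
        super().__init__()
        self.cell_depth = 0  # > 0 while the parser is inside a table cell
        self.found = []

    def handle_starttag(self, tag, attrs):
        if tag in ("td", "th"):
            self.cell_depth += 1
        elif self.cell_depth and tag in BLOCK_TAGS:
            self.found.append(tag)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self.cell_depth:
            self.cell_depth -= 1


def block_tags_in_cells(html_text):
    """Return the block-level tags found inside table cells, in document order."""
    finder = CellBlockFinder()
    finder.feed(html_text)
    finder.close()
    return finder.found


# Made-up input resembling the table in the question.
html_doc = "<table><tr><td><h2>Balance Sheet</h2></td><td>2021-12-31</td></tr></table>"
print(block_tags_in_cells(html_doc))
```

A non-empty result suggests moving that content out of the cells (for example, putting the heading above the table) before converting.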

You can always redact the cell contents and create a minimal repro. Given well-formed input and the right switches, pandoc does generate tables in a form that Obsidian can read. Glad you have a workaround.
