Have Pandoc Recognize Your Literature Links as Citations!

I use the Citations plugin to manage citations.

Within Obsidian, I really would prefer to have links to the literature note rather than citations:

(1) [[@darwin-1856-origin]]

But, of course, for Pandoc to render this citation properly this should be:

(2) [@darwin-1856-origin]

Pandoc does recognize the first form as a citation, but it wraps the final rendering in the extra pair of brackets.

The following filters the token stream to strip the extra brackets from the citation.

#! /usr/bin/env python
# -*- coding: utf-8 -*-

from pandocfilters import toJSONFilter, Str, Plain, Para
import sys
import re

def replace_citation_links_with_citation(key, value, format, meta):
    # sys.stderr.write(f"=== {key} ===>\n")
    # sys.stderr.write(f"{value}\n\n")
    elements = []
    if key == "Plain" or key == "Para":
        for item_idx, item in enumerate(value):
            if item["t"] == "Str" and item["c"] == "[":
                if item_idx < len(value) - 1:
                    if value[item_idx + 1]["t"] == "Cite":
                        continue
            elif item["t"] == "Str" and item["c"] == "]":
                if item_idx > 0:
                    if value[item_idx - 1]["t"] == "Cite":
                        continue
            elements.append(item)
        if key == "Plain":
            return Plain(elements)
        elif key == "Para":
            return Para(elements)


if __name__ == "__main__":
    toJSONFilter(replace_citation_links_with_citation)

If saved to a file, e.g. ~/.local/share/pandoc-filters/obisidian-citation-links.py, and invoked when running pandoc by --filter ~/.local/share/pandoc-filters/obisidian-citation-links.py, then something like this:

The best cookies in the world have cardamom in them [[@darwin-1856-origin]] .

will look like the following to Pandoc.

The best cookies in the world have cardamom in them [@darwin-1856-origin] .

So you can now have your citations and link them too.
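For comparison, here is a minimal sketch of a different technique: rewriting the Markdown text before it reaches Pandoc, instead of filtering the token stream. This is not the filter above. The function name and the regular expression are my own assumptions, and the pattern only handles the plain `[[@key]]` form with no path or alias. It would also rewrite matches inside code blocks, which the filter approach avoids.

```python
import re

# Matches [[@citekey]] and captures the key together with its leading "@".
# Links with a path or an alias (e.g. [[notes/@key|label]]) still match only
# if they contain no "|" or nested brackets, so aliased links are left alone.
WIKILINK_CITATION = re.compile(r"\[\[(@[^\[\]|]+)\]\]")


def wikilink_citations_to_pandoc(text: str) -> str:
    """Turn [[@key]] into [@key]; everything else is returned unchanged."""
    return WIKILINK_CITATION.sub(r"[\1]", text)


if __name__ == "__main__":
    line = "The best cookies in the world have cardamom in them [[@darwin-1856-origin]] ."
    print(wikilink_citations_to_pandoc(line))
```

Because this works on raw text, punctuation directly next to the brackets does not matter to it.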

Thanks for posting this!

This is not working for me. Do I need to install any python packages to make it work?

In VS Code, it says:

[{
	"resource": "/C:/Users/FeralFlora/AppData/Local/Pandoc/filters/wikilink-citation.py",
	"owner": "_generated_diagnostic_collection_name_#2",
	"code": {
		"value": "reportUndefinedVariable",
		"target": {
			"$mid": 1,
			"path": "/microsoft/pyright/blob/main/docs/configuration.md",
			"scheme": "https",
			"authority": "github.com",
			"fragment": "reportUndefinedVariable"
		}
	},
	"severity": 4,
	"message": "\"Para\" is not defined",
	"source": "Pylance",
	"startLineNumber": 26,
	"startColumn": 20,
	"endLineNumber": 26,
	"endColumn": 24
}]

That diagnostic points at line 26 of the file, where Para is used.

Pylance also reports that sys, Str and re are imported but not accessed.

Typo / mispaste in original code. Note the import now includes Para:

from pandocfilters import toJSONFilter, Str, Plain, Para

Sorry for the confusion!

Great, thanks for the answer. However, what about all the things that are not accessed, can I just remove them? Sys, re, str and the format, meta parameters in the function.

The function signature (format, meta, etc.) should remain the same. You can (probably) drop the unneeded imports.

Great, I’m going to test it now :+1:

There seems to be a bug: in some cases the rendered citation is left with an extra trailing ].

Also, this workflow is incompatible with the wikilinks_title_after_pipe extension, because citations will simply become links :cry:

I am considering adopting the convention:

The quick brown fox jumps over the lazy dog [[some/path/to/library/@author-year-title|[@author-year-title]]]

and just rely on stripping away the links to “reveal” the clean pandoc citation in the display name. This should be possible with the default mediawiki link filters, I think? A little bit of redundancy in typing, but IF pandoc can process the citation correctly, then it will be worth it.

Ok, @mgmeyers's “Easy Bake” plugin resolves everything.

In particular:

blah blah [[sources/references/a/@anderson-1979|[@anderson-1979]]] blah

gets nicely expanded to:

blah blah [@anderson-1979] blah 

along with all other links and embeds (!!!) in a single standalone document. Can process with pandoc etc. normally after that.
:partying_face: :partying_face: :partying_face:
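To illustrate what "stripping away the link to reveal the citation" means for this convention, here is a minimal sketch in Python. It is not how Easy Bake works internally (the thread does not describe that). The function name and the regular expression are assumptions of mine, and the pattern only covers links whose display text is a single bracketed citation.

```python
import re

# [[target|[@key]]]  ->  [@key]
# Group 1 is the display text, which under this convention is itself a
# complete Pandoc citation such as [@anderson-1979].
ALIASED_CITATION_LINK = re.compile(r"\[\[[^\[\]|]*\|(\[[^\[\]]*\])\]\]")


def reveal_citations(text: str) -> str:
    """Replace each aliased wikilink with its bracketed display text."""
    return ALIASED_CITATION_LINK.sub(r"\1", text)


if __name__ == "__main__":
    src = "blah blah [[sources/references/a/@anderson-1979|[@anderson-1979]]] blah"
    print(reveal_citations(src))
```

Ordinary wikilinks without a bracketed alias are left untouched by this pattern.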

Another workaround that I’m personally using is to use Markdown links instead of Wikilinks.

For example:

[@ahrens-2017](ahrens-2017.md)

gets visualized as a reference as well as a link in the reading view.

The link converter plugin can convert between Wikilinks and Markdown links in the whole vault.

What I do now is just use [@citekey] for everything. Not an actual (native) link, Wiki-style or otherwise, but just the regular Pandoc citeproc syntax. Works beautifully for the production/manuscript, as then you can do all these things: [-@citekey], [@citekey1; @citekey2] etc.
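As a rough illustration of the citation forms mentioned here, this sketch pulls the keys out of bracketed citations in a piece of text. It is a simplified approximation, not Pandoc's actual parser: the key pattern, the function name and the return shape are all assumptions, and locators or prefixes inside the brackets are simply ignored.

```python
import re

# A bracketed group that contains at least one "@" and no nested brackets.
CITATION_GROUP = re.compile(r"\[([^\[\]]*@[^\[\]]*)\]")
# Simplified citekey: optional "-" (suppress author), "@", then key characters.
CITEKEY = re.compile(r"(-?)@([A-Za-z0-9_][A-Za-z0-9_:.\-]*)")


def citations_in(text):
    """Return one list per bracketed citation: [(key, suppress_author), ...]."""
    groups = []
    for match in CITATION_GROUP.finditer(text):
        keys = [
            (key.rstrip(".:-"), dash == "-")  # drop trailing punctuation
            for dash, key in CITEKEY.findall(match.group(1))
        ]
        if keys:
            groups.append(keys)
    return groups


if __name__ == "__main__":
    print(citations_in("As shown [-@citekey], and later [@citekey1; @citekey2]."))
```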

“But what about the link function?” you ask?

Yep, you can get that too.

This does NOT mean you have to lose the ability to link to the lit note if you install @mgmeyers's WONDERFUL Pandoc Reference List plugin.

This augments all your citations to function as first-class legit links, including hovering etc.!!!

It’s the perfect solution IMHO: your markdown documents are full-fledged valid markdown source to Pandoc, with no compromise in expression of the citations, while also full-fledged Obsidian nodes, with no compromise in expression of the links.

You CAN have your :cake: and eat it too :yum: :yum: :yum: !

The bug occurs when there is another character right before or after the brackets without a space. So, [[@Foo]]. becomes [1]]. and ,[[@Foo]] becomes ,[[1].
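A small sketch of why that happens, using plain dicts instead of pandocfilters so it runs on its own. The token shapes are an assumption on my part: I am assuming Pandoc fuses the leftover bracket with the adjacent punctuation into one Str (`"]."` or `",["`), which is consistent with the output described above. `render` is a hypothetical stand-in that prints every citation as `[1]`.

```python
def s(text):
    """Build a Str inline in the same dict shape the filter sees."""
    return {"t": "Str", "c": text}


CITE = {"t": "Cite", "c": []}  # placeholder; a real Cite carries citation data


def strip_exact_brackets(inlines):
    """Same rule as the original filter: only a Str that is exactly "[" or "]" is dropped."""
    out = []
    for i, item in enumerate(inlines):
        if item["t"] == "Str" and item["c"] == "[":
            if i < len(inlines) - 1 and inlines[i + 1]["t"] == "Cite":
                continue
        elif item["t"] == "Str" and item["c"] == "]":
            if i > 0 and inlines[i - 1]["t"] == "Cite":
                continue
        out.append(item)
    return out


def render(inlines):
    # Hypothetical rendering: each citation becomes "[1]".
    return "".join("[1]" if x["t"] == "Cite" else x["c"] for x in inlines)


trailing = [s("["), CITE, s("].")]  # assumed tokens for [[@Foo]].
leading = [s(",["), CITE, s("]")]   # assumed tokens for ,[[@Foo]]

if __name__ == "__main__":
    print(render(strip_exact_brackets(trailing)))
    print(render(strip_exact_brackets(leading)))
```

The fused tokens never equal `"["` or `"]"` exactly, so the exact-match test lets them through with the stray bracket still attached.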

I needed a quick solution and just asked Claude to handle these cases. Works for me:

#!/usr/bin/env python
# -*- coding: utf-8 -*-

from pandocfilters import toJSONFilter, Str, Plain, Para


def replace_citation_links_with_citation(key, value, format, meta):
    # Only Plain and Para blocks are rewritten; returning None for any
    # other element tells pandocfilters to leave it unchanged.
    if key == "Plain" or key == "Para":
        new_value = []
        skip_next = 0
        for i, item in enumerate(value):
            # Skip elements already consumed by a previous iteration.
            if skip_next > 0:
                skip_next -= 1
                continue

            # The extra opening bracket arrives as a Str ending in "[",
            # possibly fused with preceding text (e.g. ",["), right before the Cite.
            if (item["t"] == "Str" and item["c"].endswith("[")
                    and i + 1 < len(value) and value[i + 1]["t"] == "Cite"):
                prefix = item["c"][:-1]
                if prefix:
                    new_value.append(Str(prefix))  # Keep text before the [

                new_value.append(value[i + 1])  # Add the Cite element

                # The extra closing bracket arrives as a Str starting with "]",
                # possibly fused with following text (e.g. "].").
                if (i + 2 < len(value) and value[i + 2]["t"] == "Str"
                        and value[i + 2]["c"].startswith("]")):
                    remaining = value[i + 2]["c"][1:]
                    if remaining:
                        new_value.append(Str(remaining))  # Keep text after the ]
                    skip_next = 2
                else:
                    skip_next = 1

                continue

            new_value.append(item)

        if key == "Plain":
            return Plain(new_value)
        return Para(new_value)


if __name__ == "__main__":
    toJSONFilter(replace_citation_links_with_citation)

Maybe it will be useful for someone else also.

Btw, do not forget to install pandocfilters

pip install pandocfilters