Custom Processors in Marked 2 - BrettTerpstra.com

Custom processors are an advanced feature of Marked that provide a lot of power and flexibility, as long as you know how to write scripts and implement them. They’re of less use to the average user, but for power users they’re a goldmine.

You can write scripts to function as preprocessors (run before one of the default internal processors) or as processors (replacing the internal processors entirely). Anything that can take input on STDIN and return HTML on STDOUT will work, allowing you to use Marked with any form of markup, or with the special syntax of a specific processor such as Kramdown, Maruku, or Python Markdown 2.

I generally write my scripts in Ruby, so the following tips will use Ruby in the examples, but the concepts will work with any language. You can even write custom processors in Bash or ZSH and pass the input to any executable you can call from the command line.

Preprocessor vs. Processor

A Custom Processor is a replacement for the built-in MultiMarkdown and Discount processors. It can be a path to an executable, or to a wrapper script that makes use of various logic to process extra markup or determine which processor to run based on variable criteria.

Preprocessors come before processing, as the name implies. From the Marked 2 documentation:

If you set up a preprocessor, it is run after Marked handles any Marked-specific tasks such as including external documents and code, but before it runs the processor (internal or custom). This gives you a chance to render custom template variables, handle substitutions or inject your own content by any other means.
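To make the "render custom template variables" idea concrete, here's a minimal sketch of a substitution a preprocessor might perform. The `{{name}}` placeholder syntax, the `render_variables` helper, and the sample values are all made up for this example; none of them are part of Marked.

```ruby
# Hypothetical preprocessor helper: replaces {{key}} placeholders with values.
# The {{key}} syntax is an assumption for this sketch, not a Marked feature.
def render_variables(text, vars)
  text.gsub(/\{\{\s*(\w+)\s*\}\}/) do |match|
    key = Regexp.last_match(1)
    # Unknown placeholders are left untouched rather than blanked out
    vars.fetch(key, match).to_s
  end
end

# Sample data for illustration only. In a real preprocessor the text would
# come from STDIN.read and the result would be printed to STDOUT.
vars = { 'author' => 'Sample Author', 'status' => 'draft' }
puts render_variables("By {{author}} ({{ status }}) {{unknown}}", vars)
```

Leaving unknown placeholders in place is a design choice: it makes a typo in a variable name visible in the preview instead of silently dropping it.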

Environment

First, be aware that the processors are not running in your usual shell environment. They’re running in a protected shell that inherits none of your standard settings. Things like $PATH and other environment variables will not be automatically set.

Marked provides several environment variables that you can use, and with systems like RVM, you can write wrapper scripts (rvm help wrapper) as needed. The easiest way to use your preferred language is to install any extensions (in the case of Ruby, gems) at the system level (sudo gem install xxx).
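Since $PATH isn't inherited, one option is to set it yourself at the top of the script before calling any external executables. This is a rough sketch, not something Marked requires; the `prepend_path` helper is hypothetical and the directories listed are assumptions you'd replace with your own.

```ruby
# Hypothetical helper: prepends directories to PATH so that executables
# called from the script can be found. `env` defaults to the real ENV but
# can be any Hash-like object.
def prepend_path(dirs, env = ENV)
  current = env['PATH'].to_s.split(File::PATH_SEPARATOR)
  env['PATH'] = (dirs + current).uniq.join(File::PATH_SEPARATOR)
end

# These directories are assumptions; adjust them for your own setup.
prepend_path(['/usr/local/bin', '/opt/local/bin'])
```

Anything launched from the script afterwards (with backticks, `system`, or `IO.popen`) sees the updated PATH.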

Selectively bypassing a processor

Marked 2 allows you to determine whether the custom pre/processor should run based on any conditions you choose. If you return just the line “NOCUSTOM” instead of HTML output, Marked will act as though you didn’t have a processor set for that stage and default to the next option.

Logging errors

You can easily see what’s going on with your script by logging to a file. I use Ruby’s built-in Logger class to write to a file on my Desktop while I’m debugging. The chunk of code below shows how to do that:

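#!/usr/bin/ruby
require 'logger'

# Log to a file on the Desktop for debugging
@log = Logger.new(File.expand_path("~/Desktop/stderr.log"))

begin
	input = STDIN.read
	# Placeholder condition: replace with your own test for running the processor
	if input =~ /\S/
		puts input
	else
		print "NOCUSTOM"
	end
rescue StandardError => e
	# Record the error, then fall back to Marked's default handling
	@log.error(e)
	print "NOCUSTOM"
end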

Encoding

Especially with Ruby 2.0 (included with Mavericks), encoding issues in the input can quickly become problematic. Use this chunk of code in any script to avoid problems:

# Ruby 1.9 and later support string encodings ("1.9.3".to_f is 1.9)
if RUBY_VERSION.to_f >= 1.9
  Encoding.default_external = Encoding::UTF_8
  Encoding.default_internal = Encoding::UTF_8
  content = STDIN.read.force_encoding('utf-8')
else
  content = STDIN.read
end

Conditional processors

You can even run different processors based on various criteria just by branching in your script. For example, I run a custom processor that detects whether or not there is a YAML header on the document. If there is, it runs my custom Jekyll processor with Kramdown; otherwise it returns NOCUSTOM and allows the default MultiMarkdown processor to handle the conversion.
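The YAML header check can be sketched roughly like this. It's not my actual Jekyll processor: `yaml_header?` and `process` are names invented for the example, and the renderer is a stand-in lambda so the snippet runs without any gems installed.

```ruby
# Returns true when the text starts with a YAML header: a line of three
# dashes, some content, then another line of three dashes.
def yaml_header?(text)
  !!(text =~ /\A---\s*\n.*?^---\s*$/m)
end

# `renderer` is a stand-in for whatever produces your HTML. With the kramdown
# gem installed it could be ->(t) { Kramdown::Document.new(t).to_html }.
def process(text, renderer)
  yaml_header?(text) ? renderer.call(text) : 'NOCUSTOM'
end

# Demo with a fake renderer. A real processor would use STDIN.read as input.
stand_in = ->(text) { "<p>#{text.length} characters would be rendered here</p>" }
puts process("---\ntitle: Test\n---\nHello", stand_in)
puts process("Just plain Markdown", stand_in)
```

The second call prints NOCUSTOM, which hands the document back to the default processor.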

Using Marked’s environment variables, you can pivot based on the file extension (ENV['MARKED_EXT']), the folder of the file (or any part of the path), or anything based on the content of the file. In a preprocessor, this can even include MMD metadata or specific YAML headers that might indicate different processing strategies.
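Pivoting on the extension might look something like the sketch below. The `strategy_for` helper and the extension-to-strategy mapping are purely illustrative, and because I'm not assuming whether the value includes a leading dot, the code strips one if present.

```ruby
# Hypothetical helper: maps a file extension to a processing strategy.
# The mapping below is an example, not a recommendation.
def strategy_for(ext)
  case ext.to_s.downcase.sub(/\A\./, '')
  when 'textile' then :textile
  when 'rst'     then :rst
  else :nocustom # anything else falls through to Marked's default processor
  end
end

# ENV['MARKED_EXT'] is only set when the script is run by Marked; outside
# of Marked this prints "nocustom".
puts strategy_for(ENV['MARKED_EXT'])
```

From there, the script would call the matching converter for each strategy, or print NOCUSTOM for the fallback case.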

Per-document processor settings

You can force a pre/processor to be used or not used on any document using metadata. For example, include the lines:

Custom Preprocessor: true
Custom Processor: false

This forces the file to use your custom preprocessor but the default processor, regardless of settings in preferences (assuming the preprocessor is set up and enabled). These directives can be included within HTML comments so that they won’t affect other output if you’re not using MultiMarkdown. You can even have your processors strip those lines before returning the output if needed.
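Stripping the directive lines could be done with a single regular expression. This sketch assumes each directive sits on its own line, either bare or wrapped in a one-line HTML comment, and that the value is true or false; `strip_directives` is a name made up for the example.

```ruby
# Matches a "Custom Preprocessor:" or "Custom Processor:" line, optionally
# wrapped in a single-line HTML comment, including its trailing newline.
DIRECTIVE = /^[ \t]*(?:<!--[ \t]*)?Custom (?:Pre)?processor:[ \t]*(?:true|false)[ \t]*(?:-->)?[ \t]*\r?\n?/i

# Hypothetical helper: removes directive lines and leaves everything else.
def strip_directives(text)
  text.gsub(DIRECTIVE, '')
end

sample = "Custom Preprocessor: true\n<!-- Custom Processor: false -->\n\n# Title\n"
print strip_directives(sample)
```

Only the directive lines themselves are removed, so the blank line and heading that follow them come through unchanged.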

Hopefully these tips will help advanced users who’ve run into issues setting up their own custom processors with Marked. As always, if you run into any problems, don’t hesitate to open up a ticket on the support site (public or private).