So much new Apex stuff - BrettTerpstra.com

I’ve added so much new stuff to my universal Markdown processor, Apex, that it’s hard to count. It’s developing quickly!

In case you missed the announcement, Apex is my latest project — a command line tool and C library that combines the best of various Markdown tools, supporting syntax from CommonMark, GitHub Flavored Markdown, MultiMarkdown, Kramdown, mmark, Pandoc, and even Marked’s own special syntax.

The first and most important change is that I fixed multiple bugs that caused Apex to hang on some files. In all of my testing, it no longer chokes on any file, no matter how complex. And while the rendering speed is slower than plain cmark, it’s far faster than it was previously, and slots into the middle of the pack for efficiency, with about a 70ms render time on a complex document.

The Big Stuff

Bibliography and Index Support

I’ve added bibliography and index support. Citations work with multiple types of bibliography files, and Apex handles syntax from both Pandoc and MultiMarkdown. And you can create automatic indices for your documents using mmark or TextIndex format (a cool project by Matt Gemmell with a clever syntax that I like a lot).

Given that bibliography support is the #1 reason Marked users need to use Pandoc as an external processor, building citation support into Apex, along with Pandoc metadata and other features, means a lot of users won’t need an external processor at all. I’m in no way trying to replace or replicate Pandoc entirely; it’s a brilliant tool with a ton of capabilities that Apex does not aspire to include. I just want to support the features that make the most sense for a universal Markdown tool.

Metadata Improvements

Almost all command line options can now also be controlled with metadata (YAML or MMD), allowing per-document control of options when operating on batches. It also means you can create a single configuration for Apex and then load it with something like apex --meta-file ~/.config/apex/main_config.yml my_document.md, and all of the options defined in main_config.yml will be applied to my_document.md.

MultiMarkdown allows you to include metadata in a document with [%key] syntax. Apex supports this, and adds on to it by allowing “transforms” — string transforms like [%title:title] or [%source:urlencode], and array transforms like [%tags:split:join(, )]. Transforms also include date parsing and transformation with strftime formatting, e.g. [%date:format(%B)]. This feature allows you to use metadata in a document, or common metadata across multiple files, to control text generation.
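
To make the chaining idea concrete, here is a rough Python sketch of how a [%key:transform:...] placeholder could be expanded. It is an illustration only, not Apex's implementation: the names `apply_transforms` and `expand`, the tiny set of supported transforms, and the assumption that `split` defaults to a comma delimiter are all hypothetical. It also does not handle colons inside transform arguments.

```python
import re
from urllib.parse import quote


def _split(value, arg):
    # Assumption: split on commas by default and trim each piece.
    sep = arg if arg else ","
    return [part.strip() for part in value.split(sep)]


def _join(value, arg):
    # Assumption: join with ", " when no separator is given.
    sep = arg if arg is not None else ", "
    return sep.join(value)


# Hypothetical, heavily reduced transform table.
TRANSFORMS = {
    "upper": lambda value, arg: value.upper(),
    "lower": lambda value, arg: value.lower(),
    "title": lambda value, arg: value.title(),
    "urlencode": lambda value, arg: quote(value, safe=""),
    "split": _split,
    "join": _join,
}

# One step looks like "name" or "name(argument)".
STEP = re.compile(r"^(\w+)(?:\((.*)\))?$")
# A placeholder looks like [%key] or [%key:step:step...].
PLACEHOLDER = re.compile(r"\[%([^\]:]+)((?::[^\]:]+)*)\]")


def apply_transforms(value, steps):
    """Run each transform in order, feeding the result into the next one."""
    for step in steps:
        match = STEP.match(step)
        if not match or match.group(1) not in TRANSFORMS:
            raise ValueError(f"unknown transform: {step}")
        value = TRANSFORMS[match.group(1)](value, match.group(2))
    return value


def expand(text, metadata):
    """Replace every placeholder in text whose key exists in metadata."""
    def replace(match):
        key, chain = match.group(1), match.group(2)
        if key not in metadata:
            return match.group(0)  # leave unknown keys untouched
        steps = [step for step in chain.split(":") if step]
        return str(apply_transforms(metadata[key], steps))

    return PLACEHOLDER.sub(replace, text)
```

Calling `expand("[%tags:split:join(, )]", {"tags": "a,b , c"})` walks the chain left to right: the string is split into a list, then joined back into a single string.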

Advanced File Inclusion/Transclusion

I’ve added some cool features to file inclusion, including “addressing”, inspired by mmark’s syntax. This allows you to include only parts of a file, based on line number ranges ({{filename.sh[START,LEN]}}) or regular expression searches ({{filename.md[/^## Introduction/,/^## Conclusion/]}}). You can also prefix every line of an included file ({{quotes.md[139,15;prefix:> ]}}).
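
As a way to picture regex addressing with a prefix, here is a small Python sketch. It is not Apex's code: the function name `select_regex_range` is made up, it uses Python's `re` module rather than POSIX regex, and it assumes the line matching the start pattern is included while the line matching the end pattern is not. Apex's actual boundary handling may differ.

```python
import re


def select_regex_range(lines, start_pattern, end_pattern, prefix=""):
    """Return the lines from the first start match up to (not including)
    the next end match, each with prefix prepended."""
    start_re = re.compile(start_pattern)
    end_re = re.compile(end_pattern)
    selected = []
    inside = False
    for line in lines:
        if not inside:
            # Still looking for the opening line of the range.
            if start_re.search(line):
                inside = True
                selected.append(prefix + line)
        elif end_re.search(line):
            # Reached the closing pattern: stop without including it.
            break
        else:
            selected.append(prefix + line)
    return selected
```

With lines `["# Title", "## Introduction", "Hello", "## Conclusion", "Bye"]`, the patterns `^## Introduction` and `^## Conclusion`, and a `"> "` prefix, the sketch keeps the Introduction heading and the line after it, each quoted.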

The Future

I don’t have any specific next steps planned as far as feature additions — I want to see if I can get enough people to start using it that I get feature requests and feedback to guide development. Let me know what you want to see. It’s one of those projects that I start dreaming about around 1am and then end up awake and adding new features to by 3am.

I plan to add Apex to Marked. It will start as an optional “beta” processor, in addition to the built-in MultiMarkdown, Kramdown, CommonMark, and Discount processors, but I’d like to eventually replace all of the processor options with Apex and just allow it to run in different compatibility modes, with the “unified” mode being the default so users don’t even have to think about which processor they need. Marked will just render all syntaxes as expected. And having Apex available as a command line tool that supports relevant parts of Marked’s special syntax will mean users can get results similar to Marked from the command line and other automations.

If you have feature requests or questions about implementation, I’d love to hear about them in a GitHub Issue, or if you’re not a GitHub user, feel free to use the forum. I’m open to any and all feature suggestions, in the interest of making this as universal as possible.

Give It a Shot

If you want to try Apex out, you can find installation instructions on the Apex project page, and the full documentation in the wiki. Apex can be run as a command line tool (build it from source, or just download a pre-built binary for macOS or Linux), and there’s a full C API that can be used in your own projects, including an Xcode framework.

Apex is open source and code contributions are welcome. If you want to contribute money instead, donations are always appreciated.

Highlights

You can view the full changelog on GitHub.

Changed

HTML comments now replaced with “raw HTML omitted” in CommonMark and GFM modes by default
Tag filter (GFM security feature) now only applies in GFM mode, not Unified mode, allowing raw HTML and autolinks in Unified mode as intended.
Wiki links are now disabled by default in all modes (previously enabled in unified mode). Use --wikilinks to enable.
Change --enable-includes to --[no-]includes, allowing --no-includes to disable includes in unified mode and shortening the flag

New

Comprehensive test suite (580+ tests passing)
Control command line options with metadata
- String options via metadata (bibliography, csl, title, style/css, id-format, base-dir, mode)
- Boolean metadata values accept true/false, yes/no, or 1/0 (case-insensitive, downcased)
MultiMarkdown-style superscript (^text) and subscript (~text) syntax support
- --[no-]sup-sub command-line option to enable/disable superscript/subscript
- Differentiates between subscript (tildes within a word, e.g., H~2~O) and underline (tildes at word boundaries, e.g., ~text~) by checking if tildes are within alphanumeric words or at word boundaries.
- Superscript/subscript enabled by default in unified and MultiMarkdown modes
--[no-]unsafe command-line option to control raw HTML handling
Added preprocessing for angle-bracket autolinks (<http://...>) to convert them to explicit markdown links, ensuring they work correctly with custom rendering paths.
--[no-]autolink CLI option to control automatic linking of URLs and email addresses. Autolinking is enabled by default in GFM, MultiMarkdown, Kramdown, and unified modes.
Underline syntax support: ~text~ now renders as <u>text</u> when there’s a closing ~ with no space before it.
Man page
--obfuscate-emails flag to hex-encode mailto links.
--[no-]wikilinks to enable/disable wiki link processing
Definition lists now work inside block quotes (e.g., > Term\n> : Definition)
Nested block quotes with definition lists (e.g., > > Term\n> > : Definition)
File includes now support address syntax for line ranges, regex ranges, and prefixes
- Line number ranges: [3,5] includes lines 3 to 5 (exclusive)
- Line range to end: [3,] includes from line 3 to end of file
- Regular expression ranges: [/START/,/END/] includes lines between regex matches
- Prefix option: [3,5;prefix="C: "] prefixes each included line
- Address syntax works with {{file}} and <<[file] syntaxes (not iA Writer /file)
- POSIX regex support for pattern-based line extraction
Metadata variable transforms with [%key:transform] syntax
- --[no-]transforms to enable/disable transforms
- 23 text transforms: upper, lower, title, capitalize, trim, slug, replace (with regex support), substring, truncate, default, html_escape, basename, urlencode, urldecode, prefix, suffix, remove, repeat, reverse, format, length, pad, contains
- Array transforms: split, join, first, last, slice
- Date/time transform: strftime with date parsing
- Transform chaining support (multiple transforms separated by colons)
--meta-file flag to load metadata from external files (YAML, MMD, or Pandoc format, auto-detected)
--meta KEY=VALUE flag to set metadata from command line (supports multiple flags and comma-separated pairs)
Add metadata merging with proper precedence: command-line > document > file
--embed-images flag to embed local images as base64 data URLs in HTML output
--base-dir flag to set base directory for resolving relative paths (images, includes, wiki links)
Automatic base directory detection from input file directory when reading from file
Add citation processing with support for Pandoc, MultiMarkdown, and mmark syntaxes
- Add bibliography loading from BibTeX, CSL JSON, and CSL YAML formats
- Add --bibliography CLI option to specify bibliography files (can be used multiple times)
- --csl CLI option to specify citation style file
- --no-bibliography CLI option to suppress bibliography output
- --link-citations CLI option to link citations to bibliography entries
- --show-tooltips CLI option for citation tooltips
- Bibliography generation and insertion at a marker in the document
- Support for bibliography specified in document metadata
Index extension
- mmark syntax (!item), (!item, subitem), and (!!item, subitem) for primary entries
- TextIndex syntax with {^}, [term]{^}, and {^params} patterns
- Automatic index generation at end of document or at a marker in the document
- Alphabetical sorting and optional grouping by first letter for index entries
- Hierarchical sub-item support in generated index
- --indices CLI flag to enable index processing
- --no-indices CLI flag to disable index processing
Support for MultiMarkdown’s standard metadata fields
- Base Header Level and HTML Header Level metadata to adjust heading levels
- CSS to link external stylesheets in standalone HTML documents
- HTML Header and HTML Footer metadata to inject custom HTML
- Language to set HTML lang attribute in standalone documents
- Quotes Language metadata to control smart quote styles (French, German, Spanish, etc.). This falls back to the Language setting.
- Metadata key normalization: case-insensitive matching with spaces removed (e.g., “HTML Header Level” matches “htmlheaderlevel”)
Add --css CLI flag as alias for --style with metadata override precedence
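
The metadata rules in the list above (key normalization, boolean values, and command-line > document > file precedence) can be pictured with a short Python sketch. This is a hedged illustration, not Apex's implementation: the names `normalize_key`, `parse_bool`, and `merge_metadata` are invented here, and applying key normalization to every key during merging is an assumption.

```python
def normalize_key(key):
    """Case-insensitive matching with spaces removed."""
    return key.replace(" ", "").lower()


_TRUE = {"true", "yes", "1"}
_FALSE = {"false", "no", "0"}


def parse_bool(value):
    """Accept true/false, yes/no, or 1/0 in any letter case."""
    text = str(value).strip().lower()
    if text in _TRUE:
        return True
    if text in _FALSE:
        return False
    raise ValueError(f"not a boolean metadata value: {value!r}")


def merge_metadata(file_meta, document_meta, cli_meta):
    """Merge three metadata sources so that command-line values win over
    document values, which win over values from a metadata file."""
    merged = {}
    # Apply the lowest-precedence source first; later sources overwrite it.
    for source in (file_meta, document_meta, cli_meta):
        for key, value in source.items():
            merged[normalize_key(key)] = value
    return merged
```

For example, merging a file that sets "HTML Header Level" to 2 with a document that sets "htmlheaderlevel" to 3 leaves a single normalized key holding the document's value.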

Improved

Processing speed improved by 15x
- Only process citations when bibliography is actually provided for better performance
- Add comprehensive performance profiling system (APEX_PROFILE=1) to measure processing time for all extensions and CLI operations
  - Add profiling instrumentation for all preprocessing, parsing, rendering, and post-processing steps
  - Add profiling instrumentation for CLI operations (file I/O, metadata processing)
- Early exits that skip:
  - IAL processing when no {: markers are present
  - Index processing when no index patterns are found
  - Citation processing when no citation patterns are found
  - Definition list processing when no /^: / patterns are found
  - Wiki link AST traversal entirely when no wiki link markers are present in the document.
  - Alpha lists post-processing when no markers are present
- Optimize alpha lists post-processing with single-pass algorithm replacing O(n*m) strstr() loops
- Optimize file I/O by using fwrite() with known length instead of fputs()
- Add markdown syntax detection in definition lists to skip parser creation for plain text
- Optimize definition lists by selectively extracting only needed reference definitions instead of prepending all

Fixed

IAL markers appearing immediately after content without a blank line are now correctly parsed as separate attribute lists instead of being treated as part of the preceding paragraph. This ensures Kramdown-style IAL syntax works correctly with cmark-gfm parser.
Relaxed tables are now disabled by default for CommonMark, GFM, and MultiMarkdown modes
Header ID extraction no longer incorrectly parses metadata variables like [%title]
Relaxed tables now preserve input newline behavior in output
Unified mode now correctly enables mixed list markers and alpha lists by default when no --mode is specified
^ marker now properly separates lists by creating a paragraph break instead of just blank lines
Empty paragraphs created by ^ marker are now removed from final HTML output
Setext headers are no longer broken when followed by highlight syntax (==text==). Highlight processing now stops at line breaks to prevent interference with header parsing.
Metadata parser no longer incorrectly treats URLs and angle-bracket autolinks as metadata. Lines containing < or URLs are now skipped during metadata extraction.
Metadata variable replacement now runs before autolinking so [%key] values containing URLs are turned into links when autolinking is enabled.
MMD metadata parsing no longer incorrectly rejects entries with URL values; only URL-like keys or ‘<’ characters in keys are rejected, allowing “URL: https://example.com” as valid metadata.
Math processor now validates that \(...\) sequences contain actual math content (letters, numbers, or operators) before processing them. This prevents false positives like \(%\) from being treated as math when they only contain special characters.
Improved autolink functionality to only wrap valid URLs/emails and ensured email autolinks use mailto: hrefs.
Reference-style markdown links now convert correctly in definition list terms (dt tags) and definitions (dd tags). Previously, reference-style links like [text][ref] were not being resolved when used in definition list terms or definitions, even when the reference definition was present elsewhere in the document.
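
The math validation fix listed above can be sketched in a few lines of Python. This is only an approximation of the idea, not the actual check in Apex: the function names are hypothetical, and the exact set of characters treated as operators is an assumption.

```python
import re

# Matches \( ... \) and captures the body (non-greedy, single line).
INLINE_MATH = re.compile(r"\\\((.*?)\\\)")
# Assumption: "math content" means at least one letter, digit, or one of
# these operators.
MATH_CONTENT = re.compile(r"[A-Za-z0-9+\-*/=<>^]")


def looks_like_math(body):
    """True if the body contains at least one plausible math character."""
    return bool(MATH_CONTENT.search(body))


def find_inline_math(text):
    """Return the bodies of \\(...\\) spans that pass the content check."""
    return [
        match.group(1)
        for match in INLINE_MATH.finditer(text)
        if looks_like_math(match.group(1))
    ]
```

Under these assumptions, a span like \(%\) is ignored because it contains only a special character, while \(x^2 + 1\) is kept.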

So give it a whirl and keep me posted! Again, feel free to help guide development by opening Issues or joining in on the forum!
