WordPress to Jekyll: converting gallery shortcodes


As I move along with my Jekyll/Octopress transition, I’m working to make the move as clean as possible. I’m importing my WordPress database rather than starting fresh, and I’ll be sharing discoveries as I go. These posts will mostly interest people making similar transitions, but they’ll also serve as notes for myself and as Google search results for anyone up against the same conundrums.

I have heavily extended the Octopress Rakefile and built an almost entirely new WordPress import module. The importer now…

  • converts my Download Monitor shortcodes to actual download links, respecting the format parameters and including descriptions, versions and titles
  • generates an .htaccess file with redirects from my old permalink structure for every post it imports
  • replaces shortcodes from my gist plugin with Octopress formatting
  • replaces [ caption] and <img> tags, maintaining classes, alt and title attributes and alignment settings
  • replaces YouTube shortcodes with YouTube embed code
  • updates multiple formats of code blocks to standard fenced code, with a language specifier where it finds one
  • replaces video and audio shortcodes with Octopress format and HTML5 embeds, respectively
  • strips out some extra markup I used to compensate for elements of my WordPress theme
  • gathers slug, redirect alias, tags, categories and data from a custom “series” plugin as YAML front matter
  • locates WordPress gallery shortcodes and replaces them with all of the included attachments as an unordered list of thumbnails linked to their full-size images

It’s that last item that I’ll share today. The input is any content that includes [ gallery] code (with optional extra parameters). The output is Markdown, with some extra Kramdown syntax. I think it might also work with Maruku, but you may have to adjust depending on your chosen Jekyll Markdown interpreter.
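To make the input and output concrete, here is a minimal sketch of the idea in plain Ruby. It is not the importer’s actual code: the names GALLERY_RE, thumbnail_url, gallery_markdown and replace_galleries are hypothetical, the attachments are assumed to arrive as an array of hashes with :title and :url keys, and the thumbnail name assumes WordPress’ usual -150x150 suffix. The .gallery class is likewise just an illustration of the Kramdown attribute syntax.

```ruby
# Hypothetical sketch, not the importer's real code.
# Matches the gallery shortcode with any optional parameters.
GALLERY_RE = /\[\s*gallery[^\]]*\]/

# Derive the file name WordPress usually gives its 150x150 thumbnail:
# photo.jpg -> photo-150x150.jpg (assumed naming convention).
def thumbnail_url(url)
  url.sub(/(\.\w+)\z/, '-150x150\1')
end

# Build a Markdown list of thumbnails linked to the full-size images.
# The trailing line is a Kramdown block attribute list, intended to
# put a CSS class (an assumed name) on the resulting list.
def gallery_markdown(attachments)
  items = attachments.map do |att|
    "* [![#{att[:title]}](#{thumbnail_url(att[:url])})](#{att[:url]})"
  end
  items.join("\n") + "\n{: .gallery}\n"
end

# Swap every gallery shortcode in the post content for the list.
def replace_galleries(content, attachments)
  content.gsub(GALLERY_RE) { "\n\n" + gallery_markdown(attachments) + "\n" }
end
```

Called as replace_galleries(post_content, attachments), this returns the content with each shortcode swapped for the list. A Markdown interpreter without Kramdown’s attribute lists would need that last line dropped or adjusted.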

Basically, if it detects [ gallery] shortcodes in the post, it runs a query for all “attachment” posts with the current post as the parent. It passes those to a function that replaces each [ gallery] shortcode with a full Markdown list of the images, using WordPress’ automatically generated thumbnails as the visible images and linking them to the full-size uploads. If the 150x150 thumbnail doesn’t exist, it uses sips (the OS X command-line image tool) to create it. If you want to alter the thumbnail process, see the sips commands in the thumbnail_image function.

The gallery replacer is added as part of the WordPress module in lib/jekyll/migrators/wordpress.rb, which I’ve rebuilt completely as a new module. Because so much of the code is specific to my own plugins and content, I probably won’t post the entire file, but I’ll pull out the useful bits so you can incorporate them into your own.

To kick it off, here’s the code for the [ gallery] replacer: