I’ve been working a bit on Marky the Markdownifier. It’s a project I started back in 2010 and use regularly, but it’s never really caught on with the Markdown masses. I’ve tweaked the algorithms and added to the API to make Marky as useful as possible within my own workflow, and hopefully within others’ as well.
At its simplest, Marky takes URLs and converts them to Markdown text, removing comments and ads in the process. Think of it as a web-based version of read2text. In the web interface, you can copy Markdown to the clipboard, preview it as HTML, and handle a few other surface-level tasks. You can also go straight to HTML view and use Marky as an Instapaper Mobilizer kind of tool. It can be more useful than that, though.
Starting with Bookmarklets for your browser, you can convert the current page to raw Markdown, open it as a preview, even save it directly to nvALT. Here are some more examples:
- Create a System Service to convert entire folders full of HTML files to Markdown files
- Create a System Service to send local HTML files to NV as Markdown
- Clip web pages to Markdown from Terminal, Launchbar, Alfred and Keyboard Maestro
- Grab all of the pages for links in the clipboard
- Send the full pages to nvALT (or any text-based organization system)
- Pass raw HTML to Marky and get Markdown back
- Markdownify your clipboard contents, or use it from shell scripts to convert text anywhere
- Use the JSON output option to incorporate it into JavaScript-based web applications and browser extensions
- Marky can also do regular Markdown conversion (with Extra extensions), so you can pass it Markdown text and get HTML back
Everything is done with GET or POST requests to /go/ on your chosen domain. For example, this URL returns a string that, when executed by the system, adds the Markdown version of one of my posts to nvALT (or NV, if that’s where the nv:// handler points): http://heckyesmarkdown.com/go/?u=http://brettterpstra.com/bash-completion-for-defaults-domains/&read=1&output=nv. Note that adding to nvALT this way doesn’t crash it the way that dragging URLs and files to it generally does (I’m working on that…).
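To make the shape of a /go/ request concrete, here is a minimal sketch that assembles one from a page URL plus key=value options. It only uses the parameters that appear in this post (u, read, output, preview, showframe); the function names urlencode and marky_url and the MARKY_ENDPOINT variable are made up for illustration, and the sketch percent-encodes the u value, which the example URL above does not.

```shell
# Endpoint taken from the example URL in this post.
MARKY_ENDPOINT="http://heckyesmarkdown.com/go/"

# Percent-encode everything except unreserved characters.
# The ( ) body runs in a subshell, so LC_ALL and the helper variables stay local.
# Intended for typical ASCII URLs; non-ASCII handling depends on the shell's printf.
urlencode() (
  LC_ALL=C
  s=$1
  out=""
  while [ -n "$s" ]; do
    rest=${s#?}          # everything after the first character
    c=${s%"$rest"}       # the first character itself
    s=$rest
    case $c in
      [a-zA-Z0-9._~-]) out=$out$c ;;
      *) out=$out$(printf '%%%02X' "'$c") ;;
    esac
  done
  printf '%s' "$out"
)

# Build a request URL: $1 is the page to convert, the rest are key=value pairs.
marky_url() (
  page=$1
  shift
  query="u=$(urlencode "$page")"
  for pair in "$@"; do
    query="$query&$pair"
  done
  printf '%s?%s\n' "$MARKY_ENDPOINT" "$query"
)
```

Calling `marky_url "http://brettterpstra.com/bash-completion-for-defaults-domains/" read=1 output=nv` prints a URL equivalent to the one above, which can then be handed to curl.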
There are details for the API listed on the homepage for each of the two flavors of Marky (folksy and sailor versions). Some ready-made bookmarklets are there (open page in Marky, open as raw Markdown, save Markdown to NV, etc.); you can dissect them to see how they work, and use the API documentation to extend them. It’s not a gorgeous API, but it does the trick.
Here are a couple of Bash functions I use for clipping from the command line:
urlenc () { # URL-encode the passed string
  printf '%s' "$1" | perl -pe 's/([^-_.~A-Za-z0-9])/sprintf("%%%02X", ord($1))/seg'
}
hymd () { # copy raw Markdown to the clipboard
  local encurl
  encurl=$(urlenc "$1")
  curl -s "http://heckyesmarkdown.com/go/?read=1&preview=0&showframe=0&u=$encurl" | pbcopy
  echo "In your clipboard, man."
}
hynv () { # add raw Markdown to nvALT
  local encurl
  encurl=$(urlenc "$1")
  # quote the response so the shell passes it to open as a single argument
  open "$(curl -s "http://heckyesmarkdown.com/go/?read=1&preview=0&showframe=0&output=nv&u=$encurl")"
}
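The "grab all of the pages for links in the clipboard" idea from the list above can be approximated with one more small helper. This is a sketch only: extract_urls is a hypothetical name, and the regular expression is a deliberately loose match for http and https links, not a full URL parser.

```shell
# Print each http(s) URL found on stdin, one per line, dropping repeats
# while keeping the order of first appearance.
extract_urls() {
  grep -oE 'https?://[^[:space:]<>"]+' | awk '!seen[$0]++'
}
```

On a Mac, something like `pbpaste | extract_urls | while IFS= read -r url; do hynv "$url"; done` would then send each linked page to nvALT using the hynv function above.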
I’ve created a Ruby script that handles just about every combination of tasks you’d want to run on your local system. It can be chopped up and turned into Services or actions for your favorite launcher. I’m including a couple of example Services created that way below. Make it executable and run marky.rb -h for a list of options:
Usage: marky.rb [-o OUTPUT_PATH] -f TYPE [-t TYPE] input1 [input2, ...]
-o, --output DIR Output folder, default STDOUT. Use "." for current folder, void if output type is "nv"
-f, --from TYPE Input type (html, htmlfile, url [default], bookmark, webarchive, webarchiveurl)
-t, --to TYPE Output type (md [default], nv)
-h, --help Display this screen
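As an illustration of how those options combine, here is a rough sketch of the "convert a folder of HTML files" case. It assumes marky.rb is executable and on your PATH, and that -o with the htmlfile input type writes the Markdown into the given folder; the MARKY_BIN variable and the marky_folder function are hypothetical names, not part of the script.

```shell
# Command used to run the script; override MARKY_BIN if marky.rb lives elsewhere.
MARKY_BIN="${MARKY_BIN:-marky.rb}"

# Convert every .html file directly inside $1, sending output to folder $2.
# The ( ) body runs in a subshell so the helper variables stay local.
marky_folder() (
  src=$1
  out=$2
  mkdir -p "$out" || exit 1
  for f in "$src"/*.html; do
    [ -e "$f" ] || continue   # skip the unexpanded pattern when nothing matches
    "$MARKY_BIN" -o "$out" -f htmlfile -t md "$f"
  done
)
```

For example, `marky_folder ~/Desktop/clippings ~/Desktop/clippings-md` would run marky.rb once per file. Swapping `-t md` for `-t nv` and dropping `-o` should send the results to nvALT instead, going by the help text above.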
I use Marky primarily for clipping reference web pages, but I’ve tested it with all kinds of sites and it works well about 90% of the time. Some sites with bizarre markup don’t have their content recognized by the Readability algorithm, and you get blank pages (New York Times), and some sites don’t provide HTML markup for the article in the source (Gawker), so there’s nothing to grab. I had a workaround for Lifehacker and friends, but it broke recently. I’m working to update it.
I’ll be updating and making changes for a while, so I can’t guarantee 100% uptime at the moment. Also, it’s hosted on a shared Dreamhost account, so it’s not currently set up to scale to any extreme. Caveat emptor. If you have ideas for extending Marky to be more useful, let me know. In the meantime, feel free to start Markdownifying the web.