Gather is constantly improving at this point, and a lot of its rough edges have already been polished. I’m loving all the feedback I’m getting, both for fixing bugs and for adding new features. It’s hard to test something like Gather on every possible permutation of a web layout, so I appreciate hearing about edge cases (even if I do have to weigh how much effort they’re worth).
I’d recommend upgrading to the latest version before trying anything in this post. As of this writing, that’s v2.0.33. If you installed via Homebrew, you’re just a brew upgrade gather-cli away. Otherwise, please download the PKG file and update.
If you have Homebrew installed but ran into errors trying to install gather-cli, try running sudo xcode-select -s /Applications/Xcode.app/Contents/Developer. It fixes 90% of the issues I’ve seen.
Pesky Paywalls
So anyway, one common thing that people want to clip from the web is paywalled content, which makes sense as it’s harder to search and more convenient to have in your own personal, local notes. But Gather can’t access a web page that isn’t public. You can select all the text, copy it, then run Gather with gather --paste --html to convert it, but it’s never perfect. The better solution is bookmarklets.
I should mention that this is made possible by a feature I added since the last time I wrote[1]. It’s a URL template feature that lets you generate a URL for any app’s URL handler. I had built in the x-nvultra handler and was looking for others that might be worth adding when I realized I could make any of them possible (and keep the number of menu items down) with one command. So now you can use --url-template "myhandler://new_note?text=%text&title=%title" to create a URL for any handler you want. Add --url-open to have the URL executed immediately instead of returned. The placeholder contents are completely URL encoded (no pesky “safe” characters that throw some apps off), and the available placeholders are:
%title: The title of the page
%text: The markdown text of the page
%notebook: The contents of the --nvu-notebook option, can be used for additional meta in another key
%source: The canonical URL of the captured page, if available
%date: Today’s date and time in the format YYYY-mm-dd HH:MM
%filename: The title of the page sanitized for use as a file name
%slug: The title of the page lowercased, all punctuation and spaces replaced with dashes (using-gather-as-a-web-clipper)
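To make the placeholder substitution concrete, here is a rough JavaScript sketch of the idea. It is not Gather's actual code: the function names (strictEncode, slugify, fillTemplate) are made up for illustration, and the assumption is that "completely URL encoded" means everything outside the unreserved set (letters, digits, and - _ . ~) gets percent-encoded. The slug logic is also a guess based on the example above, and it only handles ASCII titles.

```javascript
// Percent-encode everything except unreserved characters (A-Z a-z 0-9 - _ . ~).
// encodeURIComponent leaves ! ' ( ) * alone, so those are encoded by hand.
function strictEncode(value) {
  return encodeURIComponent(value).replace(
    /[!'()*]/g,
    (ch) => "%" + ch.charCodeAt(0).toString(16).toUpperCase()
  );
}

// Lowercase the title and collapse every run of non-alphanumerics into one dash.
// (Assumed behavior, inferred from the %slug example; ASCII only.)
function slugify(title) {
  return title
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
}

// Replace each %name placeholder with the encoded value from `fields`.
// Placeholders with no matching field are left untouched.
function fillTemplate(template, fields) {
  return template.replace(/%([a-z]+)/g, (match, name) =>
    Object.prototype.hasOwnProperty.call(fields, name)
      ? strictEncode(fields[name])
      : match
  );
}

// "myhandler" is the placeholder scheme from the example above, not a real app.
const title = "Using Gather as a Web Clipper";
const url = fillTemplate(
  "myhandler://new_note?text=%text&title=%title&slug=%slug",
  { title, text: "# Hello (world)!", slug: slugify(title) }
);
console.log(url);
```

The point of the hand-rolled strictEncode is that characters like parentheses and exclamation points, which most encoders treat as "safe," get encoded too, so the receiving app never sees them raw in the query string.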
So that’s how I build these bookmarklets (using the shortcuts://run-shortcut?name= url handler). These are still experimental, but should give you some ideas about the kind of stuff you can do with Gather.
Gather: The Magicking
The first bookmarklet, simply titled “Gather,” grabs the full visible HTML of the current page (what’s actually rendered in the browser, so it works on paywalled sites and sites like Medium that dynamically load content), cleans it up a little, and sends it to a Shortcut using the Shortcuts URL handler. Then your Shortcut can do whatever you want with it. In my case, it accepts the URL content on STDIN and passes it to Gather, which Markdownifies it and sends it to nvUltra to create a new note. But you can do anything you want. Keep in mind when writing Shortcuts for these that all you need is --stdin --html and Gather will accept raw HTML as input from the Shortcut.
Here’s the bookmarklet, which you can just drag to your bookmarks toolbar. It’s tested in Chrome, Safari, and Firefox.
The Gather bookmarklet points to a Shortcut named “Gather.” You can always edit the bookmarklet to point to another name if you want to — it’s assigned near the beginning of the code. But to start you can just add a new Shortcut named “Gather” to deal with the input.
I’m including a couple of example Shortcuts below, and the Gather one shows a menu with “Copy to clipboard,” “Add to nvUltra,” “Save to file,” and “Preview in Marked.” All of the gather commands needed for these actions are demonstrated.
When you click the bookmarklet while a page you want to clip is fully loaded, it will send the contents for markdownification to the Gather Shortcut.
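For a rough idea of what a bookmarklet like this does under the hood, here is a minimal sketch. It is not the actual Gather bookmarklet: it skips the HTML cleanup step entirely, the helper name buildShortcutURL is made up, and it assumes the Shortcuts URL scheme's input=text and text= parameters are how the page content gets passed along.

```javascript
// Name of the Shortcut to run; change this to point at a different Shortcut.
const SHORTCUT_NAME = "Gather";

// Build a shortcuts://run-shortcut URL that passes `html` as text input.
function buildShortcutURL(name, html) {
  return (
    "shortcuts://run-shortcut?name=" +
    encodeURIComponent(name) +
    "&input=text&text=" +
    encodeURIComponent(html)
  );
}

// Only runs in a browser: grab the rendered DOM (not the original page
// source) and hand it to Shortcuts by navigating to the URL.
if (typeof document !== "undefined") {
  const html = document.documentElement.outerHTML;
  window.location.href = buildShortcutURL(SHORTCUT_NAME, html);
}
```

To turn something like this into a draggable bookmark, you would wrap it in javascript:(function(){ ... })() and squeeze it onto one line. Because it reads the live DOM, it captures whatever the browser has rendered, which is why this approach works on pages Gather can't fetch itself.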
The other bookmarklet I’ll share is definitely a work in progress, but is working pretty well at this point. I usually load my bookmarklets from a server so that I can just update the server copy and all of my browsers (and anyone with the bookmarklet installed) get the latest version. However, recent security changes in most browsers now eliminate the option of loading external scripts by injecting script tags into the DOM, so my bookmarklets fail on about 50% of websites. So these are fully encapsulated in the bookmark url, no external dependencies. That also meant I had to rewrite this one to use vanilla JS instead of jQuery, but I think I got it there.
Anyway, this is BullseyeGather (née Bullseye). When clicked, red boxes appear under your mouse as you move around on the current page. Clicking in a box will send just the content from the innermost highlighted area to a Shortcut called “Bullseye Handler.” It’s a little quirky depending on page markup, but I’ll keep working out the kinks as I have time. Because you’re hand-selecting which content from the page is sent, you could run Gather with --no-readability, but I’ve had better luck selecting a slightly larger area than I need and letting Readability handle cutting out garbage. That works for me about 80% of the time.
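The hover-and-click selection can be sketched in vanilla JS roughly like this. It is a simplified illustration, not the BullseyeGather source: it outlines only the single element under the pointer rather than drawing nested boxes, and startPicker, highlight, and unhighlight are hypothetical names.

```javascript
// Draw or clear a red outline on an element.
function highlight(el) { el.style.outline = "2px solid red"; }
function unhighlight(el) { el.style.outline = ""; }

// Track the element under the pointer; on click, pass its HTML to `send`.
// Returns a function that cancels the picker.
function startPicker(doc, send) {
  let current = null;

  function onOver(event) {
    if (current) unhighlight(current);
    current = event.target; // innermost element under the pointer
    highlight(current);
  }

  function onClick(event) {
    // Keep the page's own links and handlers from firing.
    event.preventDefault();
    event.stopPropagation();
    stop();
    send(event.target.outerHTML);
  }

  function stop() {
    if (current) unhighlight(current);
    current = null;
    doc.removeEventListener("mouseover", onOver, true);
    doc.removeEventListener("click", onClick, true);
  }

  // Capture phase, so the picker sees events before the page does.
  doc.addEventListener("mouseover", onOver, true);
  doc.addEventListener("click", onClick, true);
  return stop;
}

// Only runs in a browser: send the clicked element to the Shortcut.
if (typeof document !== "undefined") {
  startPicker(document, (html) => {
    window.location.href =
      "shortcuts://run-shortcut?name=" +
      encodeURIComponent("Bullseye Handler") +
      "&input=text&text=" +
      encodeURIComponent(html);
  });
}
```

Passing the document and the send callback in as arguments keeps the selection logic separate from the Shortcuts hand-off, so the same picker could feed a different handler.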
Hope that illustrates some of the ways Gather is designed to fit into your workflows. I know not everybody spends as much time as I do building up a library of Markdown-based reference material, but I have proof that I’m not the only one, thanks to the number of email conversations I’ve had with curious users.
[1] I know, that was just like two days ago. I’m a wee bit manic right now. Obviously.