Exporting Safari Reading List to Pinboard and/or Markdown

I’ve heard from a few people that this script might be useful to them. I’ve also created an “app” version of it you can run, but it still takes a wee bit of command line setup. It’s pretty simple, though, and I’ll cover that in a minute.

What the script does is parse your Safari Reading List bookmarks directly from the PLIST file that Safari stores in ~/Library/Safari. Safari doesn’t technically need to be open for this to work, but it only syncs your latest bookmarks from other devices while it’s running. The app version of this script launches Safari automatically; if you run the script version from launchd, you may want to add a command that opens Safari first.
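To make the parsing step concrete, here is a minimal sketch of pulling the Reading List entries out of a bookmarks plist. It is written for Python 3, where plistlib can read binary plists directly, whereas the script below targets Python 2 and converts the file with plutil first. The function name reading_list_items and the sample data are made up for illustration; only the key names ('Children', 'Title', 'com.apple.ReadingList', 'URLString', 'URIDictionary') come from the script.

```python
import plistlib
from datetime import datetime


def reading_list_items(plist_bytes):
    """Return the Reading List entries from raw Bookmarks.plist bytes."""
    # plistlib.loads auto-detects binary and XML plists (Python 3.4+)
    plist = plistlib.loads(plist_bytes)
    for child in plist.get('Children', []):
        if child.get('Title') == 'com.apple.ReadingList':
            return child.get('Children', [])
    return []


# Stand-in data shaped like the keys the script reads; not a real Safari file
sample = {
    'Children': [
        {'Title': 'BookmarksBar', 'Children': []},
        {
            'Title': 'com.apple.ReadingList',
            'Children': [
                {
                    'URLString': 'https://example.com/article',
                    'URIDictionary': {'title': 'An example article'},
                    'ReadingList': {'DateAdded': datetime(2015, 1, 5, 12, 0, 0)},
                },
            ],
        },
    ],
}
sample_bytes = plistlib.dumps(sample, fmt=plistlib.FMT_BINARY)

items = reading_list_items(sample_bytes)
for item in items:
    print(item['URIDictionary']['title'], item['URLString'])
```

To try it against a real file, read the bytes from a copy of Bookmarks.plist and pass them in; working on a copy, as the script does, keeps Safari's own file untouched.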

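Each run only pulls bookmarks newer than the previous one: the Markdown export records the run time in an “Updated:” line at the top of the file, and the Pinboard export checks the date Pinboard reports for your most recent ‘.readinglist’ bookmarks. When posting to Pinboard, bookmarks are marked ‘toread’ and tagged ‘.readinglist’ (a private tag you can use for sorting and cleanup).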
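The “only newer bookmarks” check can be sketched roughly as follows for the Markdown case. This is a Python 3 illustration, not the script's exact code: the helper names last_run and newer_than are hypothetical, and it filters the whole list instead of stopping at the first older item the way the script does. The “Updated:” line format and the 2013-01-01 fallback date are taken from the script.

```python
import re
from datetime import datetime

# Matches the timestamp line the script writes at the top of the Markdown file
STAMP_RX = re.compile(r'(?m)^Updated: (\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) UTC')
STAMP_FORMAT = '%Y-%m-%d %H:%M:%S'
FALLBACK = datetime(2013, 1, 1)  # same fallback date the script uses


def last_run(markdown_text):
    """Return the time of the previous export, or the fallback if none is recorded."""
    match = STAMP_RX.search(markdown_text)
    if match is None:
        return FALLBACK
    return datetime.strptime(match.group(1), STAMP_FORMAT)


def newer_than(items, last):
    """Keep only the items added after the previous export."""
    return [item for item in items if item['ReadingList']['DateAdded'] > last]


# Made-up file content and items for illustration
existing = "Updated: 2015-01-05 09:30:00 UTC\n\n- [Old item](https://example.com/old)\n"
items = [
    {'URLString': 'https://example.com/new',
     'ReadingList': {'DateAdded': datetime(2015, 1, 6, 8, 0, 0)}},
    {'URLString': 'https://example.com/old',
     'ReadingList': {'DateAdded': datetime(2015, 1, 4, 8, 0, 0)}},
]

last = last_run(existing)
fresh = newer_than(items, last)
print([item['URLString'] for item in fresh])
```

A file with no “Updated:” line falls back to the 2013 date, so a first run exports everything in the Reading List.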

Setup

First, regardless of which version you run, you need to install the Python “pinboard” library, which can be done with:

easy_install pinboard

or pip install pinboard if you use pip. I don’t recall my setup having any trouble with permissions, but if you get an error, try sudo easy_install pinboard (or sudo pip install...). The script also imports pytz, so install that the same way if you don’t already have it.

Configuration

Then there are just a couple of config settings that are currently hardcoded in the script/workflow. To edit the script, just open it in a text editor; to edit the workflow, right click and Open In… Automator.

Look for the block of text containing:

DEFAULT_EXPORT_TYPE = 'pb'
PINBOARD_API_KEY = 'XXXXXXX:XXXXXXXXXXXXXXXXXXXX'
BOOKMARKS_MARKDOWN_FILE = '~/Dropbox/Reading List Bookmarks.markdown'
BOOKMARKS_PLIST = '~/Library/Safari/Bookmarks.plist'

DEFAULT_EXPORT_TYPE can be set to “pb” (Pinboard), “md” (Markdown list), or “all” (exports both).
PINBOARD_API_KEY is your full Pinboard API key, which you can find at https://pinboard.in/settings/password.
BOOKMARKS_MARKDOWN_FILE can be any path (including filename) to a Markdown file; the script creates the file if it doesn’t exist yet.
You shouldn’t need to modify BOOKMARKS_PLIST.
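If you want to sanity-check these settings before a run, something like the following sketch could help. It is Python 3 and entirely hypothetical: check_config is not part of the script, and the “username:TOKEN” shape of the API key is an assumption based on the colon in the placeholder value above.

```python
import os

# Which exporters each setting turns on; 'all' means both
VALID_EXPORT_TYPES = {'pb': ['pb'], 'md': ['md'], 'all': ['pb', 'md']}


def check_config(export_type, api_key, markdown_file):
    """Return a list of problems found (hypothetical helper, not in the script)."""
    problems = []
    if export_type not in VALID_EXPORT_TYPES:
        problems.append('DEFAULT_EXPORT_TYPE must be pb, md or all')
    targets = VALID_EXPORT_TYPES.get(export_type, [])

    if 'pb' in targets:
        # Assumption: keys look like username:TOKEN, and all-X means "not filled in"
        user, sep, token = api_key.partition(':')
        if not (user and sep and token) or set(token) == {'X'}:
            problems.append('PINBOARD_API_KEY should look like username:TOKEN')

    if 'md' in targets:
        # The file itself can be missing, but its folder has to exist
        folder = os.path.dirname(os.path.expanduser(markdown_file))
        if folder and not os.path.isdir(folder):
            problems.append('Folder for BOOKMARKS_MARKDOWN_FILE does not exist: ' + folder)

    return problems


for problem in check_config('pb', 'XXXXXXX:XXXXXXXXXXXXXXXXXXXX',
                            '~/Dropbox/Reading List Bookmarks.markdown'):
    print(problem)
```

Run as-is, this flags the placeholder API key; with real values filled in, an empty list means the settings look plausible.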

That’s it.

Download

Here’s the script version, and you can grab the Automator Workflow below. Note that this is the first time I’ve ever published a Python script and I’m still feeling my way around building CLIs with it. I’m open to your thoughtful criticism.

ReadingListCatcher.py

#!/usr/bin/python
# ReadingListCatcher
# - A script for exporting Safari Reading List items to Markdown and Pinboard
#   Brett Terpstra 2015
# Uses code from <https://gist.github.com/robmathers/5995026>
# Requires Python pinboard lib for Pinboard.in import:
#     `easy_install pinboard` or `pip install pinboard`
import plistlib
from shutil import copy
import subprocess
import os
from tempfile import gettempdir
import sys
import atexit
import re
import time
from datetime import date, datetime, timedelta
from os import path
import pytz

DEFAULT_EXPORT_TYPE = 'pb' # pb, md or all
PINBOARD_API_KEY = 'XXXXXXX:XXXXXXXXXXXXXXXXXXXX' # https://pinboard.in/settings/password
BOOKMARKS_MARKDOWN_FILE = '~/Dropbox/Reading List Bookmarks.markdown' # Markdown file if using md export
BOOKMARKS_PLIST = '~/Library/Safari/Bookmarks.plist' # Shouldn't need to modify

bookmarksFile = os.path.expanduser(BOOKMARKS_PLIST)
markdownFile = os.path.expanduser(BOOKMARKS_MARKDOWN_FILE)

# Make a copy of the bookmarks and convert it from a binary plist to text
tempDirectory = gettempdir()
copy(bookmarksFile, tempDirectory)
bookmarksFileCopy = os.path.join(tempDirectory, os.path.basename(bookmarksFile))

def removeTempFile():
    os.remove(bookmarksFileCopy)

atexit.register(removeTempFile) # Delete the temp file when the script finishes

class _readingList():
    def __init__(self, exportType):

        self.postedCount = 0
        self.exportType = exportType

        if self.exportType == 'pb':
            import pinboard
            self.pb = pinboard.Pinboard(PINBOARD_API_KEY)

        converted = subprocess.call(['plutil', '-convert', 'xml1', bookmarksFileCopy])

        if converted != 0:
            print 'Couldn\'t convert bookmarks plist to xml format'
            sys.exit(converted)

        plist = plistlib.readPlist(bookmarksFileCopy)
        # There should only be one Reading List item, so take the first one
        readingList = [item for item in plist['Children'] if 'Title' in item and item['Title'] == 'com.apple.ReadingList'][0]

        if self.exportType == 'pb':
            lastRLBookmark = self.pb.posts.recent(tag='.readinglist', count=1)
            last = lastRLBookmark['date']
        else:
            self.content = ''
            self.newcontent = ''
            # last = time.strptime((datetime.now() - timedelta(days = 1)).strftime('%c'))
            last = time.strptime("2013-01-01 00:00:00 UTC", '%Y-%m-%d %H:%M:%S UTC')

            if not os.path.exists(markdownFile):
                open(markdownFile, 'a').close()
            else:
                with open (markdownFile, 'r') as mdInput:
                    self.content = mdInput.read()
                    matchLast = re.search(r'(?m)^Updated: (\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2} UTC)', self.content)
                    if matchLast is not None:
                        last = time.strptime(matchLast.group(1), '%Y-%m-%d %H:%M:%S UTC')

            last = datetime(*last[:6])

            rx = re.compile("(?m)^Updated: (\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) UTC")
            self.content = re.sub(rx,'',self.content).strip()

        if 'Children' in readingList:
            cleanRx = re.compile("[\|\`\:_\*\n]")
            for item in readingList['Children']:
                if item['ReadingList']['DateAdded'] > last:
                    addtime = pytz.utc.localize(item['ReadingList']['DateAdded']).strftime('%c')
                    title = re.sub(cleanRx, ' ', item['URIDictionary']['title'].encode('utf8'))
                    title = re.sub(' +', ' ', title)
                    url = item['URLString']
                    description = ''

                    if 'PreviewText' in item['ReadingList']:
                        description = item['ReadingList']['PreviewText'].encode('utf8')
                        description = re.sub(cleanRx, ' ', description)
                        description = re.sub(' +', ' ', description)

                    if self.exportType == 'md':
                        self.itemToMarkdown(addtime, title, url, description)
                    else:
                        self.itemToPinboard(title, url, description)
                else:
                    break

        pluralized = 'bookmarks' if self.postedCount > 1 else 'bookmark'
        if self.exportType == 'pb':
            if self.postedCount > 0:
                sys.stdout.write('Added ' + str(self.postedCount) + ' new ' + pluralized + ' to Pinboard')
            else:
                sys.stdout.write('No new bookmarks found in Reading List')
        else:
            # Rewrite the file with a fresh timestamp, new items first
            with open(markdownFile, 'w') as mdHandle:
                mdHandle.write('Updated: ' + datetime.utcnow().strftime('%Y-%m-%d %H:%M:%S') + " UTC\n\n")
                mdHandle.write(self.newcontent + self.content)
            if self.postedCount > 0:
                sys.stdout.write('Added ' + str(self.postedCount) + ' new ' + pluralized + ' to ' + markdownFile)
            else:
                sys.stdout.write('No new bookmarks found in Reading List')

        sys.stdout.write("\n")

    def itemToMarkdown(self, addtime, title, url, description):
        self.newcontent += '- [' + title + '](' + url + ' "Added on ' + addtime + '")'
        if not description == '':
            self.newcontent += "\n\n    > " + description
        self.newcontent += "\n\n"
        self.postedCount += 1

    def itemToPinboard(self, title, url, description):
        suggestions = self.pb.posts.suggest(url=url)
        tags = suggestions[0]['popular']
        tags.append('.readinglist')

        self.pb.posts.add(url=url, description=title, \
                extended=description, tags=tags, shared=False, \
                toread=True)
        self.postedCount += 1

if __name__ == "__main__":
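    # sys.argv[0] is the script name, so only look at the real arguments
    requested = [arg for arg in sys.argv[1:] if re.match("^(md|pb|all)$", arg)]
    if not requested:
        # Nothing valid on the command line: fall back to the configured default
        requested = [DEFAULT_EXPORT_TYPE]

    exportTypes = []
    for arg in requested:
        # 'all' means both exporters; the class itself only understands 'pb' and 'md'
        for eType in (['pb', 'md'] if arg == 'all' else [arg]):
            if eType not in exportTypes:
                exportTypes.append(eType)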

    for eType in exportTypes:
        _readingList(eType)

ReadingListCatcher v1.0.0

Download ReadingListCatcher v1.0.0

A workflow and script for saving Safari Reading List bookmarks to Pinboard and/or Markdown

Published 01/06/15.

Updated 01/06/15. Changelog

bookmarking, markdown, pinboard, safari
