wp_mangler, scripted post modification for WordPress
Here’s a quick script I was playing around with this morning. I want to do some extensive string handling on all of the posts in my WordPress database. It made the most sense to me to pull each post’s content from the db, do the string mangling with a shell script and then replace the post_content field as I go. This script is destructive and any mistakes you introduce could kill your entire blog. Should you have a need for this script, back up first!
The script itself is really simple: it just pulls all of the post content and ids from the blog database, runs each one through your processing script and then sticks the result back in the post. The handler is empty here, it’s assumed that you’ll edit the function to call your own script to suit your needs.
wp_mangler:
#!/usr/bin/env ruby
# Requires 'sequel' and 'mysql' rubygems
require 'rubygems'
require 'sequel'
require 'cgi'
def e_sql(str)
str.to_s.gsub(/(?=[\\])/, "\\")
end
def process_post(input)
# do your thing
input
end
def process_db(db_name = '', user='', pass='', host = 'localhost', db_prefix = '')
db = Sequel.mysql(db_name, :user => user, :password => pass, :host => host)
db["select post_content, ID from wp_#{db_prefix}posts where post_status = 'publish' and post_type = 'post'"].each do |post|
content = post[:post_content]
content = e_sql(process_post(content)) # run process_post on post_content, escape result
update_ds = db["UPDATE wp_#{db_prefix}posts SET post_content = ? WHERE ID = ?",content,post[:ID]]
update_ds.update # replace the existing content with the new content
end
end
# process_db (dbname, username, password, host, dbprefix)
process_db('database','user','pass','localhost','tableprefix_')
You’ll need to edit the process_db
call in the last line with your mySQL database credentials and host info.
As an example, here’s a function you can put into the above script that will find all your inline links and turn them into reference links at the bottom of the post. Just for fun.
def e_sh(str)
str.to_s.gsub(/(?=[^a-zA-Z0-9_.\/\-\x7F-\xFF\n])/n, '\\').gsub(/\n/, "'\n'").sub(/^$/, "''")
end
def process_post(input)
links = input.scan(/\((https?:\/\/([^\)]+))\)/)
refs = input.scan(/^\[([^^][^\]]+)\]: (.*)$/)
lines = input.split("\n")
bottom = lines[0..-1].join("\n").gsub(/^\[([^^][^\]]+)\]: .*\n?/,'')
norepeat = []
norepeatlinks = []
output = []
refs.each {|ref|
name = ref[0]
next if norepeatlinks.include? ref[1]
while norepeat.include? name
if name =~ / ?[0-9]$/
name.next!
else
name = name + " 2"
end
end
output << {'orig' => ref[0], 'title' => name, 'link' => ref[1]}
norepeat.push name
norepeatlinks.push ref[1]
}
links.each {|url|
next if norepeatlinks.include? url[0]
domain = url[0].match(/https?:\/\/([^\/]+)/)
parts = domain[1].split('.')
name = case parts.length
when 1 then parts[0]
when 2 then parts[0]
else parts[1]
end
while norepeat.include? name
if name =~ / ?[0-9]$/
name.next!
else
name = name + " 2"
end
end
output << {'orig' => url[0], 'title' => name, 'link' => url[0] }
norepeat.push name
norepeatlinks.push url[0]
}
output = output.sort {|a,b| a['title'] <=> b['title']}
o = []
output.each_with_index { |x,i|
o.push("[#{x['title']}]: #{x['link']}")
bottom = bottom.gsub(/\((#{e_sh x['orig']}|#{e_sh x['link']})\)/,"[#{x['title']}]").gsub(/\[#{e_sh x['orig']}\]/,"[#{x['title']}]")
}
bottom + "\n#{o.join("\n")}\n"
end
The script requires the ‘sequel’ and ‘mysql’ Ruby gems to run. I was able to install these on a Dreamhost server with no problem, but I’d recommend just dumping your database to a local mysql server, running the script and testing the result, then uploading it. Assuming you have a local testing environment with a MySQL server…
Anyway, doubt it’s of much use to most of you, but figured I’d put it out there.