Archive for July, 2008

MySQL latin1 → utf8 (WordPress upgrade)

Tuesday, July 8th, 2008

Spurred on by mass hacking, I’ve updated my old (version 2.0.3) WordPress install to something a little newer.

I decided to err on the side of caution and upgrade a copy of the live DB first - and I’m glad I did. At first it looked as though the upgrade itself was munging lots of accents and symbols.

In reality, the symbols were being screwed up in the backup-and-restore process. Words such as “naïve”, £-signs and various typographic quotes were getting messed up by character-encoding issues.

After a bit of sleuthing, I found the output from mysqldump was already mangled. The problem was caused by a combination of default connection settings in a .my.cnf, and the fact that the old WordPress install was storing utf-8 characters inside a latin1 database.

My MySQL default settings are:


[client]
default-character-set=utf8

Because we’re pulling latin1 data over a utf-8 connection, MySQL converts the text from latin1 to utf-8 on the way out, and screws up a bunch of text that it thinks is latin1 but is in fact already utf-8.
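To see the kind of damage this causes, here’s a rough sketch that mimics the conversion with iconv rather than MySQL itself. It assumes iconv and od are installed, and the variable names are just for illustration.

```shell
# "naïve" as UTF-8 bytes: the ï is the two bytes C3 AF.
original=$(printf 'na\303\257ve')

# Misread those bytes as latin1 and convert them to UTF-8 again,
# which is roughly what the connection-level conversion does.
mangled=$(printf '%s' "$original" | iconv -f ISO-8859-1 -t UTF-8)

# Hex dumps make the difference visible.
original_hex=$(printf '%s' "$original" | od -An -tx1 | tr -d ' \n')
mangled_hex=$(printf '%s' "$mangled" | od -An -tx1 | tr -d ' \n')

echo "original: $original_hex"
echo "mangled:  $mangled_hex"
```

The two bytes of the ï turn into four (C3 83 C2 AF), which is what shows up on screen as “Ã¯”.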

The fix for the backup process was to override my default settings with latin1:


$ mysqldump --default-character-set=latin1 --opt -h db.example.com -u user -ppassword schema > db-backup-latin1-20080707.sql
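A quick sanity check on the dump is to look at the character set named in its SET NAMES header line. The dump_charset function below is a hypothetical helper, not something that ships with MySQL, and it assumes the dump contains the usual /*!40101 SET NAMES ... */ comment near the top.

```shell
# Hypothetical helper: print the character set named by the first
# SET NAMES statement found in a dump file.
dump_charset() {
    sed -n 's/.*SET NAMES \([A-Za-z0-9_]*\).*/\1/p' "$1" | head -n 1
}
```

Running dump_charset db-backup-latin1-20080707.sql should print latin1 if the override took effect, rather than the utf8 default from .my.cnf.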

That worked fine, and the next step was reloading it into the DB as utf-8. This required a little bit of string replacement using a command-line utility bundled with MySQL. If you’re going to do this yourself, watch what you type: I somehow typed in “lastin1” halfway through and lost an hour or so trying to figure out what went wrong. Anyhow, here’s the command line:


$ replace "CHARSET=latin1" "CHARSET=utf8" "SET NAMES latin1" "SET NAMES utf8" < db-backup-latin1-20080707.sql > db-backup-utf8-20080707.sql
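The replace utility isn’t part of every MySQL install (it was dropped in MySQL 8.0), so here’s a rough sed equivalent. The latin1_to_utf8 function name is made up for this sketch; it reads a dump on stdin and writes the rewritten dump to stdout. Like replace, it will also rewrite those strings if they happen to appear inside your data.

```shell
# Hypothetical helper: rewrite the charset declarations in a dump
# read from stdin, writing the result to stdout.
latin1_to_utf8() {
    sed -e 's/CHARSET=latin1/CHARSET=utf8/g' \
        -e 's/SET NAMES latin1/SET NAMES utf8/g'
}
```

Usage would be latin1_to_utf8 < db-backup-latin1-20080707.sql > db-backup-utf8-20080707.sql, and a follow-up grep -c latin1 on the output file is a cheap way to catch a typo in either pattern.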

You should now be able to blat / restore / overwrite your DB and check that all tables are in the appropriate character set, ready for a smooth WordPress upgrade.


$ mysql -h db.example.com -u user -ppassword schema < db-backup-utf8-20080707.sql

In search of an Adobe Reader alternative on Mac

Wednesday, July 2nd, 2008

I’m looking for an alternative to Adobe Reader on OS X after discovering its “Find” function can’t find jack.

As I’m still running Tiger (OS X 10.4), I find Preview a tad lacking in usability - if only the ‘Maximize’ button maximized both height and width, I’d be happy. But no, in typical Mac fashion, maximize makes the window full-height but doesn’t change the width. If you’re viewing a PDF using “Fit to Width”, this means the text size stays exactly the same… which is never what I want.

If only Foxit Reader worked on Mac, I’d be using that (it’s a damn nippy and reliable piece of software on Windows). But sadly it doesn’t - I even downloaded the Linux version, you know, just in case it worked on OS X (being a *nix ‘n’ all).

So, I’m currently taking a look at PDF-XChange Viewer which comes in a couple of different versions. They provide a wee comparison chart to help you choose which version of their PDF viewer to download - but guess what format the chart’s provided in. You guessed it: PDF.

That doesn’t give me much confidence…

Updated 7th July 2008

It didn’t take long to realise PDF-XChange doesn’t even work on OS X. I’d been working my way down the “multi-platform” list of PDF viewers at Wikipedia, assuming “multi-platform” meant Windows and OS X - but it seems someone thinks PDF-XChange is multi-platform simply because it runs on more than one version of Windows.

Finally, I installed Skim from SourceForge. This open-source PDF reader works pretty well. Searching within documents is user-friendly: by default, a pane on the left shows extracts matching the search term, and clicking one brings the document to the right page and highlights the search term with both a background-colour change and a red ring, making it easy to spot.