Archive for August, 2008

Bloody hosting providers

Friday, August 29th, 2008

Choosing a hosting provider can be a difficult process, especially when you’re on a budget.

I’m heavily involved with ewelike, a product information and price-comparison site that we’re slowly improving as we integrate more price and information feeds.

Of course you can’t actually see ewelike right now, as our hosting provider’s been trying to commit hara-kiri for the last 3 days.

On-demand computing? Not quite.

Update: 2008-09-01 ewelike came back to life on Friday night. Aside from some recoverable table-corruption, I think we’ve emerged relatively unscathed - and possibly a little bit wiser.

Decisions on a shoe-string

We didn’t have the option of cloud-computing when we started, and we definitely didn’t have the venture capital to buy our own hardware. Our choice was between shared-hosting and private-servers.

We chose shared-hosting with dreamhost. Compared to service providers in England, dreamhost offered an enticing package with plenty of storage and bandwidth included. We’d got a good feel about them as we knew plenty of designers, developers and bloggers that used or recommended them.

Setup was simple, and we were pleasantly surprised by the freedom offered for a shared-hosting platform - they even allowed users to install custom-compiled copies of PHP to run on.

Sadly, we suffered a few reliability issues, and we never had any guarantee of performance. Once our database started to grow and performance dropped, we realized we needed to look around for someone else. (By this point, we’d had too many reliability issues to give dreamhost’s private servers a go.)

There were so many options for private-servers that we suffered a touch of analysis-paralysis. Should we rent or buy? Build our own server, or let someone else do it? Which flavor of Linux should we go for?

And then cloud-computing took off.

Cloud computing

After looking at various grid services, we narrowed our options to Amazon’s Elastic Compute Cloud and FlexiScale. Running a price-comparison site using Amazon’s platform just felt wrong - including Amazon in price-comparison is almost mandatory, but storing their competitors’ prices, product images and information within Amazon’s data-centers? That’s just weird.

The FlexiScale service is provided by xcalibre, a UK-based company I use to host the hexmen site, run by a team I’ve found supportive and responsive in the past. The most enticing things about FlexiScale were fast setup times, low startup costs, and simple methods for improving (virtual) server performance - with straightforward billing.

We’ve been running on FlexiScale for a few months without any problems. Until a few days ago.

Your risk or mine?

When Amazon’s EC2 service had an outage earlier this year, I asked a friend “would you rather deal with regular outages yourself, or put up with occasional problems completely out of your control?”

We’d been thinking about hosting something ourselves: our own PCs in our own homes, using our own broadband connections. The hardware would probably be the most reliable piece of the puzzle, and the broadband the least. In fact, we worked on the assumption that we’d probably be missing for an hour or so every week, with the occasional issue where we’d be gone for a few hours (and ranting down the phone at our ISP.)

The risks of home-hosting seemed too great. Much better to go with a professional outfit with monitoring and procedures in place to resolve the inevitable problems as quickly and painlessly as possible.

Misery

Ewelike’s been down 3 days now. Three days! I know we’ve still got a lot of work to do to build-up traffic and a loyal user-base, and that’s probably a saving grace at this point. I’d hate to be clock-watching thinking “that’s another thousand pounds lost… And another…”.
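For a rough sense of scale, the downtime figures above can be turned into availability percentages. This is a back-of-envelope sketch, not a measurement: it assumes exactly one hour lost per week for home-hosting (ignoring the occasional longer outage) and treats this three-day outage as the only downtime in a year. The `availability` helper is a name made up for illustration.

```shell
# availability DOWN_HOURS TOTAL_HOURS
# Prints the percentage of time the site was up, to two decimal places.
availability() {
  awk -v down="$1" -v total="$2" \
    'BEGIN { printf "%.2f\n", 100 * (1 - down / total) }'
}

# Home-hosting assumption: 1 hour down in every 168-hour week.
availability 1 168

# This outage: 72 hours down in an 8760-hour year.
availability 72 8760
```

On those assumptions, an hour a week adds up to 52 hours a year, so a single three-day outage (72 hours) already costs more uptime than the home-hosting estimate would have.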

I’m going to crawl off and figure out how to handle fail-over in future. Do we need to go so far as having the app ready to run on two different clouds? Would we be paranoid enough to have primary and secondary name-servers pointing to different services, just in case one goes down?

It’s time to stop wallowing and crack on with things - but first I’ll get my English on and make a nice cup of tea. That’ll fix it.

Getting git to work on OS X Tiger

Friday, August 22nd, 2008

If you haven’t heard of git yet, it’s quickly becoming the preferred version-control system for tons of open-source projects, including the twin suns of ruby on rails and prototype.

In fact, if you keep your eye on the github blog you’ll see a steady stream of well-known projects moving over to git, as diverse as the Blueprint CSS framework and the Haskell compiler.

Basically, if git was a stock-market commodity, analysts would be issuing strong buy recommendations left, right and centre. Git’s tipping-point has arrived.

How to play

If you’ve arrived here via search-engine, it’s probably because you’re trying to work around errors like Can’t locate Error.pm or Can’t locate SVN/Core.pm. Read on…

I already had macports installed, but if you haven’t, follow the macports install instructions - we’ll be using macports to download and install git as it’s supposed to be simpler than building from source.

If you’ve had macports installed a while, make sure it’s up to date:


$ sudo port selfupdate

We want to use git to connect to subversion repositories as well, so we’ll just check that’s possible:


$ port list variant:svn
git-core        @1.6.0  devel/git-core
subversion      @1.5.1  devel/subversion

I already had subversion installed but through trial-and-error found I needed to reinstall it with perl-bindings (git must be using perl scripts to talk to subversion…) Note: I’m using the -f flag to force it to reinstall, you might want to try without first, just to see what conflicts it brings up:


$ sudo port uninstall -f subversion-perlbindings
$ sudo port install -f subversion-perlbindings

Next, we install git:


# This may take a while to install with all its dependencies:
$ sudo port install git-core +svn

And finally, we check it works:


$ mkdir myproject; cd myproject;

# Check your PATH's set properly, this should output:
# fatal: Not a git repository
$ git svn

# If that's OK... clone a repository:
$ git svn clone http://example.com/svn/project/trunk

Can’t locate Error.pm

If you’re getting Can’t locate Error.pm or Can’t locate SVN/Core.pm you should immediately try:


$ PATH=/opt/local/bin:$PATH git svn

If that works, you know it’s just a PATH problem. It’s something to do with Apple’s perl install having slightly kooky ideas about where to store perl libraries.
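To avoid typing the PATH prefix every time, the directory can be prepended once per shell session. A minimal sketch, assuming a Bourne-style shell and the /opt/local/bin prefix used above; `prepend_path` is a hypothetical helper name, and where you put it (for example ~/.profile) depends on how your shell is set up.

```shell
# prepend_path DIR
# Puts DIR at the front of PATH unless it is already listed somewhere in PATH.
prepend_path() {
  case ":$PATH:" in
    *":$1:"*) ;;              # already present, leave PATH alone
    *) PATH="$1:$PATH" ;;
  esac
}

# Make the MacPorts binaries win over the system ones.
prepend_path /opt/local/bin
export PATH
```

Calling it twice is harmless, so it is safe to leave in a startup file that may be sourced more than once.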

If you’re still getting complaints about Error.pm, you need to install the CPAN module - and we’re going to use the /opt/local/bin instance of cpan, to make sure things go in the right place for us:


$ sudo /opt/local/bin/cpan -i Error

Cross your fingers, and try again:


$ PATH=/opt/local/bin:$PATH
$ git svn clone http://example.com/svn/project/trunk

If things are working, git will spend a while cloning the subversion repository by pulling out every single revision so you can have a complete set of revisions (including deltas), ready for you to refer to with lightning-speed regardless of internet connectivity. Which is nice.

PHP Session Management (grievance 2)

Wednesday, August 20th, 2008

Sometimes PHP surprises you with an easy-to-use feature, like sessions.

One call to @session_start(), and you have a magic global called $_SESSION to store data in, associated with the user via a cookie called PHPSESSID. PHP takes care of reading and writing the session data for you, and you think no more about it.

Simple.

Time passes, and you haven’t given sessions another thought. Your site’s evolving, using more and more AJAX, and seems to be performing ‘OK’. But, there’s a niggling doubt that something’s not quite right.

We realized something was wrong when we opened multiple search-results in separate tabs: we could see them loading one by one, slowly.

I guess we should have paid more attention to start with. Our previous web development background revolved around enterprise-class application servers. Sessions just worked, no concurrency worries. If you happened to run into a race-condition, you worked around it using threading and locking facilities provided by the implementation language. It never occurred to us that PHP would be so different.

PHP, the way we’re running it (via mod_php), couldn’t be further from the application-server model if it tried. By default, sessions are implemented using file-based storage, not held in shared memory ready for use by multiple threads.

Storing sessions in files means PHP has to take heavy-handed precautions against concurrent read/write access to the session - it locks the session file from session_start() until the session is written and closed, which by default means the rest of the request.
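One way to see the lock from the outside is to fire two requests that share a session cookie at the same time, and measure how long the pair takes. The sketch below is only a rough probe: `time_parallel_pair` is a made-up helper, and the URL and session ID in the usage comment are placeholders, not real endpoints.

```shell
# time_parallel_pair COMMAND [ARGS...]
# Runs the same command twice in parallel and prints the elapsed whole seconds.
time_parallel_pair() {
  start=$(date +%s)
  "$@" >/dev/null 2>&1 &
  pid1=$!
  "$@" >/dev/null 2>&1 &
  pid2=$!
  wait "$pid1" "$pid2"
  end=$(date +%s)
  echo $((end - start))
}

# Stand-in for a slow page: two 2-second sleeps run side by side.
time_parallel_pair sleep 2

# Hypothetical usage against a session-backed page (placeholder URL and cookie):
# time_parallel_pair curl -s -o /dev/null -b "PHPSESSID=abc123" http://example.com/slow.php
```

If a page that takes about two seconds on its own takes about four as a pair, the requests are probably queuing behind the session lock.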

The idea that session management would block user-requests, stopping concurrent requests from completing (think AJAX), never occurred to us. Fortunately the quick-fix solution is simple: call session_write_close() as soon as you’ve finished writing to the session. Depending on how you use sessions, you may find a number of actions only need read-access to the session, in which case you may want to open and close the session together: @session_start(); session_write_close();

That’s the quick fix, but there are plenty of other options to explore too. A quick code-audit could identify a ton of actions, controllers and pages that simply don’t need session access at all. Now you know PHP locks the session file, you probably want to avoid calling session_start() unless absolutely necessary.

Secondly, PHP allows you to choose what type of session-management you use. You can use memcached either on its own, or with a database backing-store. You could use a MySQL back-end, or roll your own session management registered using session_set_save_handler. It’s really up to you.

Perhaps that’s the problem right there. All the session-management hooks are there because the default session management sucks. The simplicity of using sessions lulls you into a false sense of security, but make no mistake - sessions need to be handled with care if you’ve any hope of running a high-volume website.

Are your sessions managed properly?

SVN log message encoding problem

Friday, August 8th, 2008

It’s good practice to put useful commentary in the log message whenever you commit code to a repository.

Today, I wrote a log message about centigrade and Fahrenheit conversions, using the proper degree symbol °, but this triggered an encoding problem, resulting in an error message:

macbook:~/projects/smarty ash$ svn ci plugins/function.temperature.php
svn: Commit failed (details follow):
svn: Can't convert string from native encoding to 'UTF-8':
svn: Tweak: altered temperature title attribute so it contains both farenheight AND
centigrade.  e.g. "88?\194?\176F or 17?\194?\176C".  The order is switched depending on user
preference.
--This line, and those below, will be ignored--

M    function.temperature.php

svn: Your commit message was left in a temporary file:
svn:    '/Users/ash/projects/propagandr/smarty/plugins/svn-commit.tmp'

It didn’t take long to realize that although my editor (vim) was configured to use UTF-8, the subversion command-line client had no way of knowing that.

One way of stopping this happening again would be to set my locale permanently so the character-type is UTF-8 (e.g. export LC_CTYPE=en_GB.UTF-8). But I wanted a short-term, one-off fix that avoided retyping the log message. (A little off-topic: subversion ignores the filenames listed in the saved log message, so you have to enter them on the command-line again.) The simple fix was:

ash$ LC_CTYPE=en_GB.UTF-8 svn ci -F plugins/svn-commit.tmp plugins/function.temperature.php

Worked like a charm.
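A quick check along these lines can catch the mismatch before a commit fails. It is a sketch only: `effective_ctype` and `is_utf8_locale` are hypothetical helper names, and the check just inspects the usual environment variables (LC_ALL, then LC_CTYPE, then LANG) rather than asking subversion what it will actually do.

```shell
# Print the locale name that governs character encoding,
# following the standard precedence: LC_ALL, then LC_CTYPE, then LANG.
effective_ctype() {
  echo "${LC_ALL:-${LC_CTYPE:-${LANG:-C}}}"
}

# is_utf8_locale NAME
# Succeeds if the locale name mentions a UTF-8 codeset (e.g. en_GB.UTF-8).
is_utf8_locale() {
  case "$1" in
    *[Uu][Tt][Ff]-8*|*[Uu][Tt][Ff]8*) return 0 ;;
    *) return 1 ;;
  esac
}

if is_utf8_locale "$(effective_ctype)"; then
  echo "locale looks like UTF-8: $(effective_ctype)"
else
  echo "locale is not UTF-8: $(effective_ctype)"
fi
```

If it reports a non-UTF-8 locale while your editor is saving UTF-8, a log message with a ° in it is likely to hit the same error.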