When I was working on Live Mesh at Microsoft, I had the good fortune to meet James Hamilton. James is full of good ideas, many of which are captured in his paper “On Designing and Deploying Internet-Scale Services.” There is a lot of wisdom in those pages (Greg Linden had some thoughts on it), but I’d like to focus on one snippet in particular:
Design the system to never need human interaction, but understand that rare events will occur where combined failures or unanticipated failures require human interaction.
Continue reading ‘Handling Human Error In the Datacenter’
This is the second article in my three-part history about building Audiogalaxy.com. You should probably read the first one first.
I came back from my Christmas break feeling less burnt out. I focused on designing a backend that could handle 100,000 simultaneous Satellites and then started building it. To free me up from working on the client, Michael bought a copy of the Stevens networking book and started working on a C version of the Satellite core. And David hired Kennon Ballou to help build the next-generation web interface.
The new backend and client went live in April, using my humble website from V1. Traffic started growing steadily, and by the end of May, we had about 3,000 clients connected at peak times. Sometime around the end of July, there was a Napster injunction scare, which pushed us over the 8,000 mark. We released version 0.6 of the client and David’s beautiful new website in September. At that point, our peak load started increasing by thousands of users every week.
Continue reading ‘Users With a Tattoo of Your Logo? Check.’