Everything I've ever written about Systems

December 28, 2007
Where's the best place to host DNS?

Does anyone have experience with dedicated DNS hosting providers? I'm looking for a host with servers in multiple countries and an access panel that gives me full control. Shouldn't be too hard, but after 15 minutes of googling I haven't found anyone I've heard of before.

December 14, 2007
Simple DB: the final piece of the puzzle falls into place

Amazon just announced "SimpleDB", which sounds a lot like the rumored "SDS" or "Simple Database Service" that we've all been waiting for.

This is huge: the single biggest thing stopping you from running a webapp on EC2 is the fact that there's nowhere safe for your database to live. EC2 is a virtual hosting service, so if a machine crashes and is rebooted, any data written to the hard drive simply disappears. Not good. As a result, EC2 was framed as a great solution for back-end processing (think transcoding videos for YouTube), but not a great fit for an entire web application.

Solutions to this problem (including continually backing up your database to S3) were never very convincing. But it was always clear that SOME major initiative to solve it was planned.

Now we know. This isn't a vanilla MySQL clustering service: it's something a little weirder (it's conceptually similar to a database, but lacks many of the features of a database, and works somewhat differently). As a result, you'll have to build your app from the ground up as an Amazon app: this isn't a drop-in replacement for MySQL Cluster.

But the benefits are potentially huge. Imagine you're building a Facebook application. You could use SimpleDB, EC2, and S3 to provide the backend, and pay very little in infrastructure costs until you actually started getting real traction. Your system would transparently scale (simply add more EC2 nodes as web/app servers as your server load increases), and you would never, ever have to worry about the huge P.I.T.A. (pain in the ass) that is setting up a database cluster, designing schemas for federating data across multiple databases, etc.

There's never been a better time to be a software entrepreneur. Amazon has once again lowered the upfront cost of starting up a new web business, and at the same time dramatically increased the number of use cases that their other services can be used for.

Coverage from TechCrunch and GigaOM. Marcelo Calbucci frames the service as a "directory service rather than a database service".

November 27, 2007
Using a CDN with S3

I've started shopping for a content delivery network for SlideShare. It's a market with pretty opaque pricing: if you're making the jump to using a CDN for the first time it's not easy to get a real sense of what monthly costs will be.

Conceptually, integration between a CDN and Amazon S3 is pretty straightforward. Here are the basic steps:
1) Dedicate a subdomain (say static.slideshare.net) to serving up all the content you want to serve via the delivery network.
2) Make a CNAME entry in your DNS to tell traffic going to that subdomain to go to your CDN instead.
3) Tell the CDN which bucket on Amazon S3 you're saving your static content in.

The CDN receives the request for content at a geographically local server (so Europeans hit a node in Europe, Asians hit a node in Asia, etc). The node will first look in its own (in-memory) cache. If it doesn't find the requested content, it will fetch it from S3 and save it so that it has it cached for next time. How long the CDN caches it for is typically configurable, and APIs are typically provided that allow you to flush the cache.
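To make that lookup flow concrete, here's a rough sketch of what a single edge node might be doing. This is a toy model, not any vendor's actual implementation: the EdgeNode class, its method names, and the idea of passing in a fetch function and a clock are all my own assumptions for illustration. The "origin" here is just a Python dict standing in for an S3 bucket.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class EdgeNode:
    """Toy model of one CDN edge node doing pull-through caching (hypothetical)."""

    def __init__(self,
                 fetch_from_origin: Callable[[str], bytes],
                 ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch_from_origin      # stands in for a GET against the S3 bucket
        self._ttl = ttl_seconds              # the configurable cache lifetime
        self._clock = clock                  # injectable so the TTL can be tested
        self._cache: Dict[str, Tuple[float, bytes]] = {}  # path -> (expiry, body)
        self.origin_hits = 0                 # how many times we went back to origin

    def get(self, path: str) -> bytes:
        now = self._clock()
        entry = self._cache.get(path)
        if entry is not None and entry[0] > now:
            return entry[1]                  # cache hit: serve from memory
        body = self._fetch(path)             # cache miss or expired: go to origin
        self.origin_hits += 1
        self._cache[path] = (now + self._ttl, body)
        return body

    def flush(self, path: Optional[str] = None) -> None:
        """Drop one path from the cache, or everything if no path is given."""
        if path is None:
            self._cache.clear()
        else:
            self._cache.pop(path, None)


if __name__ == "__main__":
    bucket = {"/logo.png": b"fake image bytes"}    # pretend S3 bucket
    node = EdgeNode(lambda p: bucket[p], ttl_seconds=3600)
    node.get("/logo.png")                          # first request goes to origin
    node.get("/logo.png")                          # second one is served from cache
    print("origin fetches:", node.origin_hits)
```

The point of the sketch is that only the first request for a file (or the first one after the cache lifetime runs out, or after a flush) reaches S3. Everything else is served by the node closest to the user.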

In my investigations so far the following companies have turned up as potential vendors:
Akamai (the biggest company in the space)
Limelight is another big contender (famously used by YouTube and other Web 2.0 video companies)
Panther Express is a smaller contender. I've had the most conversations with these guys.
Level 3 is interesting in that they've recently announced that they'll be selling CDN bandwidth at normal bandwidth rates. I haven't talked to them yet, but I probably should. ;->
CDNetworks
Internap
Peer1
EdgeCast
If anyone has any other recommendations for vendors I should check out, feel free to reply on this post! Frankly I really wish Amazon would just provide this as a service on top of S3: that way we wouldn't have to change any of our code at all! Unfortunately, it doesn't seem like this is going to happen in the near future.

September 07, 2007
Speaking at the StartUp project

I'll be speaking at the StartUp project at Stanford next Wednesday. I'll be talking about the economics of Amazon S3, and how buying bandwidth for your website is like picking a cell phone plan. Should be fun!

April 22, 2007
Rabble's ActiveRecord talk at SVRC Rocked!

Rabble's presentation on ActiveRecord at the Silicon Valley Ruby Conference was the clearest and most coherent explanation of ActiveRecord I've seen to date.

Check it out!

Rabble was previously lead developer at Odeo, and is now part of the Yahoo skunkworks team (err.. "semi-autonomous business unit") called Yahoo BrickHouse.

[Update: more cool Ruby presentations have been archived by Ruby Inside. Sweet!]

March 22, 2007
Using virtualization to automate deployment: is it a good idea or not?

As the number of servers needed to run SlideShare increases, we are spending more and more of our time simply deploying our software. Each new box has to have a lot of software installed, configured, and tested before it can be hooked up. Scripting common tasks makes things go faster, but doesn’t resolve the fundamental problem, which is that there’s never any way to prove that Server A has the exact same configuration as Server B. This makes troubleshooting tricky, obviously.

One path we’re starting to consider is virtualization. I haven’t heard of this as a common use for virtualization. Typically, people seem to use software like Xen or VMware to run multiple virtual servers on one physical server, so they can get more use out of existing hardware. We don’t have that problem: all our boxes are in the red! But we would like to be able to roll out new servers reliably, at the push of a button, the way you can make a new instance of an image on Amazon EC2 just by typing a command into your command line.

The way I look at it, the configuration of a machine is valuable intellectual property, and it needs to be captured so that it can be reproduced whenever we need it. Of course there’s a performance penalty: something like 5 to 10% of CPU will be consumed by the virtualization software, meaning that overall we’ll need more boxes than we would otherwise. But we’ll be able to set up or rebuild boxes faster, and right now that seems more important to me.
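To get a feel for what that penalty means in hardware, here's a back-of-the-envelope sketch. It assumes (simplistically) that we're purely CPU-bound and that the overhead is a flat percentage on every box. The function name and the 20-box fleet are made up for illustration, not our real numbers.

```python
def boxes_needed(bare_metal_boxes: int, overhead_percent: int) -> int:
    """Boxes required once virtualization eats overhead_percent of each CPU.

    Each virtualized box delivers (100 - overhead_percent)% of a bare-metal
    box, so we need bare_metal_boxes * 100 / (100 - overhead_percent),
    rounded up to a whole machine.
    """
    if bare_metal_boxes < 0 or not 0 <= overhead_percent < 100:
        raise ValueError("need a non-negative box count and 0 <= overhead < 100")
    usable = 100 - overhead_percent
    # Integer ceiling division avoids floating point rounding surprises.
    return -(-bare_metal_boxes * 100 // usable)


if __name__ == "__main__":
    fleet = 20  # hypothetical fleet size
    for pct in (5, 10):
        print(f"{pct}% overhead: {fleet} boxes become {boxes_needed(fleet, pct)}")
```

For a hypothetical 20-box fleet that works out to 22 boxes at 5% overhead and 23 at 10%, so the cost is a couple of extra machines in exchange for repeatable builds.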

Thoughts? Is this a good idea or not? Has anyone used virtualization in this way? Any recommendations on which software to try first? As always, reply in the comments field below.

Also: a special bonus slideshow on virtualization for your reading pleasure!




my company: www.uzanto.com
email: jon at uzanto.com