Nerdy Adventures of Wes: November 2009

I just submitted this to IT World. Let's see if they deem it worth publishing.

IPv6 has been an approved standard for over a decade now. But its adoption has suffered from the lack of a compelling reason to deploy it. Since the future scalability of the Internet depends on our adopting it soon, we should be on the lookout for so-called “killer applications” where its benefits over IPv4 can help us solve real-world problems today. That will help get IPv6 rolled out before we run out of IPv4 addresses and experience all the problems that will cause. It is similar to incentivizing the use of renewable energy before cheap fossil fuels run out so the transition is less painful for all involved. This article discusses one such killer app for IPv6: infrastructure-as-a-service cloud computing systems such as Amazon’s EC2.

Cloud computing solves many of the problems of maintaining and growing a complex server infrastructure. But it does so in a way that introduces a few new ones. One of the biggest challenges is IP addressing. IPv4, the underlying protocol powering the Internet as we know it today, as well as just about every other computer network we use on a daily basis, was not designed to accommodate the elastic nature of the cloud. Server instances need a way to find each other on an IP network, but it is not practical to assign every client account a static, publicly routable IPv4 subnet, especially as IPv4 addresses become scarcer. Depending on the cloud provider’s architecture, it is also sometimes advantageous not to use the public IPv4 address even when there is one available (due to performance and/or bandwidth cost considerations). To demonstrate how IPv6 can help with this, let’s start with an example scenario of how it works now on EC2.

Amazon’s Elastic Compute Cloud (EC2) service is undeniably the leader in infrastructure-as-a-service cloud computing options. When you launch an instance, it is dynamically assigned an IPv4 address in the 10.0.0.0/8 private subnet. If Amazon is actually using a subset of that class A block, they’re not saying. So system administrators have to assume the entire 10.0.0.0/8 subnet is off limits for anything else (such as VPN subnets). A routable, public IPv4 address is also assigned so you can access the instance from outside Amazon’s network in a 1:1 NAT configuration. You can optionally request that this come from a pool of static addresses reserved for your use only. Amazon calls these Elastic IPs and charges for reserving them (though not while they’re actively in use). It may be tempting to think assigning an Elastic IP to each of your instances is a good solution, but there’s a downside.

When you access an EC2 instance from another instance (for example, when your PHP-enabled web server wants to contact your MySQL database server), it’s a good idea to use the 10.0.0.0/8 address whenever possible. This is because throughput will be higher (it’s a more direct route) and Amazon doesn’t charge for the bandwidth you use if the two instances are in the same data center (Amazon calls these Availability Zones). So Elastic IPs don’t help here, because they only apply to the public-facing side. However, it’s very difficult to coordinate which instance has what private IP since they are dynamically assigned at boot. A robust cloud infrastructure should always be designed so that individual instances can be thrown away (or lost) with little to no downtime of the application(s) being served. This means instances will be coming up and down all the time, so your application configurations cannot have hard-coded DNS entries or IPv4 addresses in them.

There are third-party services that can help with this. RightScale, a cloud computing control panel service, recommends the use of DNS Made Easy. This service allows you to register dynamic DNS entries and then have your instances register their IPv4 addresses with those DNS entries when they boot up. This is a workable, if clumsy, solution. It gets trickier when you need to access these servers from outside Amazon’s network, since you cannot route packets to a 10.0.0.0/8 address over the public Internet. You cannot do it over a private VPN either, unless you’re willing to route the entire 10.0.0.0/8 subnet over the VPN, and that will almost certainly cause you problems down the road when you try to connect to other private networks using that subnet (and there are many).
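To make the VPN collision concrete, here is a small Python sketch using the standard `ipaddress` module. The list of VPN subnets is made up for illustration; only the 10.0.0.0/8 range comes from the EC2 behavior described above.

```python
import ipaddress

# The private range EC2 instances draw from, per the discussion above.
EC2_PRIVATE = ipaddress.ip_network("10.0.0.0/8")


def conflicts_with_ec2(subnets):
    """Return the subnets (as strings) that overlap the EC2 private range."""
    return [s for s in subnets if ipaddress.ip_network(s).overlaps(EC2_PRIVATE)]


# Hypothetical VPN subnets an administrator might already be using.
vpn_subnets = ["10.8.0.0/24", "172.16.5.0/24", "192.168.1.0/24"]

if __name__ == "__main__":
    for subnet in conflicts_with_ec2(vpn_subnets):
        print(f"{subnet} collides with {EC2_PRIVATE}")
```

Any subnet this flags would become ambiguous the moment the whole 10.0.0.0/8 block is routed over the VPN.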

Wouldn’t it be nice if we had a networking protocol that was designed to work in this kind of environment? And wouldn’t it be even better if that protocol were the one poised to run the entire Internet in a few years anyway? Oh hi, IPv6. How long were you standing there? This is awkward…

The main reason Amazon has to charge for reserved Elastic IP addresses is that we’re running out of IPv4 addresses. Current estimates say that we will start experiencing the effects of IPv4 address exhaustion in 2010, and will almost certainly be entirely out of addresses by 2012. IPv6, on the other hand, has more addresses than we could ever hope to use up. This is because IPv6 addresses are 128 bits long, as opposed to 32 bits for IPv4 addresses. 32 bits gives you around 4.3 billion addresses. 128 bits gets you about 3.4x10³⁸, or somewhere in the neighborhood of 4.5x10¹⁵ addresses for every observable star in the known universe. That’s a lot.
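If you want to check the arithmetic yourself, a few lines of Python will do it. The only input that is not plain math is the per-star figure, which is taken from the paragraph above; the script works backwards from it to show how many stars that figure assumes, rather than asserting a star count of its own.

```python
# Size of each address space, straight from the bit widths.
ipv4_total = 2 ** 32
ipv6_total = 2 ** 128

# Per-star figure quoted in the text (not an independent measurement).
per_star_claim = 4.5e15

# Number of stars that figure implies.
implied_stars = ipv6_total / per_star_claim

if __name__ == "__main__":
    print(f"IPv4 addresses: {ipv4_total:,}")
    print(f"IPv6 addresses: {ipv6_total:.3e}")
    print(f"Stars implied by the per-star figure: {implied_stars:.2e}")
```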

But the beauty of IPv6 isn’t just that there are more addresses; it’s also in how they are assigned. Current recommendations say that each subscriber to an IPv6 network should be assigned an entire /48 prefix. This means you have 16 bits to use for further subnetting and then 64 bits still left over for assigning addresses to your hosts. Every single one of the 65,536 subnets you’re given can hold the square of the entire existing 32-bit addressable IPv4 Internet. Whoa. And if you move your entire deployment to a new cloud hosting provider, only the 48-bit assigned prefix changes. Your subnets and host address assignments stay the same, and IPv6 was designed with this kind of renumbering in mind: routers can advertise the new prefix to your hosts, so once DNS is updated everything should just keep working.
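Here is a rough Python sketch of that bit layout, again using the standard `ipaddress` module. The prefixes come from 2001:db8::/32, the range reserved for documentation, and stand in for whatever a provider would actually hand out. The `renumber` helper is my own illustrative name, not part of any library.

```python
import ipaddress

# Hypothetical /48 assigned by a provider (documentation prefix).
site = ipaddress.IPv6Network("2001:db8:aaaa::/48")

# 16 bits of subnetting sit between the /48 prefix and the 64-bit host part.
subnet_count = 2 ** (64 - site.prefixlen)
hosts_per_subnet = 2 ** (128 - 64)


def renumber(address, old_prefix, new_prefix):
    """Move an address to a new prefix, keeping its subnet and host bits."""
    old_net = ipaddress.IPv6Network(old_prefix)
    new_net = ipaddress.IPv6Network(new_prefix)
    addr = ipaddress.IPv6Address(address)
    if old_net.prefixlen != new_net.prefixlen:
        raise ValueError("prefixes must be the same length")
    if addr not in old_net:
        raise ValueError(f"{addr} is not inside {old_net}")
    # Everything after the prefix (subnet ID plus host ID) is preserved.
    suffix = int(addr) & int(old_net.hostmask)
    return ipaddress.IPv6Address(int(new_net.network_address) | suffix)


if __name__ == "__main__":
    print(f"/64 subnets in a /48: {subnet_count:,}")
    print(f"Addresses per /64:    {hosts_per_subnet:,}")
    print(renumber("2001:db8:aaaa:10::5",
                   "2001:db8:aaaa::/48", "2001:db8:bbbb::/48"))
```

The last call models a move between providers: subnet 0x10 and host 5 stay put, and only the leading 48 bits change.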

Here’s how this could play out for EC2 if Amazon decided to roll out IPv6:

Each EC2 client gets a /48 prefix.
When you launch an instance, you can either allow dynamic autoconfiguration of the IPv6 address or you can specify the subnet and host address you want to use. This allows you to use the same static IP addresses for instances that are fulfilling the same role in your deployment. For example, if you are deploying a new database server instance, you can just assign its static IPv6 address to the new instance. (Ideally there would also be an option to assign / migrate IPv6 addresses after instances are running. This would allow you to sync up your new database to the old one before pointing your app servers at it, for example.)
Any of these addresses are publicly routable and accessible on the IPv6 Internet (but of course you can limit this with firewalling, presumably via an IPv6-capable version of Amazon’s security groups).
You can then use regular old static DNS because your IP addresses are under your control once again. Just assign AAAA records for the IPv6 addresses you’re using.
When administering your instances, just connect to their static IPv6 addresses.
Configure your software in the cloud to connect to the other instances via IPv6 using the AAAA DNS hostnames.
There would be no private IPs vs. public IPs. You use the same addresses and DNS hostnames everywhere, and Amazon would waive the bandwidth charges if you stay inside your 48-bit prefix. You would no longer have to jump through hoops to get your software to use 10.0.0.0/8 addresses sometimes and publicly-routable addresses at other times.
Amazon could still assign dynamic IPv4 addresses the same way they do now for legacy software that doesn’t yet support IPv6. This would also allow you to phase in IPv6.
Amazon could still provide Elastic IPs for servers that need public IPv4 accessibility. In a typical web app configuration, this would just be the front-end load balancers.
As the IPv6 Internet gets rolled out, your web app is already future-proofed because it has AAAA records in DNS and routable IPv6 addresses.
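
To give a feel for how the static addressing and DNS pieces of that list might fit together, here is a hedged Python sketch. Everything in it is hypothetical: the /48 comes from the documentation range, the role names and subnet numbers are invented, `example.com` is a placeholder domain, and none of it reflects an actual Amazon API.

```python
import ipaddress

# Hypothetical /48 for one EC2 account (documentation prefix).
PREFIX = ipaddress.IPv6Network("2001:db8:1234::/48")

# Invented addressing plan: role name -> (16-bit subnet ID, 64-bit host ID).
ROLES = {
    "db1": (0x10, 0x1),
    "web1": (0x20, 0x1),
    "web2": (0x20, 0x2),
}


def role_address(prefix, subnet_id, host_id):
    """Build a static address from the /48, a subnet ID and a host ID."""
    if prefix.prefixlen != 48:
        raise ValueError("this sketch assumes a /48 prefix")
    if not 0 <= subnet_id < 2 ** 16:
        raise ValueError("subnet ID must fit in 16 bits")
    if not 0 <= host_id < 2 ** 64:
        raise ValueError("host ID must fit in 64 bits")
    value = int(prefix.network_address) | (subnet_id << 64) | host_id
    return ipaddress.IPv6Address(value)


def aaaa_record(name, address, domain="example.com"):
    """Format a zone-file style AAAA line for a role."""
    return f"{name}.{domain}. IN AAAA {address}"


def stays_inside_prefix(src, dst, prefix=PREFIX):
    """True when both ends are in the account's /48 (the 'free' case above)."""
    return src in prefix and dst in prefix


if __name__ == "__main__":
    for name, (subnet_id, host_id) in ROLES.items():
        print(aaaa_record(name, role_address(PREFIX, subnet_id, host_id)))
```

A replacement database instance would simply be launched with the `db1` address, and the AAAA record never has to change.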

Cloud computing can benefit from IPv6 today, and since we need to move to IPv6 very soon anyway, it’s a win-win situation. Luckily Amazon has noticed this too (see the “Why don’t you use IPV6 addresses?” question on this page: http://docs.amazonwebservices.com/AWSEC2/2009-08-15/DeveloperGuide/index.html?IP_Information.html), and they say they are investigating it. Here’s hoping they have something to announce in the next few months. Time’s a-ticking.


Sunday, November 22, 2009

IPv6's Killer App (Finally): Cloud Computing
