For a similar discussion of Amazon Web Services, consult Getting Started with the Cloud: Amazon Web Services.
A friend recently reported a conversation he had with one of those wide-eyed, gee-golly developers who's half Techie and half Moonie. When asked what he was working on, the speaker came back with, "cloud cloud cloud cloud cloud cloud," and my friend said, "but, what if ...," to which the speaker replied, "cloud cloud cloud cloud cloud," to which my friend said, "but that won't work because...," to which the developer responded, "cloud cloud cloud cloud cloud" — and so it went. Many use "cloud" as a synonym for "good." Cloud architectures indeed have a lot going for them, but they're not a panacea, and you need to know what you're doing to jump to the cloud successfully.
This article discusses general cloud-related issues and looks specifically at the Amazon and Google cloud architectures. Subsequent articles will be more practical, delving deeply into code that comprises cloud-based applications, but let's start with some background.
What Is The Cloud?
First, what exactly does "cloud computing" mean? The term "cloud" dates to the early days of the Internet, back before domains existed (yes, there was such a time). An email address was essentially a route specified in what was called "bang notation." To send an email, you needed to know the name of every machine between you and the recipient. Here's a particularly nasty example that I pulled out of an old newsgroup post:
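It had this general shape (the "dog" and "msb" endpoints are from the post; the intermediate hubs shown here are stand-ins, though ucbvax, ihnp4, and seismo were real UUCP hubs of the era):

    dog!ucbvax!ihnp4!seismo!msb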
The "dog" and "msb" are the sending and receiving machines. The rest is the route from one machine to the other. The routes didn't have to be fully specified — most email systems knew about a handful of major hubs, so a minimum-length address just specified a route from that hub to you — but there was zero flexibility.
Things changed with the introduction of domains. Instead of an explicit route, you sent an email to a gateway machine, and the email's recipient got the mail from a different gateway machine. The servers through which the mail passed on the way from one gateway to the other were anonymous, and the network topology was unknown. The word "cloud" was coined to describe that amorphous network. You didn't know what went on inside the cloud (and believe me, you didn't want to). As long as the mail ended up at the right place, everything was copacetic.
So, here's my rather strict definition of "cloud": A network of computers in an unknown topology, arranged in such a way that you don't need to know anything about this network except how to talk to a machine at the edge.
A "cloud application" is then an application deployed to the cloud itself, not to a specific machine. The application could be running on one or more machines that may or may not be physically collocated. Its data store could also be distributed, and may not be on the same machine as the application.
Web 2.0 applications are typically implemented using a traditional client-server architecture (one server hosted on an ISP talking to multiple browser-based clients). However, there's no reason why you can't have Web 2.0 cloud applications. In fact, that's most likely the way that all applications will work five years from now. It's useful, however, to separate the concepts in your head. Most current Web 2.0 applications are not cloud based.
What Difference Does It Make?
So, why would you want a cloud application instead of a simple client/server arrangement? Consider the following ping times:
> ping www.google.cn
PING www.google.cn (126.96.36.199): 56 data bytes
64 bytes from 188.8.131.52: icmp_seq=0 ttl=239 time=273.340 ms
64 bytes from 184.108.40.206: icmp_seq=1 ttl=239 time=478.394 ms
64 bytes from 220.127.116.11: icmp_seq=2 ttl=239 time=421.920 ms
64 bytes from 18.104.22.168: icmp_seq=3 ttl=239 time=343.003 ms
64 bytes from 22.214.171.124: icmp_seq=4 ttl=239 time=263.843 ms
64 bytes from 126.96.36.199: icmp_seq=5 ttl=239 time=482.231 ms
...
The round-trip time between my desk in Berkeley, California, and one of Google's servers in Hong Kong ranges from a bit over a quarter to almost half a second, and we're sending only 56 bytes. There's essentially no server overhead, but we're hostage to both distance and the speed of the routers through which the data is passing. (A traceroute reports only 17 hops, so the latency is probably all distance.) The picture is different when the server is close by. Here are the results from Berkeley to San Jose (12 hops):
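To see how much of that quarter-second is raw physics, here's a back-of-envelope check. Both numbers are rough assumptions: Berkeley to Hong Kong is on the order of 11,000 km, and light in fiber travels at roughly two-thirds of c:

```python
# Back-of-envelope minimum round-trip time, Berkeley <-> Hong Kong.
# Both inputs are approximations, not measurements.
distance_km = 11_000         # one-way great-circle distance, roughly
fiber_speed_km_s = 200_000   # light in glass: ~2/3 of c

min_rtt_ms = 2 * distance_km / fiber_speed_km_s * 1000
print(round(min_rtt_ms))     # prints 110
```

In other words, propagation delay alone puts a floor of roughly 110 ms under that round trip before a single router touches the packet; no amount of clever programming on either end can get under it.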
> ping google.com
PING google.com (188.8.131.52): 56 data bytes
64 bytes from 184.108.40.206: icmp_seq=0 ttl=54 time=19.815 ms
64 bytes from 220.127.116.11: icmp_seq=1 ttl=54 time=20.466 ms
64 bytes from 18.104.22.168: icmp_seq=2 ttl=54 time=35.547 ms
...
A cloud application (or at least the instance of the application that we're talking to) would ideally be running on the machine with the best access time. That's the main advantage — the cloud can effectively reconfigure itself to take care of pesky details like network latency.
However, there's no way to guarantee that this reconfiguration will actually happen, which brings us to the dark underbelly of a cloud app: We need to program for the worst case.
Imagine a cloud app that's doing some kind of word completion. Every time you type a character, it's sent off to a server, which finds words prefixed with whatever you've typed. The server sends back a list of possible matches, and your program displays these. Most of the cloud books, in fact, demonstrate this sort of thing in exactly that way — the local application talks to the server with literally every keystroke. Given the look-up times, etc., your user isn't going to be particularly happy with your worst-case response time. You can, however, rethink your strategy. When the first few characters are typed, the server could send you a large, perhaps exhaustive, list of every word that could possibly start with those characters. Thereafter, the application can use that list to update its display rather than going back to the server with every key press. By eliminating the redundant network queries, we make the application much more responsive.
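The prefetch-then-filter strategy can be sketched in a few lines. This is only an illustration, not anybody's shipping code; the fetch_matches callback standing in for the server round trip is hypothetical:

```python
# Sketch of the word-completion strategy described above: hit the
# server once, after the first couple of characters, then filter the
# returned candidate list locally on every later keystroke.

class Completer:
    PREFETCH_AT = 2  # number of characters typed before the one server query

    def __init__(self, fetch_matches):
        self.fetch_matches = fetch_matches  # hypothetical server round trip
        self.prefix = ""
        self.candidates = None              # filled in by the single fetch

    def on_keystroke(self, ch):
        """Return the current completion list for one more typed character."""
        self.prefix += ch
        if self.candidates is None:
            if len(self.prefix) < self.PREFETCH_AT:
                return []                   # too early to bother the server
            # The one and only network query: everything that could
            # possibly start with this prefix.
            self.candidates = self.fetch_matches(self.prefix)
        # Every subsequent keystroke is a purely local filter -- no latency.
        return [w for w in self.candidates if w.startswith(self.prefix)]
```

With a toy dictionary, typing "c", then "l", then "o" costs exactly one network query (at "cl"); the "o" keystroke narrows the cached list locally. The naive design would have made three round trips.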
On the plus side, the cloud is amorphous and can indeed reconfigure itself based on observed load. If Google notices that there is a lot of traffic between Berkeley and Hong Kong, it may well replicate the Hong Kong server somewhere in California, and the latency would suddenly improve. The same applies to your cloud application: It will, ideally, be running on several geographically distributed servers, with the topology scaling to accommodate actual requests. In other words, the size of the network (and your cloud-services provider) matters. For cloud services to be effective, the provider has to be large. If you deploy to the Google or Amazon infrastructure, you're effectively leveraging the flexibility inherent in a very large network. By my rather strict definition, an application running on a single server, whether it's an ISP or so-called cloud host, isn't a cloud application at all because it loses the scalability and flexible topology of a true cloud infrastructure.