Scalability and Cloud-Service Architecture
A significant advantage to a cloud infrastructure is automatic scalability, but here's one place where the basic architecture matters. Amazon's Elastic Compute Cloud (EC2), like most cloud providers, rents you a "virtual machine" to host your application. Your VM may or may not share a physical machine with other apps, and it has an unknown number of physical processors attached to it. At its heart, though, your EC2 VM is just a Linux (or Windows) box, and you can configure it however you want. You typically pay only for the time that the VM is actually busy doing something, which is great for a software startup that's effectively getting rack space for free. As the load increases, so do your expenses (but hopefully, so does your revenue). You use your VM pretty much the same way you'd use a shell login to a shared server at an ISP, deploying with FTP, etc.
The inherent flexibility of a hosted-VM approach is particularly important the day that your application gets reviewed in The New York Times, and suddenly you have 1000 hits/second. Your ISP-hosted shared server would just crash at this point. An EC2 VM will scale, however, running on a dedicated machine if necessary, with cores added as necessary. Amazon will automatically increase the "umph" of your VM — giving you more machine cycles on the physical machine, for example, or assigning more cores to your application. Of course, you'll pay for this extra umph.
The main downside of this approach is that there is an effective upper limit on the scalability. Adding cores can get you only so far, and there's a diminishing return on the number of cores. Eventually, you're using everything that the machine can give you. What if that's still not enough to handle the volume? In theory, your app can be placed on several machines at this juncture, with Amazon handling the load balancing (you can run several EC2 VMs in separate physical locations that you specify), but that scaling doesn't happen automatically, and the app has to be written with scaling in mind.
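To put rough numbers on that diminishing return, here is a small sketch based on Amdahl's law. The discussion above doesn't name that law, but it is the standard way to estimate the effect. The 90 percent parallel fraction is purely an assumption for illustration (your application's number will differ), and the class and method names are made up for this example.

```java
class AmdahlSketch {
    /**
     * Amdahl's law: speedup = 1 / ((1 - p) + p / n), where p is the fraction
     * of the work that can run in parallel and n is the number of cores.
     */
    static double speedup(double parallelFraction, int cores) {
        return 1.0 / ((1.0 - parallelFraction) + parallelFraction / cores);
    }

    public static void main(String[] args) {
        double p = 0.90; // assumed parallel fraction, for illustration only
        for (int cores : new int[] {1, 2, 4, 8, 16, 64}) {
            System.out.printf("%2d cores -> %.2fx%n", cores, speedup(p, cores));
        }
    }
}
```

Under that assumption, the speedup can never reach 10x no matter how many cores you add, because the serial 10 percent of the work always runs at single-core speed.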
If you're really planning on scaling, you have to do exactly the same amount of programming work that you'd do if you were running the application on multiple machines in your own data center. This is a nontrivial amount of work. So, Amazon and its brethren give you a lot of flexibility in configuration. You can put anything you want on your virtual Linux box, write your app in any language, augment it with custom processes; go crazy! The downside is that you have to worry about administration and scaling, and that can add to the complexity (and cost) of the application very quickly.
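What "written with scaling in mind" means varies by application, but one common requirement (an assumption for this sketch, not something specific to Amazon) is that no instance keeps per-user state in its own memory, so any machine can service any request. The following is a minimal sketch of that idea. The SessionStore interface, the in-memory stand-in for an external database or cache, and all of the names are hypothetical.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class StatelessHandlerSketch {
    /** Stand-in for an external store (database, cache) shared by every instance. */
    interface SessionStore {
        int incrementAndGet(String sessionId);
    }

    /** In-memory implementation, used only so that this sketch runs on its own. */
    static class InMemoryStore implements SessionStore {
        private final Map<String, Integer> counts = new ConcurrentHashMap<>();

        @Override
        public int incrementAndGet(String sessionId) {
            // merge() atomically adds 1 to the stored count and returns the new value.
            return counts.merge(sessionId, 1, Integer::sum);
        }
    }

    /** One application instance; it holds no per-user state of its own. */
    static class AppInstance {
        private final SessionStore store;

        AppInstance(SessionStore store) {
            this.store = store;
        }

        int handleRequest(String sessionId) {
            return store.incrementAndGet(sessionId);
        }
    }

    /** Sends three requests for one session, alternating between two instances. */
    static int demo() {
        SessionStore shared = new InMemoryStore();
        AppInstance a = new AppInstance(shared);
        AppInstance b = new AppInstance(shared);
        a.handleRequest("user-1");
        b.handleRequest("user-1");
        return a.handleRequest("user-1");
    }

    public static void main(String[] args) {
        System.out.println("Visits recorded after 3 requests: " + demo());
    }
}
```

Because the count lives in the shared store rather than in either instance, it doesn't matter which instance a load balancer picks for a given request. In a real deployment the store would be a database or cache reachable from every machine, which is where much of the extra programming work comes from.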
Fortunately, there is another approach: the one used by the Google "App Engine." Google doesn't rent you a VM at all; you have no control of the operating system and can't install arbitrary applications on "your" machine. Instead, you rent time on a virtual application server (think Tomcat). You write your application in an approved language (Python or Java) and you deploy your application directly to Google's app server, not to the operating system. For example, if you're using Java, your application is a standard Java "web app" packaged into a WAR file and deployed to Google exactly the same way that you'd deploy to a Tomcat instance, by uploading the WAR. Google handles the Tomcat part. (It's not actually using Tomcat, but I usually test locally using Tomcat and haven't found any problems. Google's own development tools use Jetty.) Part of deploying the app is telling Google what URL to use to access it. You can use either a Google-provided URL (something.appspot.com) or a subdomain of your own domain.
On the downside (to paraphrase Henry Ford): Your app can come in any color, provided that it's black. Your choice of implementation language is Java (my own predilections preclude writing an enterprise application in Python). You have to structure your application as a Java web application, built around servlets, and you have to access your data using JDO or JPA (there's no JDBC support).
Google's working on adding SQL (due to be released within the next few months), but it's not there yet, and even then it will be available only to "App Engine For Business" customers. Unfortunately, Google's pricing model for the "For Business" customers effectively makes SQL inaccessible to a standard web application meant as a public Software-as-a-Service (SaaS) app. Google charges $8 per year per user for a "For Business" application, which makes sense if you're implementing your HR application on Google instead of running it in your own data center. But a per-user fee is nonsensical if you're writing a SaaS app to expose to the entire Internet. The standard (not "For Business") App Engine charges are based on CPU and data usage, not the per-user model. Google has made similarly stupid (a technical term we analysts use) choices on other fronts as well. For example, a standard App Engine application can use SSL only if you deploy the page to a Google URL (MyDomain.appspot.com), which could be disconcerting to one of your users if they look at the address bar. Similarly, though you can host subdomains on the Google App Engine, you cannot host your main domain on Google. You have to get an account with a standard ISP, and then redirect access to a Google-hosted subdomain. (For what it's worth, www.foo.com is a subdomain, so it can be hosted on Google. It's the foo.com, without the www, that's the problem.)
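Going back to the per-user fee for a moment, a little arithmetic shows why it fits an in-house application but not a public one. This is only a sketch: the $8 figure is the one quoted above, while both user counts and all of the names are hypothetical.

```java
class PerUserPricingSketch {
    // Annual per-user fee quoted in the text for "App Engine For Business."
    static final long FEE_PER_USER_PER_YEAR_USD = 8;

    /** Yearly cost under a flat per-user fee. */
    static long annualCostUsd(long users) {
        return users * FEE_PER_USER_PER_YEAR_USD;
    }

    public static void main(String[] args) {
        long internalHrAppUsers = 500;     // hypothetical company headcount
        long publicSaasUsers = 1_000_000;  // hypothetical public user base
        System.out.println("Internal app: $" + annualCostUsd(internalHrAppUsers) + " per year");
        System.out.println("Public SaaS:  $" + annualCostUsd(publicSaasUsers) + " per year");
    }
}
```

With those made-up numbers, the internal application costs $4,000 a year, while the public one costs $8 million a year whether or not those users ever generate any revenue.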
Amazon EC2, on the other hand, gives you several database choices: You can run an RDBMS on your VM, you can use Amazon's Relational Database Service (RDS), or you can use Amazon's SimpleDB service if you're doing something very simple. You can easily host your domain on an EC2 instance, and you can easily access that domain using SSL (because you're just accessing your own instance of Apache, running on the VM).
So, to sum up the differences before moving on to other issues: Google provides a better programming environment, with easy deployment and very good scalability, but Google's services are marred by an inability to easily host your top-level domain, an inability to use SSL with your domain's URL, and a lack of SQL support. (The last two can be resolved if you're an "App Engine For Business" user, but the pricing model for that service effectively makes it useful only for large companies who want to move in-house applications from their own data centers to Google, something that I have a hard time believing will happen.) Amazon has none of those particular problems, but system administration is difficult with EC2, and scalability is not fully automated. It's the scalability issue that's the show stopper for me, so I'm using the Google App Engine in spite of its limitations.