Amazon Web Services
Amazon Web Services (AWS) emerged directly from the challenges of running amazon.com and incorporates an Amazon perspective. AWS offers the cloud at two distinct levels -- raw computing resources and ready-to-go appliances. The former covers traditional computing resources such as processors, memory, file systems/databases, and messaging. The ready-to-go appliances build useful applications and/or services on top of the raw resources. Amazon supplies and manages the raw computing resources; Amazon partners typically provide the ready-to-go appliances. AWS charges a la carte for usage of the raw resources -- processor, memory, network traffic, and storage. Partnering appliances are free to attach a surcharge to the resources that implement their offerings. Amazon is essentially the enabler; it depends on partners, or on you, to create useful applications. Additionally, third-party solutions have emerged to manage AWS resources directly. At this writing, AWS consists of five primary services:
- Elastic Compute Cloud (EC2) offers virtual machines (VMs) containing processors, memory, and storage. Initially, all of the VMs were Linux variants; Amazon has just started to offer proprietary operating systems such as Windows Server and Oracle Enterprise Linux. A cloud application can dynamically allocate and deallocate the VMs. EC2 offers a variety of VM sizes, from small 32-bit processors to 64-bit multicore processors, all with corresponding memory and storage. Currently, EC2 is in beta and does not offer a service-level agreement (SLA). EC2 machines may associate themselves with AWS-allocated elastic IP addresses. You can protect each VM with a configurable firewall called a security group. A security group enables protection by source IP address, to/from IP port, IP protocol, user, or group. You may maintain different security groups for different VMs.
- Simple Storage Service (S3) offers large storage. The storage is simple and direct. You declare buckets and place data objects into the bucket. Each bucket and object maintains a security profile that controls access. Several third-party tools use the API to provide a filesystem-like view into the buckets. Each bucket can also tie to a URL to provide direct access. S3 offers an SLA.
- Simple Queue Service (SQS) offers a highly reliable message queue. SQS guarantees delivery of messages and stores messages for later transmission if required. SQS interconnects all of the AWS resources. Applications publish and subscribe to a specified queue.
- SimpleDB provides a query interface to structured textual data. The query language and data formatting are, well, simple. Each primary data item can be associated with up to 256 attributes of metadata. Your application can then use basic Boolean operations to query the information. SimpleDB is currently in limited beta.
- CloudFront provides efficient access to distributed content using Amazon's array of servers throughout the world. It integrates with S3, which holds the original content that is then distributed appropriately to edge servers. It provides high data transfer rates and low latency. As with the other services, price is based on usage.
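To make the security-group idea concrete, here is a small Python model of ingress-rule matching: each rule names a protocol, a port range, and a source network, and any matching rule admits the packet. The rule format and function are illustrative only -- they model the concept, not the actual EC2 API.

```python
import ipaddress

def packet_allowed(rules, protocol, port, source_ip):
    """Check a packet against security-group-style ingress rules.
    Each rule is (protocol, from_port, to_port, cidr); any single
    match admits the packet. Illustrative model, not the EC2 API."""
    addr = ipaddress.ip_address(source_ip)
    for proto, from_port, to_port, cidr in rules:
        if (proto == protocol
                and from_port <= port <= to_port
                and addr in ipaddress.ip_network(cidr)):
            return True
    return False

# Hypothetical group: ssh from anywhere, web only from an internal net.
rules = [("tcp", 22, 22, "0.0.0.0/0"),
         ("tcp", 80, 80, "10.0.0.0/8")]
```

A VM with this group would accept an ssh connection from any address but refuse web traffic originating outside `10.0.0.0/8`.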
To access any of these services, you must get an account and activate the services you require. After activation, you can interact with the services via an access key or X.509 certificate. You can obtain an account, access key, and certificate for no cost. Your applications interface with AWS via three primary methods: REST, Query, and SOAP. The REST interface forms a standard HTTP or HTTPS request message containing data within the request body. The Query interface also uses HTTP but relies on simple name and value pair parameters so that basic browsers can perform service operations. The SOAP interface uses XML documents as described in a WSDL. AWS supports these methods as follows:
- EC2 supports Query and SOAP.
- S3 supports REST and SOAP.
- SQS supports REST, Query, and SOAP.
- SimpleDB supports Query and SOAP.
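As a concrete sketch of the Query interface, the following Python builds a signed request URL in the style of AWS request signing (signature version 2: HMAC-SHA256 over the sorted, percent-encoded parameters). The endpoint, API version, and keys are illustrative placeholders, not working credentials.

```python
import base64
import hashlib
import hmac
import urllib.parse

def sign_query_request(host, params, secret_key):
    """Build a signed Query-interface URL (signature version 2 style)."""
    params = dict(params, SignatureMethod="HmacSHA256", SignatureVersion="2")
    # Parameters are sorted by name and percent-encoded before signing.
    canonical = "&".join(
        f"{urllib.parse.quote(k, safe='')}={urllib.parse.quote(v, safe='')}"
        for k, v in sorted(params.items()))
    string_to_sign = "\n".join(["GET", host, "/", canonical])
    digest = hmac.new(secret_key.encode(), string_to_sign.encode(),
                      hashlib.sha256).digest()
    signature = urllib.parse.quote(base64.b64encode(digest).decode(), safe="")
    return f"https://{host}/?{canonical}&Signature={signature}"

# Placeholder credentials and endpoint for illustration only.
url = sign_query_request(
    "sdb.amazonaws.com",
    {"Action": "ListDomains", "AWSAccessKeyId": "AKEXAMPLE",
     "Timestamp": "2009-01-01T00:00:00Z", "Version": "2007-11-10"},
    "fake-secret-key")
```

Because the whole request is just a URL, even a basic browser (with real credentials) can perform service operations this way.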
In addition to these low-level interface methods, there are two other useful interfaces -- the AWS toolkit and higher-level, third-party offerings.
AWS persistence relies on S3. Your account lets you create persistent storage buckets. You are currently allowed 10 buckets, and their names must be unique throughout all of S3. You can choose to expose a bucket through an Amazon URL if it is named accordingly, then map that URL to a URL that you control. You can also control where the bucket is physically located.
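As an illustration of bucket addressing, the sketch below builds the two URL styles commonly used for an S3 object -- a path under the S3 host and a bucket-named host. The bucket and key names are only examples; mapping a URL you control typically means pointing a DNS CNAME at the bucket-named host.

```python
def s3_urls(bucket, key):
    """Return the two common addressing styles for an S3 object.
    Bucket and key names here are illustrative examples."""
    path_style = f"https://s3.amazonaws.com/{bucket}/{key}"
    virtual_hosted = f"https://{bucket}.s3.amazonaws.com/{key}"
    return path_style, virtual_hosted

p, v = s3_urls("pictures.ebmagic.com", "SmokingBishopToast.jpg")
```

Naming the bucket after a hostname you own is what makes the CNAME trick work: requests to your hostname resolve to the bucket-named S3 host.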
To illustrate the interfaces, we demonstrate two third-party tools to examine current buckets. JetS3t shows the lower-level bucket configuration in Figure 1. You can see we've declared two buckets -- one that is the default and one using a URL.
Go to pictures.ebmagic.com/SmokingBishopToast.jpg to see a picture linked to a private hostname but stored in S3. Thus, you can make any bucket and/or object addressable. Cockpit mimics a file manager in Figure 2.
This tool makes the buckets look just like a filesystem. You can also see that the object keys use mnemonics that represent the various directories. Cockpit also offers backup utilities to continuously sync declared directories/files. Thus you can back up your files directly into the cloud for safekeeping. Storage need not use any of the other AWS resources. For example, you could simply use S3 to store all your documents and photos and make them accessible to the Web.
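The filesystem illusion comes from a naming convention: object keys embed slash-separated pseudo-directories, and a listing groups keys by prefix. Here is a minimal sketch of that grouping -- our own helper for illustration, not part of any S3 tool.

```python
def list_directory(keys, prefix=""):
    """Present flat object keys as one level of a filesystem:
    entries directly under `prefix`, with each pseudo-subdirectory
    reported once (marked by a trailing slash)."""
    entries = set()
    for key in keys:
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        # Keep only the first path component below the prefix.
        entries.add(rest.split("/", 1)[0] + ("/" if "/" in rest else ""))
    return sorted(entries)

# Hypothetical bucket contents.
keys = ["docs/2009/taxes.pdf", "docs/notes.txt", "photos/cat.jpg"]
```

S3's real listing API offers prefix and delimiter parameters that behave much like this, which is what lets tools such as Cockpit render buckets as folders.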
The real heart of AWS is processor access. To get started quickly, you would launch an existing image of an operating system configuration. The Firefox ElasticFox plug-in offers a nice view into what is available (Figure 3).
Figure 3 finds available images that contain the string fedora. There are many available offerings, including those that contain MySQL and Apache. The image ami-25b6534c is selected -- note the visibility. You can limit access to any images that you create. Next, one is selected and launched (Figure 4).
This presents several options -- the instance type selects the machine size; the user has selected 'small.' You can also set a minimum/maximum number of instances. This ensures that a certain number of instances are always running and that the instance count cannot exceed the maximum. This gives you an initial hint at how easy it is to scale an application. You can also set the security, matching key pair, and a location; and create as many different configurations as necessary.
Next, hit the launch button and it's off and running (Figure 5).
You now have full access to the machine. For example, you could then use an elastic IP address to attach the image to a specific IP (Figure 6), then map the IP to the domain name through whatever hosting facility you use. Note that you can change the binding between an instance and the IP address.
Next, you would normally use an ssh client to connect to the instance. Port 22 is open in the default group. Figure 7 shows the ssh connection that lets you change the configuration and download additional files.
And there you have it. A fully running image set to a specific hostname and ready to serve up your application from the cloud.
The Amazon toolkit provides the tools to create your own image, upload it into S3, and associate an image ID. As with any object in S3, you can set permissions to allow access to the image.
To allow multiple virtual machines to cooperate with one another, your application can employ SQS to handle messaging. SQS offers some features beyond normal messaging. Messages, although properly delivered, aren't deleted until the subscribing application says so. If a message remains undeleted for a specified period of time, it magically reappears. Thus, if the subscribing application reads the message but then fails, the message reappears to be consumed again.
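The redelivery behavior described above is commonly called a visibility timeout. Below is a minimal in-memory Python sketch of the semantics -- our own model for illustration, not the SQS API: a received message is hidden rather than deleted, and reappears if the consumer never deletes it within the timeout.

```python
class VisibilityQueue:
    """Toy model of SQS-style visibility-timeout semantics."""

    def __init__(self, visibility_timeout):
        self.visibility_timeout = visibility_timeout
        self.messages = {}   # id -> (body, time it becomes visible again)
        self.next_id = 0

    def send(self, body):
        self.messages[self.next_id] = (body, 0.0)
        self.next_id += 1

    def receive(self, now):
        """Return a visible message and hide it; None if nothing visible.
        `now` is passed explicitly to keep the model deterministic."""
        for msg_id, (body, visible_after) in self.messages.items():
            if now >= visible_after:
                self.messages[msg_id] = (body, now + self.visibility_timeout)
                return msg_id, body
        return None

    def delete(self, msg_id):
        self.messages.pop(msg_id, None)

q = VisibilityQueue(visibility_timeout=30)
q.send("resize-photo-42")
first = q.receive(now=0)      # message delivered and hidden
```

If the consumer crashes without calling `delete`, a later `receive` past the timeout hands the same message to another worker -- exactly the failure-recovery property the article describes.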
The last component is SimpleDB. Instead of employing your own database within your virtual machine, your application can use SimpleDB for basic text indexing.
One caveat to all this AWS fun: Only S3 and SimpleDB are persistent. If you deactivate a virtual machine or it fails, all of its state (memory and allocated disk space) is lost as well. Therefore, you must move critical information into SimpleDB or S3.
While you can get the latest pricing at the AWS site, here are some typical costs:
- EC2: small machine $0.10/hour; $0.100 per GB in; $0.170 per GB for the first 10 TB out
- S3: $0.15 per GB/month; $0.100 per GB in; $0.170 per GB for the first 10 TB out
- SQS: $0.01 per request; $0.100 per GB in; $0.170 per GB for the first 10 TB out
- SimpleDB: $1.50 per GB/month; $0.100 per GB in; $0.170 per GB for the first 10 TB out
The costs appear nominal but can add up as you take advantage of the operations. For example, a small machine allocated for an entire month would cost over $72, which may exceed the cost of a typical ISP. The secret is in the elasticity: you probably don't need the computer for every hour in a month.
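A quick back-of-the-envelope check of that claim, using the $0.10/hour small-machine rate listed above (rates are illustrative and change over time):

```python
def compute_cost(hourly_rate, hours):
    """Cost of keeping one machine allocated for `hours`."""
    return hourly_rate * hours

# A small machine running around the clock for a 30-day month:
full_month = compute_cost(0.10, 24 * 30)   # 720 hours, about $72
# The same machine run only during business hours (8 h x 22 days):
elastic = compute_cost(0.10, 8 * 22)       # 176 hours, about $17.60
```

Releasing the machine outside business hours cuts the bill by roughly three quarters, which is the elasticity argument in miniature.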