Data Persistence in the Cloud with Amazon Web Services


Zero-Maintenance Persistence

Amazon's SimpleDB is essentially a cloud-based, highly scalable and highly available persistent map. Rather than a relational model that stipulates an upfront rigid schema, SimpleDB is very flexible in terms of how you decide to model and store data. This flexibility, combined with the fact that SimpleDB is always there and always on, means you can rapidly build persistent applications.

Conceptually, SimpleDB is somewhat similar to a relational database: In SimpleDB, the top-level data container is a domain (somewhat akin to a table). Domains have items (akin to rows), which are composed of attributes (akin to columns). But attributes in SimpleDB are really name/value pairs, and a name isn't limited to one value. That is, an attribute name could be "names" and the corresponding value could be a collection of names like "Andy," "Jim," "Chris," and "Danny." Like a relational database, SimpleDB supports a SQL-like query language for finding data. Inserting, updating, and deleting data, however, is done via a web-based API.
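
As a rough mental model only, a domain can be pictured as a map from item names to items, where each item maps attribute names to one or more string values. The sketch below is plain Java rather than the SimpleDB API, and the class and method names (DomainSketch, put, get) are hypothetical.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory picture of the SimpleDB data model; not the AWS API.
final class DomainSketch {
    // item name -> (attribute name -> set of string values)
    private final Map<String, Map<String, Set<String>>> items = new LinkedHashMap<>();

    // Adds one value to an attribute; an attribute may hold several values.
    void put(String itemName, String attributeName, String value) {
        items.computeIfAbsent(itemName, k -> new LinkedHashMap<>())
             .computeIfAbsent(attributeName, k -> new LinkedHashSet<>())
             .add(value);
    }

    // Returns the values stored under an attribute, or an empty set.
    Set<String> get(String itemName, String attributeName) {
        Map<String, Set<String>> item =
            items.getOrDefault(itemName, Collections.emptyMap());
        return item.getOrDefault(attributeName, Collections.emptySet());
    }

    public static void main(String[] args) {
        DomainSketch domain = new DomainSketch();
        for (String name : new String[] {"Andy", "Jim", "Chris", "Danny"}) {
            domain.put("team_1", "names", name);
        }
        System.out.println(domain.get("team_1", "names")); // [Andy, Jim, Chris, Danny]
    }
}
```

Everything is a string here on purpose, mirroring the single-type model of the datastore.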

SimpleDB supports only one type — strings. This single-type approach might seem somewhat limiting. However, it turns out to be a minor inconvenience that can be easily overcome via the many libraries you can use to manipulate SimpleDB data. Lastly, there is no notion of joins in SimpleDB — that is, you can't query across domains. In return for these stark differences, you get an extremely fast, always-on datastore that is capable of massive scalability.

When using SimpleDB you need to be aware of a big difference between it and traditional relational systems: eventual consistency. Eventual consistency, in the case of SimpleDB where data resides across multiple nodes on the AWS infrastructure, means that the nodes take a short time to become mirrors of each other. That is, an update to an item might not be immediately reflected in a concurrent read (something that you've come to expect in RDBMS-land). Thus, while SimpleDB does guarantee massive scalability along with extreme availability, consistency is sacrificed. Do keep in mind, however, that consistency is reached in a matter of seconds.
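
One way client code can cope with a stale read is to re-read until the expected value shows up or the attempts run out. The helper below is a minimal sketch in plain Java: readUntil and its parameters are hypothetical names, and the Supplier stands in for whatever call actually reads from the datastore.

```java
import java.util.Optional;
import java.util.function.Predicate;
import java.util.function.Supplier;

// Hypothetical polling helper; the Supplier stands in for a real datastore read.
final class EventualRead {
    // Re-reads until the value satisfies 'expected' or maxAttempts is used up.
    static <T> Optional<T> readUntil(Supplier<T> read, Predicate<T> expected,
                                     int maxAttempts, long pauseMillis) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            T value = read.get();
            if (value != null && expected.test(value)) {
                return Optional.of(value);
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(pauseMillis);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // preserve the interrupt
                    return Optional.empty();
                }
            }
        }
        return Optional.empty(); // still stale after all attempts
    }

    public static void main(String[] args) {
        // Simulated replica that returns the old price for the first two reads.
        int[] reads = {0};
        Supplier<String> staleThenFresh =
            () -> ++reads[0] < 3 ? "0045.50" : "0040.00";
        Optional<String> price =
            EventualRead.<String>readUntil(staleThenFresh, "0040.00"::equals, 5, 10);
        System.out.println(price.orElse("stale")); // 0040.00
    }
}
```

Whether polling like this is worthwhile depends on the application; many reads can simply tolerate a value that is a few seconds old.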

SimpleDB and Java

Getting started with SimpleDB using the AWS Java SDK is simple. Provided you have an account with AWS, you can start using SimpleDB with a few lines of code. The SimpleDB API is fairly straightforward and similar in some respects to the API I showed you last month. For instance, operations are modeled as objects whose names end in "Request," and many objects expose "with" methods that can be chained.

To create a domain in SimpleDB, you issue a CreateDomainRequest, which will create the domain in SimpleDB if it doesn't already exist or do nothing if the domain is already there. (The following examples are written in Groovy, but are pretty straightforward.)

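import com.amazonaws.auth.PropertiesCredentials
import com.amazonaws.services.simpledb.AmazonSimpleDBClient
import com.amazonaws.services.simpledb.model.*
def sdb = new AmazonSimpleDBClient(new PropertiesCredentials(new File("./AwsCredentials.properties")))
def domain = "widgets"
sdb.createDomain(new CreateDomainRequest(domain))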

Next, to create an item (that is, a new row) in the "widgets" domain, you create an instance of ReplaceableItem and chain a series of attributes to it. You then create a BatchPutAttributesRequest and send it off to AWS via an instance of AmazonSimpleDBClient.

def data = []

data << new ReplaceableItem().withName("widget_1").withAttributes(
         new ReplaceableAttribute().withName("name").withValue("px-34"),
         new ReplaceableAttribute().withName("price").withValue("0045.50"))

sdb.batchPutAttributes(new BatchPutAttributesRequest(domain, data))

Notice how the "price" attribute has two leading zeros: because all values in SimpleDB are strings, queries compare them lexicographically. Thus, for numeric data to be searchable, every value must be padded to the same length. Consequently, all prices in the preceding code are, by convention, six digits long, so that a search for, say, widgets priced under $45 works correctly.
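
The effect of padding is easy to see with ordinary string comparison. The following plain-Java sketch (the class and the sortsBefore helper are hypothetical, and no AWS call is involved) compares the same two prices with and without leading zeros.

```java
// Demonstrates why unpadded numeric strings compare in the wrong order.
final class LexicographicDemo {
    // True when 'a' sorts before 'b' under plain string comparison.
    static boolean sortsBefore(String a, String b) {
        return a.compareTo(b) < 0;
    }

    public static void main(String[] args) {
        // Unpadded: "100.00" sorts before "45.50" because '1' < '4'.
        System.out.println(sortsBefore("100.00", "45.50"));    // true (numerically wrong)
        // Padded to the same width, string order matches numeric order.
        System.out.println(sortsBefore("0100.00", "0045.50")); // false
        System.out.println(sortsBefore("0045.50", "0100.00")); // true
    }
}
```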

It should be noted that the AWS Java API provides helper methods that take care of numeric padding. Consequently, I could have set the attribute value of price like so:

data << new ReplaceableItem().withName("widget_2").withAttributes(
         new ReplaceableAttribute().withName("name").withValue("px-34a"),
         new ReplaceableAttribute().withName("price").withValue(
             SimpleDBUtils.encodeZeroPadding(45.50, 4)))
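
For readers who want to see what such padding amounts to without the SDK on hand, here is a minimal sketch using only the Java standard library. It is not the SDK's implementation: encodePrice and decodePrice are hypothetical helpers, and the fixed layout of four integer digits plus two decimals is an assumption taken from the "0045.50" convention used earlier.

```java
import java.util.Locale;

// Hypothetical stand-in for numeric padding; not the AWS SDK's implementation.
final class PricePadding {
    // Encodes a price as 4 integer digits, a point, and 2 decimals,
    // so 45.5 becomes "0045.50".
    static String encodePrice(double price) {
        if (!(price >= 0)) { // also rejects NaN
            throw new IllegalArgumentException("price must be non-negative: " + price);
        }
        // Math.abs normalizes -0.0 so no minus sign can appear.
        String encoded = String.format(Locale.ROOT, "%07.2f", Math.abs(price));
        if (encoded.length() != 7) { // too large for the fixed width
            throw new IllegalArgumentException("price does not fit: " + price);
        }
        return encoded;
    }

    // Turns a padded string back into a number.
    static double decodePrice(String encoded) {
        return Double.parseDouble(encoded);
    }

    public static void main(String[] args) {
        System.out.println(encodePrice(45.50));      // 0045.50
        System.out.println(encodePrice(100));        // 0100.00
        System.out.println(decodePrice("0045.50"));  // 45.5
    }
}
```

Keeping the number of decimal places fixed as well as the integer width means every stored price has the same length, which is what the lexicographic comparison relies on.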

Querying for data is similar in some respects to normal SQL; the biggest difference is that domain names and attribute values must be properly quoted in SimpleDB's query language (and numeric values must be padded). Thus, to find all widgets in my domain with a price less than $50, I must fashion a query as follows:


def query = "select * from ${SimpleDBUtils.quoteName(domain)} where price < " + 
"${SimpleDBUtils.quoteValue(SimpleDBUtils.encodeZeroPadding(50, 4))}"

def sdbrequest = new SelectRequest(query)

sdb.select(sdbrequest).getItems().each{
  println("Name: " + it.getName())
}

Note the various method calls used to quote names and values in addition to encoding my search value.
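
To make the quoting concrete, the sketch below builds the same kind of query string by hand in plain Java. It assumes the quoting rules of SimpleDB's select syntax as I understand them (names wrapped in backticks, values in single quotes, with an embedded quote character doubled); the helpers are hypothetical and only approximate what the SDK utilities do.

```java
// Hypothetical quoting helpers; an approximation, not the AWS SDK's code.
final class SelectQuoting {
    // Wraps a domain or attribute name in backticks, doubling any inside it.
    static String quoteName(String name) {
        return "`" + name.replace("`", "``") + "`";
    }

    // Wraps a value in single quotes, doubling any inside it.
    static String quoteValue(String value) {
        return "'" + value.replace("'", "''") + "'";
    }

    // Builds a "price below" query from a domain and an already padded price.
    static String priceBelow(String domain, String paddedPrice) {
        return "select * from " + quoteName(domain)
             + " where price < " + quoteValue(paddedPrice);
    }

    public static void main(String[] args) {
        System.out.println(priceBelow("widgets", "0050"));
        // select * from `widgets` where price < '0050'
    }
}
```

In real code the SDK's own utilities are the safer choice, since they track the service's rules rather than this approximation.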

The SimpleDB Java API supports creating data and finding data, but it also supports deleting or updating items. The zero administration required to get up and running is a benefit, too. Lastly, there are myriad libraries built on top of SimpleDB (in addition to the SDK offered by AWS) that make working with it even easier.

AWS offers a number of options for data storage. From installing your own datastore on a custom EC2 image to finding an existing AMI configured to your desire, you have a lot of choices and you can opt to exert a very low level of control over your datastore. If less administration is your desire, then RDS is an excellent choice for highly available, on-demand instances of MySQL or Oracle. And if you're looking for extreme flexibility when it comes to a data model and have little time for administration headaches, then SimpleDB makes an excellent choice.


— Andrew Glover is the CTO of App47, a company specializing in enterprise mobility. He also is the author of easyb, a BDD framework that won the Jolt Award in 2009. Previously, he was the President of Stelligent.

