Ken North

Dr. Dobb's Bloggers

Supercomputing, The Cloud, Big Data, and NoSQL

January 13, 2012

If you count yourself among the informed members of the software and computing community, you're undoubtedly aware of NoSQL, "Big Data", cloud computing, and supercomputing. Sometimes technology that has become trendy is a branch on an evolutionary tree; other times it's a revolutionary departure from the long-established status quo.

The arrival of new technology often rekindles the pervasive debate over the merits of "tried-and-true" versus "new and improved". The latter often introduces new words into our lexicon, with recent examples being Big Data, NoSQL, and cloud computing. Supercomputing has been with us for a while, but there were significant strides in 2011, including IBM Watson, Tianhe-1A, and an Amazon virtual supercomputer.

IBM Watson can process 200 million pages of text in 3 seconds. (How's that for having enough capacity for big data workloads?) China claimed the supercomputer crown with Tianhe-1A and its capacity to perform 2.5 thousand trillion calculations per second (about 2.5 petaflops). Tianhe-1A is roughly 50% faster than the XT5 Jaguar at Oak Ridge National Laboratory. One of the more interesting approaches to solving large-scale computing problems is the Amazon virtual supercomputer. This was an ad hoc solution for an Amazon EC2 user, a pharmaceutical company that spent $1,279 per hour to rent 30,000 cores. That virtual supercomputer had enough capacity to rank 42nd on the list of the top 500 supercomputers.
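To put those figures in perspective, here is a small back-of-the-envelope sketch in Python. It uses only the numbers quoted above; the function names are my own, and the Jaguar figure is merely what the "50% faster" claim implies, not a measured benchmark.

```python
# Figures quoted in the post; everything derived below is simple arithmetic.
TIANHE_1A_OPS_PER_SEC = 2.5e15   # "2.5 thousand trillion calculations per second"
TIANHE_SPEEDUP_OVER_JAGUAR = 1.5  # "50% faster" (as claimed in the post)
EC2_COST_PER_HOUR_USD = 1279.0    # rental price of the virtual supercomputer
EC2_CORES = 30_000                # cores rented
WATSON_PAGES = 200_000_000        # pages of text
WATSON_SECONDS = 3                # time to process them


def petaflops(ops_per_sec: float) -> float:
    """Convert operations per second to petaflops (1 PFLOPS = 1e15 ops/s)."""
    return ops_per_sec / 1e15


def implied_baseline(ops_per_sec: float, speedup: float) -> float:
    """Rate of the slower machine implied by a 'speedup times faster' claim."""
    return ops_per_sec / speedup


def cost_per_core_hour(cost_per_hour: float, cores: int) -> float:
    """Hourly price divided evenly across all rented cores."""
    return cost_per_hour / cores


def pages_per_second(pages: int, seconds: float) -> float:
    """Average throughput in pages per second."""
    return pages / seconds


if __name__ == "__main__":
    jaguar = implied_baseline(TIANHE_1A_OPS_PER_SEC, TIANHE_SPEEDUP_OVER_JAGUAR)
    print(f"Tianhe-1A: {petaflops(TIANHE_1A_OPS_PER_SEC):.2f} PFLOPS")
    print(f"Jaguar (implied by the claim): {petaflops(jaguar):.2f} PFLOPS")
    print(f"EC2 cost per core-hour: ${cost_per_core_hour(EC2_COST_PER_HOUR_USD, EC2_CORES):.4f}")
    print(f"Watson throughput: {pages_per_second(WATSON_PAGES, WATSON_SECONDS):,.0f} pages/s")
```

The per-core figure works out to a little over four cents an hour, which goes some way toward explaining why renting capacity on demand appeals to anyone with a bursty workload.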

My previous Dr. Dobb's blog post discussed the surge of interest in the cloud and Big Data (Terabytes to Petabytes: Reflections on 1999-2009).

Having enormous computing and storage capabilities is undoubtedly a prime factor in the growing importance of Big Data. We have capacity for analytics and data visualization that was unheard of a decade ago, including the ability to process large data volumes from disparate sources. These data sources include SQL and other structured data (click streams, web logs, RFID and sensor data, high-speed, low-latency data feeds), and a host of unstructured data, such as Tweets.

The desire to build social networks and web-scale applications has led to systems that can support millions of users and store and process information about hundreds of millions. The availability of seemingly unlimited capacity has generated enthusiasm for Hadoop and other solutions for processing large data sets. The major players in the SQL database space, for example, are integrating Hadoop with their database product lines.

These new computing and storage requirements have revived, in some circles, a debate over whether to supplant tried-and-true languages, architectures, and database solutions. Important topics in recent debates have concerned attributes and capabilities of different database solutions. The topics in focus have included horizontal scalability and sharding, ACID versus BASE properties (consistency), schemas and type support, granularity of encryption, and query methods.
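For readers who haven't met the term, sharding means splitting a data set across several servers so that each one holds only a slice. The sketch below shows one common approach, hash-based routing, in Python. It is an illustration under my own assumptions, not how any particular database does it: `shard_for` and `distribution` are hypothetical helper names, and the choice of SHA-256 is arbitrary (any stable hash would serve).

```python
import hashlib
from collections import Counter
from typing import Iterable


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard number in [0, num_shards) using a stable hash.

    A cryptographic digest is used here only because it is stable across
    processes and machines; Python's built-in hash() is randomized per run.
    """
    if num_shards < 1:
        raise ValueError("num_shards must be at least 1")
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Interpret the first 8 bytes as an unsigned integer, then reduce modulo
    # the shard count.
    return int.from_bytes(digest[:8], "big") % num_shards


def distribution(keys: Iterable[str], num_shards: int) -> Counter:
    """Count how many of the given keys land on each shard."""
    return Counter(shard_for(key, num_shards) for key in keys)


if __name__ == "__main__":
    # Hypothetical user IDs, purely for illustration.
    users = [f"user-{i}" for i in range(10_000)]
    for shard, count in sorted(distribution(users, 4).items()):
        print(f"shard {shard}: {count} keys")
```

One trade-off worth noting: with plain modulo routing, changing the number of shards remaps most keys, which is why systems that expect to grow often turn to consistent hashing or a lookup directory instead.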

One of the more interesting debates is about types, schemas, type-less programming, and schema-less databases. I'll take a closer look at these issues in an upcoming blog post.
