Where Does Big Data Go To Get Data-Intensive?


Concurrent Inc. has gone to market this month with its Cascading 2.0 framework for Java-based "big data" apps running on Apache Hadoop. Cascading is an alternative API to the MapReduce programming model for large data sets, and Concurrent claims a "growing" ecosystem of Cascading developers and partners.

Cascading is intended for use by Java developers who are building data processing and data management applications on Apache Hadoop that can be deployed on clusters running in the cloud or within private data centers. Cascading is used to streamline data processing, data filtering, and workflow optimization for large volumes of unstructured and semi-structured data.

Speaking directly to Dr. Dobb's, Florian Leibert, a software engineer at vacation accommodation company Airbnb, said that his company chose to use Cascading on Amazon's Elastic MapReduce service for the heavy-duty infrastructure work of filtering and combining multiple large data files and reconstructing corrupted files. "The data is used by analysts to determine the factors driving room bookings as well as user drop-offs, to better understand user behavior and business dynamics," he said.

Cascading is also at the core of language extensions including PyCascading, Scalding, and Cascalog (open-source projects sponsored by Twitter) and of tools including CloudFront LogAnalyzer (developed by Amazon).

The company's promise to application developers is that they can build and test applications on their desktops in their language of choice (Java, Jython, Scala, Clojure, or JRuby), using familiar constructs and reusable components, and then "instantly deploy them" onto clusters of hundreds of nodes.

Concurrent also says that Hadoop administrators can now seamlessly move and scale application deployments from development to test and production clusters regardless of cluster location or data size.

"We make it easy for developers to build powerful data processing applications for Hadoop, without requiring months spent learning about the intricacies of MapReduce," said Chris Wensel, CEO and founder of Concurrent Inc.







