Pentaho's Helping Hand for Big Data Developers


Pentaho has expanded its data integration software portfolio with what it calls a "major expansion" of native support for big data sources, including the latest Hadoop distributions and NoSQL sources, along with native support for several analytic databases and traditional OLTP databases.

The company says that its native connection to big data platforms makes it easier and faster than ever to analyze the enormous data volumes generated by today's organizations.

Speaking exclusively to Dr. Dobb's Journal, Jake Cornelius, VP of product management at Pentaho, said, "Pentaho's goal with big data is to provide appropriate tooling that makes it easier for developers and application architects to build and manage data integration and Business Intelligence solutions with big data technologies like Hadoop, NoSQL variants, and high performance/scalable data warehousing platforms."

Claiming to have "recognized early" the complexity and diversity of big data, and the growing need to support its volumes, Pentaho now declares that it offers deeper and more comprehensive support for big data sources than any other BI vendor.

To back up those claims, Pentaho's Cornelius says that his company's current integration points provide a number of benefits to developers, including:

  • The ability to orchestrate execution of Hadoop-related tasks (e.g., executing a Hive query, Pig script, or MapReduce job) as part of a broader IT workflow.
  • The ability to set up dependencies, so that if a step fails the job can branch down a recovery path or send a notification, and if it succeeds the job moves on to the subsequent dependent tasks. Likewise, it supports initiating several tasks in parallel.
  • New integration for Pig, so that developers can execute a Pig job from a PDI Job flow, integrate the execution of Pig jobs in broader IT workflows through PDI Jobs, take advantage of Pentaho's out-of-the-box scheduler, and so on.
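
The dependency behavior described in the list above (continue on success, branch to a recovery or notification step on failure, start independent tasks in parallel) can be illustrated with a small sketch. This is not Pentaho's API: the `Step` class, the `run` function, and the step names are hypothetical, and the actions are stand-ins for real Hive, Pig, or MapReduce invocations.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Step:
    """Hypothetical workflow step; not a Pentaho class."""
    name: str
    action: Callable[[], None]
    on_success: List["Step"] = field(default_factory=list)
    on_failure: Optional["Step"] = None


def run(step: Step, log: List[str]) -> bool:
    """Run a step, then follow its success or failure branch."""
    try:
        step.action()
    except Exception as exc:
        log.append(f"{step.name}: failed ({exc})")
        # Failure path: recovery or notification step, if one is defined.
        if step.on_failure is not None:
            run(step.on_failure, log)
        return False
    log.append(f"{step.name}: ok")
    # Success path: dependent steps are independent of each other,
    # so they are started in parallel.
    if step.on_success:
        with ThreadPoolExecutor() as pool:
            list(pool.map(lambda s: run(s, log), step.on_success))
    return True


def failing_pig_script() -> None:
    # Stand-in for a Pig script that exits with an error.
    raise RuntimeError("script error")


if __name__ == "__main__":
    log: List[str] = []
    notify = Step("notify", lambda: None)
    hive = Step("hive_query", lambda: None)
    mapreduce = Step("mapreduce_job", lambda: None)
    load = Step("load_input", lambda: None, on_success=[hive, mapreduce])
    pig = Step("pig_script", failing_pig_script, on_failure=notify)
    run(load, log)
    run(pig, log)
    print("\n".join(log))
```

Because the two follow-up steps run in parallel, the order of their log lines can vary from run to run.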

Technical Note taken from http://pig.apache.org/: Apache Pig is a platform for analyzing large data sets that consists of a high-level language for expressing data analysis programs, coupled with infrastructure for evaluating these programs. The salient property of Pig programs is that their structure is amenable to substantial parallelization, which in turn enables them to handle very large data sets.

"Hortonworks and Pentaho share a vision whereby Apache Hadoop becomes the de facto platform for storing, managing, and analyzing big data. We are focused on accelerating the development and adoption of Apache Hadoop and are excited to be working with Pentaho to further simplify the development and deployment of Big Data projects," said Eric Baldeschwieler, CEO, Hortonworks.

While traditional OLTP databases are typically not considered "big data" platforms, Pentaho says it maximizes their performance and scalability through native SQL dialect generation for fast analytics, or native bulk loader integration for fast data integration. OLTP databases with native Pentaho support include: Apache Derby, Firebird, HyperSQL, IBM DB2, IBM Informix, Ingres, Interbase, Microsoft Access, Microsoft SQL Server, MySQL, Oracle, and PostgreSQL.
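
To give a sense of what "native SQL dialect generation" involves, here is a minimal sketch of how a row-limited query differs across a few of the databases listed above. The `limit_query` function and the dialect keys are hypothetical and do not reflect how Pentaho implements this; only the SQL syntax of each database is taken as given.

```python
from typing import Sequence


def limit_query(dialect: str, columns: Sequence[str], table: str, limit: int) -> str:
    """Build a 'first N rows' query in the syntax of the given database.

    Hypothetical helper for illustration only; it does no identifier quoting.
    """
    cols = ", ".join(columns)
    base = f"SELECT {cols} FROM {table}"
    if dialect in ("mysql", "postgresql"):
        return f"{base} LIMIT {limit}"
    if dialect == "sqlserver":
        return f"SELECT TOP {limit} {cols} FROM {table}"
    if dialect == "oracle":
        # Classic Oracle form: filter on ROWNUM around a subquery.
        return f"SELECT * FROM ({base}) WHERE ROWNUM <= {limit}"
    if dialect == "db2":
        return f"{base} FETCH FIRST {limit} ROWS ONLY"
    raise ValueError(f"unsupported dialect: {dialect}")


if __name__ == "__main__":
    for name in ("mysql", "sqlserver", "oracle", "db2"):
        print(name, "->", limit_query(name, ["id", "total"], "orders", 10))
```

Generating the form each database understands natively lets the database do the row limiting itself, rather than returning every row to the client to be trimmed there.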

