Pentaho used the MongoDB World symposium this month to release the 5.1 version of its business analytics and data integration software. Of interest to developers here is the option for "code free" analytics on MongoDB.
Before this release, says Pentaho, preparing MongoDB for analysis required lots of manual coding, making it harder for development shops that lacked this expertise to "scale up for big data operations", as the industry loves to say these days.
Pentaho's Christopher Dziekan talks about bringing developers into proximity with powerful big data analytics at scale. The release enables MongoDB data collections to be analyzed directly "at the source", eliminating hand-coding and the need to prepare data in a staging area.
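For a sense of what that hand-coding can involve, here is a minimal sketch in Python of the kind of staging step a developer might otherwise write: flattening nested documents into uniform rows that a tabular analytics tool can read. It is an illustration only, not Pentaho's method; the function names `flatten` and `stage` and the sample documents are hypothetical, and plain dictionaries stand in for documents fetched from a MongoDB collection.

```python
from typing import Any, Dict, Iterable, List


def flatten(doc: Dict[str, Any], prefix: str = "") -> Dict[str, Any]:
    """Collapse nested sub-documents into dotted column names."""
    row: Dict[str, Any] = {}
    for key, value in doc.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            # Recurse into the sub-document, extending the column prefix.
            row.update(flatten(value, prefix=f"{name}."))
        else:
            # Scalars and arrays are kept as-is in this sketch.
            row[name] = value
    return row


def stage(docs: Iterable[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Build a uniform table: every row gets every column, gaps become None."""
    rows = [flatten(d) for d in docs]
    columns = sorted({c for r in rows for c in r})
    return [{c: r.get(c) for c in columns} for r in rows]


# Hypothetical documents with differing shapes, as a schemaless store allows.
sample = [
    {"_id": 1, "customer": {"name": "Ada", "plan": "pro"}, "visits": 12},
    {"_id": 2, "customer": {"name": "Lin"}, "tags": ["trial"]},
]
table = stage(sample)
```

Even this small case has to make decisions about missing fields and arrays, which is the sort of per-collection work the release claims to remove.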
"Traditional RDBMS analytics can get very complicated and quite frankly, ugly, when working with semi or unstructured data," said Chris Palm, lead software architecture engineer at MultiPlan. "We have seen more accurate results with new analyses and are no longer constrained by having to pull only part of our data. We can now look across a more full set of data and govern our system of record."
Pentaho 5.1 also features a new "Data Science Pack," which helps blend different data sources (such as social media and MongoDB data) to enable advanced analytics like churn prediction and customer sentiment analysis.
The release also adds full YARN support. Apache Hadoop YARN (Yet Another Resource Negotiator) is a cluster management technology and a key feature of second-generation Hadoop. Developers familiar with Pentaho Data Integration can "exploit the full computational power" of Hadoop without having to write complex MapReduce code.
According to Pentaho, "With YARN, Pentaho Data Integration jobs can make elastic use of Hadoop resources, expanding and contracting as data volumes and processing requirements change. YARN's advanced resource management capabilities support mixed Pentaho workload scenarios where continuous data transformation and analysis is required."