SnapLogic today announced SnapReduce, its latest integration product, designed to connect and integrate big data across business applications, cloud services, social media, and Hadoop. SnapReduce's unified platform makes Hadoop processing more accessible and is said to result in optimal Hadoop cluster utilization.
Unlike other Hadoop integration solutions that rely on higher-level languages such as Pig and Hive, SnapReduce maps SnapLogic dataflow pipelines directly to equivalent MapReduce jobs, resulting in more efficient data processing and pipeline monitoring on Hadoop.
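As a rough illustration of what turning a dataflow pipeline into a MapReduce job can look like, the sketch below fuses the record-level steps of a small pipeline into a single map function and uses the aggregation step as the reduce function. It is a minimal sketch under stated assumptions, not SnapLogic's implementation: the step names ("filter", "transform", "group_by", "aggregate"), the functions compile_to_mapreduce and run_local, and the sample records are all hypothetical, and the job runs in a single local process rather than on a Hadoop cluster.

```python
from collections import defaultdict


def compile_to_mapreduce(pipeline):
    """Turn a list of (kind, function) steps into a (mapper, reducer) pair.

    Hypothetical step kinds, for illustration only:
    "filter", "transform", "group_by", "aggregate".
    """
    record_steps = [(kind, fn) for kind, fn in pipeline
                    if kind in ("filter", "transform")]
    key_fn = next(fn for kind, fn in pipeline if kind == "group_by")
    reduce_fn = next(fn for kind, fn in pipeline if kind == "aggregate")

    def mapper(record):
        # All record-level steps are fused into one map function.
        for kind, fn in record_steps:
            if kind == "filter":
                if not fn(record):
                    return  # record dropped, nothing is emitted
            else:
                record = fn(record)
        yield key_fn(record), record

    def reducer(key, values):
        # The aggregation step becomes the reduce function.
        return key, reduce_fn(values)

    return mapper, reducer


def run_local(mapper, reducer, records):
    """Single-process stand-in for the map, shuffle, and reduce phases."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)  # "shuffle": group values by key
    return dict(reducer(key, values) for key, values in groups.items())


# Hypothetical pipeline: keep positive amounts, normalize the region name,
# then total the amounts per region.
pipeline = [
    ("filter", lambda r: r["amount"] > 0),
    ("transform", lambda r: {**r, "region": r["region"].upper()}),
    ("group_by", lambda r: r["region"]),
    ("aggregate", lambda rs: sum(r["amount"] for r in rs)),
]

records = [
    {"region": "east", "amount": 10},
    {"region": "west", "amount": 5},
    {"region": "east", "amount": -3},
    {"region": "east", "amount": 7},
]

mapper, reducer = compile_to_mapreduce(pipeline)
result = run_local(mapper, reducer, records)
print(result)
```

In this sketch the filter and transform steps never become separate jobs; they run inside the one map function, which is one plausible reading of why a direct pipeline-to-job translation could use a cluster more efficiently than chaining several generated jobs.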
Additionally, SnapReduce uses SnapLogic's new Hadoop Distributed File System (HDFS) Snap to move big data in and out of Hadoop. The HDFS Snap allows SnapLogic pipelines to integrate data from all of SnapLogic's endpoints and gives users a common framework for defining both data retrieval and data processing in Hadoop.
SnapReduce also works side-by-side with existing applications and tools that process data within Hadoop, such as Pig, Hive, and Flume. This allows SnapReduce to work with existing Hadoop workflows and computations. SnapReduce will be commercially available in the second half of 2011.
"Businesses hungry to mine big data using Hadoop are hitting a wall when it comes to connectivity with business applications. Current solutions for big data integration require strong parallel programming skills and are thus complex and time-consuming to create, often requiring several technologies to get and use big data with Hadoop. As companies add cloud applications to their stable of enterprise applications, they need an easy solution for connecting to Hadoop that can reliably handle both the volume and complexity of big data," said Gaurav Dhillon, CEO, SnapLogic.


