Pervasive Software has announced an update to Pervasive DataRush that adds extensible libraries for data preparation and analytics, as well as integration with the KNIME open source workflow interface for data mining.
Pervasive DataRush is an embedded parallel dataflow platform that helps eliminate performance bottlenecks in data-intensive applications. Its expanded capabilities enable a broader range of users to address the growing scale and complexity of big data. The company claims that DataRush reduces runtimes for data preparation and analysis and can consume very large datasets without relying on sampling or on expensive, high-maintenance clusters.
Because they scale dynamically to use the available cores on multicore hardware, Pervasive DataRush-enabled applications can quickly overcome the typical data preparation bottlenecks involved in cleansing, aggregating or de-duplicating data. The new release extends the platform to data mining and predictive analytics, allowing organizations to extract timely knowledge from their large datasets and make more informed decisions.
"The scalability of Pervasive DataRush across cores and data sizes, its high throughput, extensibility, ease of implementation and cost-efficiency, allows users to 'future-proof' their applications to automatically take advantage of increased core counts as multicore hardware goes from dual-core to 8, 12, 24 or 48 cores and beyond," says Pervasive's Ray Newmark. "It takes care of complex parallel programming issues so developers can focus on their core challenges rather than on the intricacies of developing highly parallel multicore-ready applications."
The enhanced Pervasive DataRush capabilities, now generally available and accessible as a trial download, include:
- New core analytics library, including k-means clustering and naïve Bayes, decision tree (C4.5) and k-nearest neighbor classification algorithms
- Expanded data preparation capabilities
- New parallel analytic nodes for KNIME
- Java SDK to extend and customize data preparation and analytics capabilities