The Apache team has announced the 0.8 version release of the Apache Mahout machine learning algorithm libraries. Mahout's free distribution is focused on Hadoop-related tasks including collaborative filtering, clustering, and classification.
NOTE: For those of you who didn't read enough of The Jungle Book as a child, a mahout is an elephant rider. The word comes from the Hindi mahaut, which in turn derives from the Sanskrit mahamatra.
Mahout's open nature means non-Hadoop project work is also applicable. This 0.8 release brings with it bug fixes and performance improvements.
"Mahout's goal is to build scalable machine learning libraries focused primarily in the areas of collaborative filtering (recommenders), clustering, and classification (known as the '3Cs')," explains the Apache Mahout team.
Mahout's developers are also working to support the necessary infrastructure for those implementations including (but not limited to) math packages for statistics, linear algebra and others as well as Java primitive collections, local and distributed vector and matrix classes, and a variety of integrative code to work with packages like Apache Hadoop, Apache Lucene, Apache HBase, and Apache Cassandra.
The Mahout development team says that, as the project moves towards a 1.0 release, the community is working to clean up or remove parts of the code base that are under-supported or that under-perform, and to focus its energy and contributions on key algorithms that are proven to scale in production and have seen widespread adoption.