The Apache team has announced the 0.8 release of the Apache Mahout machine learning libraries. The freely distributed project focuses on Hadoop-related tasks, including collaborative filtering, clustering, and classification.
NOTE: For those of you who didn't read enough of The Jungle Book as a child, a mahout is an elephant rider; the word comes from the Hindi mahaut, which in turn derives from the Sanskrit mahamatra.
Mahout's open nature means it can also be applied to non-Hadoop projects. The 0.8 release brings bug fixes and performance improvements.
"Mahout's goal is to build scalable machine learning libraries focused primarily in the areas of collaborative filtering (recommenders), clustering, and classification (known as the '3Cs')," explains the Apache Mahout team.
Mahout's developers are also working to support the necessary infrastructure for those implementations including (but not limited to) math packages for statistics, linear algebra and others as well as Java primitive collections, local and distributed vector and matrix classes, and a variety of integrative code to work with packages like Apache Hadoop, Apache Lucene, Apache HBase, and Apache Cassandra.
The Mahout development team says that as the project moves towards a 1.0 release, the community is working to clean up or remove parts of the code base that are under-supported or that underperform, and to focus its energy and contributions on key algorithms that are proven to scale in production and have seen widespread adoption.