The DataStax Python Driver 1.0.0 for Cassandra is now final, having first been open sourced in mid-2013. DataStax distributes a commercially supported version of the Apache Cassandra NoSQL database management system. Company software engineer Michaël Figuière says that the team has been working to improve the stability and performance of the API.
The now "production ready" driver is part of the firm's plan to provide modern open source Cassandra drivers for major programming languages. Figuière hopes that the DataStax drivers will simplify the work of developers and DBAs when designing applications by bringing a common architecture and a similar interface across languages.
Figuière explains that although the Python driver may look similar to drivers for relational databases, it comes with configurable, per-node connection pools that can grow and shrink automatically to accommodate changing loads. It also offers synchronous and asynchronous query execution, query tracing, and metrics that provide more insight into query execution, latency, and errors.
As well as SSL and authentication support, there is "thorough logging" based on Python's standard logging module. This version of the driver runs on Python 2.6 and 2.7, and because it can run without any C extensions, PyPy is well supported. Support for Python 3 is planned.
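Because the logging is built on the standard library, it can be switched on with ordinary logging configuration. The following is a minimal sketch rather than an official recipe: it assumes the driver's loggers live under the "cassandra" namespace (named after its modules), and the helper name enable_driver_logging is purely illustrative.

```python
import logging


def enable_driver_logging(level=logging.INFO):
    """Attach a console handler to the (assumed) 'cassandra' logger namespace."""
    # Assumption: the driver names its loggers after its modules,
    # e.g. 'cassandra.cluster', so configuring the parent covers them all.
    logger = logging.getLogger("cassandra")
    logger.setLevel(level)

    # Only add a handler once, so repeated calls do not duplicate output.
    if not logger.handlers:
        handler = logging.StreamHandler()
        handler.setFormatter(
            logging.Formatter("%(asctime)s [%(levelname)s] %(name)s: %(message)s")
        )
        logger.addHandler(handler)
    return logger
```

Calling enable_driver_logging(logging.DEBUG) before connecting would, under that assumption, surface the driver's connection and pool messages on the console.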
"At its core, the new Python driver utilizes an event loop for handling communication with Cassandra. This event loop may either use the asyncore module in the standard library or libev for improved performance. At a higher level, the driver maintains a small connection pool for each Cassandra node (with special treatment for multi-datacenter environments)," explains Figuière.
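The choice between the two event loops can be made in application code. The sketch below is one hedged way to do it: it tries the libev-based connection class first and falls back to the asyncore one. The module and class names (cassandra.io.libevreactor.LibevConnection and cassandra.io.asyncorereactor.AsyncoreConnection) are taken from the driver's documentation as we understand it, while the helper name choose_connection_class is hypothetical.

```python
import importlib


def choose_connection_class():
    """Return the preferred connection class, or None if neither can be imported."""
    # Preference order: libev (needs the optional C extension), then asyncore.
    candidates = [
        ("cassandra.io.libevreactor", "LibevConnection"),
        ("cassandra.io.asyncorereactor", "AsyncoreConnection"),
    ]
    for module_name, class_name in candidates:
        try:
            module = importlib.import_module(module_name)
        except Exception:
            # The driver, the C extension, or asyncore itself may be missing.
            continue
        return getattr(module, class_name)
    return None
```

The returned class would then typically be assigned to the cluster object before connecting, for example cluster.connection_class = choose_connection_class(), leaving the driver's default in place when the helper returns None.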
From a user-API perspective, you can either block synchronously until the query completes, or execute the query asynchronously and then attach callbacks or block for the final result at any time.
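The two styles can be sketched as follows. This is an illustration under assumptions, not the driver's reference usage: it relies on the session methods execute and execute_async and on the returned future's add_callbacks and result methods, and the helper names fetch_sync and fetch_async are made up for the example.

```python
def fetch_sync(session, query):
    """Block until the query completes, then return its rows as a list."""
    return list(session.execute(query))


def fetch_async(session, query, on_rows, on_error):
    """Start the query without blocking and register result/error callbacks.

    The returned future can still be blocked on later with future.result().
    """
    future = session.execute_async(query)
    # on_rows receives the rows on success; on_error receives the exception.
    future.add_callbacks(on_rows, on_error)
    return future
```

With a running cluster, the session would come from something like Cluster(["127.0.0.1"]).connect(), where the contact address is a placeholder. Note that callbacks registered this way run on the driver's event loop thread, so they should be kept short.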
"Cassandra 2.0 added several interesting features for developers, including automatic query paging for large result sets, lightweight transaction support, and the ability to execute prepared statements in batches. The 1.0 release of the Python driver does not support these yet, but the 2.0 release will add support for them," said Figuière.