Mathematica's Cloud Computing Initiative
I recently interviewed Schoeller Porter, Technical Development Specialist in the Wolfram Partnerships Group, about Mathematica's cloud computing initiatives.
MR: Mathematica is known for its powerful desktop-centric computational aspects and to some degree its cluster computing capacity, but what about its role in the cloud computing space?
SP: Cloud computing is a new area we are beginning to look at. Mathematica is very well known on the desktop and has pretty good inroads in the cluster computing field. But what happens when someone on a desktop computer has a problem too large for their current cluster to handle, or they don't have a cluster, or the cluster they use is unavailable to them? That's where the cloud comes in. What we're looking at is extending the Mathematica solution onto the dynamic environment of cloud computing. We're at the very beginning of looking into this, but it's a very interesting option to bring to our customers, giving them the ability to grow their computing capability without having to commit to the fixed cost of buying a whole cluster for themselves.
MR: How is Amazon, one of the computing providers you're incorporating into the solution, participating in this new effort?
SP: Amazon is currently one of the largest cloud providers out there with their Elastic Compute Cloud. Amazon is an interesting company in the way that they supply the compute capability. They use virtual images to allow you to grow or shrink your compute capacity on demand. They're also offering this service at a reasonable price, making it a compelling environment for people who have very large-scale computing needs. That said, we're also looking into other providers, one of which is R-Systems. R-Systems is a spin-off of the NCSA, the National Center for Supercomputing Applications, and I believe they currently have the number 44 supercomputer in the world. So we're looking at both extremes. Amazon focuses on the embarrassingly parallel problem and R-Systems delivers on the supercomputing orientation for problems that are more tightly coupled.
MR: Are private- and public-sector scientists, and the majority of your customers, looking for cloud computing, or are there other particular groups you are attracting to the Mathematica cloud?
SP: We think the Mathematica cloud will be interesting for all of our customers, although it will be most useful to those who do their computing on the desktop, using Mathematica on their multi-core machines. Those individuals, such as scientists and engineers, are being asked to look at more data than they're currently capable of handling on their desktop systems. They also don't have easy access to a computing cluster and need to grow their computing capacity relatively quickly to solve the problems in front of them.
MR: What will the user interface for accessing the Mathematica cloud look like? Will users be able to simply use the existing Mathematica desktop interface to access those additional computational resources via a web services interface?
SP: That's absolutely right. We want to make sure that the cloud access is integrated directly within the Mathematica environment itself, mostly so that it's a seamless user experience. Some of these cloud systems require users to go to a web page and upload files, but for our client base, being able to seamlessly access the compute capability of a cloud while working within the notebook structure of Mathematica will be very compelling.
MR: Will that interface also allow people working on distributed teams to simultaneously connect to the compute cloud interface to see the results the cloud is generating?
SP: To be honest, we haven't really thought about that particular use case. I think it would be possible, but I'm not entirely sure how that would work out.
MR: What is the timeframe for the cloud component to be available to Mathematica users?
SP: We're looking at a release in the first quarter of 2009, although there will be a preview available for attendees at the upcoming International Mathematica Users Conference in October. We're looking at having a hands-on workshop there so people will be able to work with the product and we can get their impressions of it and continue to improve the product before its final release.
MR: Given some of the scenarios you have constructed to test the cloud, what are some of the most powerful computing configurations you've seen that can be leveraged by the product?
SP: As I mentioned earlier, Amazon offers inexpensive compute capacity for embarrassingly parallel problems, and there are also providers of very large, tightly coupled clusters, such as those managed by R-Systems, with over 2,000 cores and a very high-speed interconnect between them. They also have a large memory system there, with between 32 and 64 gigabytes of RAM available. These choices will give our customers the flexibility to leverage the resources best suited to their problems.
MR: Can you give an example of a real-world problem best suited for the Amazon versus the R-Systems cloud configuration?
SP: An embarrassingly parallel problem best suited for Amazon would be one where each of the tasks on the cluster is independent. That is, each of the running processes is completely isolated from the others, each with its own computational set. Examples of this include protein folding, DNA sequencing or any Monte Carlo-style simulation where the inputs may be random, such as in a financial simulation. Conversely, the kinds of problems ideal for a supercomputer such as those supplied by R-Systems are tightly coupled problems, where the individual computations running on each node depend on the computations running on the other nodes. Examples include fluid dynamics, heat transfer and other multi-physics simulations. A specific example might be studying all the forces interacting on the lift of a wing or the acoustic impact of air rushing over a cabin. These applications run the gamut, and there are also ones that fall between these two configurations. By providing access to different compute providers, we give you the option to choose which one best suits your needs. We're providing this access through another partner, Nimbis Services, which is a start-up that supplies a common interface to all these disparate cloud configurations, including the ones I mentioned as well as ones from IBM and Sun. Each of these has its own interface, and Nimbis is looking at how to allow access to all these different providers through a unified interface.
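To make the "embarrassingly parallel" case concrete, here is a minimal Python sketch of a Monte Carlo-style financial simulation in which every task is independent. It is an illustration only, not the Mathematica cloud's implementation: the function names, the price model and its parameters are all hypothetical, and a local thread pool stands in for the separate cloud machines that would each take a share of the tasks.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def simulate_final_price(seed, steps=250, start=100.0, drift=0.0002, vol=0.01):
    """One independent task: a random walk of a hypothetical asset price."""
    rng = random.Random(seed)  # private random stream, no state shared with other tasks
    price = start
    for _ in range(steps):
        price *= 1.0 + rng.gauss(drift, vol)
    return price


def run_batch(seeds, workers=4):
    """Farm the tasks out to a pool of workers.

    The tasks never communicate, so they can run in any order on any worker.
    A thread pool is only a local stand-in for remote compute nodes.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(simulate_final_price, seeds))


results = run_batch(range(1000))
mean_price = sum(results) / len(results)
print(f"simulated {len(results)} paths, mean final price {mean_price:.2f}")
```

Because each task depends only on its own seed, running the batch on one worker or a thousand gives the same set of results. A tightly coupled problem such as a fluid dynamics solver has no such property: each node needs its neighbours' values at every step, which is why the interconnect speed matters there.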
MR: Would paying customers receive a usage bill from the individual providers or would they receive a consolidated service bill from Wolfram?
SP: Customers would pay Wolfram since the service is being provided by Wolfram Research. We're using the compute providers as a back-end service.
MR: Since this is a metered service with customers paying for compute cycles consumed, will there be a dashboard that operators can view showing what their real-time bill might be?
SP: Absolutely. We will provide facilities to help manage budgets by allowing users to limit their usage of the computing resources and to review the full lifecycle of the jobs being submitted to the cloud. This queue will show which jobs are currently processing, which jobs are pending, which have finished, and how much time they used on which provider, so users can get a good sense of the costs of the jobs that were run.
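The budget limit and job queue described here can be pictured with a small Python sketch. Everything in it is an assumption made for illustration: the `Job` and `JobLedger` names, the status values, the provider labels and the per-core-hour rates are hypothetical and do not describe Wolfram's actual service or pricing.

```python
from dataclasses import dataclass, field


@dataclass
class Job:
    name: str
    provider: str
    core_hours: float          # estimated usage for this job
    status: str = "pending"    # hypothetical lifecycle: pending, running, finished


@dataclass
class JobLedger:
    budget: float              # spending cap set by the user
    rates: dict                # hypothetical price per core-hour, keyed by provider
    jobs: list = field(default_factory=list)

    def cost(self, job):
        return job.core_hours * self.rates[job.provider]

    def committed(self):
        """Total cost of every job accepted so far."""
        return sum(self.cost(j) for j in self.jobs)

    def submit(self, job):
        """Accept the job only if it keeps total spending within the cap."""
        if self.committed() + self.cost(job) > self.budget:
            raise ValueError(f"job {job.name!r} would exceed the budget")
        self.jobs.append(job)

    def summary(self):
        """Count jobs by status, as a dashboard might display them."""
        counts = {}
        for j in self.jobs:
            counts[j.status] = counts.get(j.status, 0) + 1
        return counts


ledger = JobLedger(budget=10.0, rates={"provider_a": 0.5, "provider_b": 2.0})
ledger.submit(Job("monte_carlo", "provider_a", core_hours=10))
ledger.submit(Job("wing_cfd", "provider_b", core_hours=2))
print(ledger.summary(), "committed:", ledger.committed())
```

Checking the cap at submission time, rather than after the fact, is what lets a user bound the bill before any compute time is consumed.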
MR: What features would you have liked to include in this release but were unable to because of time or resource constraints?
SP: The really nice thing about the way the Mathematica cloud works is that the client-side component is fairly lightweight. The piece that will actually ship in Mathematica is not very complicated, and the goal is for it not to need updating very often. Most of the heavy lifting will happen on the back-end.
MR: So users will be able to fire up this cloud plug-in and as Wolfram gathers more partners, these will simply show up on the compute provider list?
SP: That's exactly right. That list will be dynamically updated every time you launch the cloud interface.
MR: And will those providers be able to further advertise their capabilities to help differentiate their services from the other alternatives right next to them on the list?
SP: Absolutely. We haven't quite figured out how that's going to look, but we're working with the providers to see what would be the best way to advertise the information about their systems and differentiate them from each other.
MR: As a customer, I would like Mathematica to help me find a provider based on my budget and performance needs by rating the cloud providers that best match those constraints and sorting them by price and/or performance.
SP: Sure, and I think the key here is not just to sort and look but to really help guide the end user to the system that is right for them. I understand most of your readers are very technically minded and probably wouldn't have a difficult time discerning the different parameters of these cluster configurations, but there are a large number of people who don't have that depth of experience. They're going to need different kinds of guides to help them identify which one of these systems is best for them. It may be based on price or configuration, but they will need pointers along the way. And so that's something we'll need to provide as well.
MR: Great! I can't wait to see the preview in action at the International Mathematica Users Conference in October.
SP: I look forward to seeing you at the Users Conference, and I invite your readers to come to the conference as well so we can show them what we're doing here at Wolfram Research.