Wolfram Research is on the brink of unveiling Wolfram|Alpha, its computational engine with a natural-language query interface. A pet project of company founder Stephen Wolfram, its public introduction fulfills a dream of his since he created his company more than 25 years ago. I recently had the opportunity to talk with Wolfram at length about Wolfram|Alpha, particularly about how he envisions application developers leveraging Wolfram|Alpha's power under the hood.
Wolfram introduces Wolfram|Alpha in a humble, straightforward way:
This is a very ambitious project. The concept of Wolfram|Alpha is "How much of the world's knowledge can be made computable?" It builds upon the Mathematica platform that we have created to encode all possible kinds of knowledge to be computed. The mode of interaction is you ask a question, you get an answer.
Questions you can pose to Wolfram|Alpha can range from simple queries ("What is the GDP of China?") to highly complex mathematical expressions. Answers can range from a simple chart and/or graph of data to a step-by-step calculation of the answer.
If you're already familiar with Mathematica, you will acclimate to the interface instantly and know how to manipulate the result set further. In fact, the output can be exported as Mathematica Notebook (.nb) files for use in the various editions of Mathematica, including the free Mathematica Player. This is especially useful for exported datasets wrapped with Mathematica's Manipulate function.
Several additional details accompany the result, including any data sources polled for the result. These sources currently range from medical and financial to meteorological and geographic information. Wolfram was quick to point out, though, that while the site will certainly look familiar to Mathematica customers, he anticipates that it will be used by a much broader range of people than have so far been exposed to Mathematica. When asked how he envisions existing Mathematica users adopting Wolfram|Alpha, Wolfram responded:
We're dealing with two ends of the spectrum. Wolfram|Alpha is currently optimized for the instant piece of knowledge, instant computation -- great for the drive-by question, so to speak. On the other end, Mathematica users may require a systematic analysis that may at first scale from a small question and then build up to larger, more complex production systems. Indeed, Wolfram|Alpha is built using this construction approach, consisting of nearly 6 million lines of Mathematica code.
When asked about the possibility of Mathematica integrating with the site, Wolfram enthusiastically replied:
From within Wolfram|Alpha, we can generate the Mathematica expression that lies behind the result set as plain text that can be copied from the site and pasted into a Mathematica notebook. In a future version of Mathematica, there will be a tight integration with Wolfram|Alpha. By simply typing an equal sign at the beginning of a line, Mathematica users will get a Wolfram|Alpha input field right in the notebook itself. In other words, Mathematica will act as a client for Wolfram|Alpha, so that you can then use the linguistic capabilities of Wolfram|Alpha and get the results back directly within Mathematica.
In addition to formulae and appropriate statistical data, result sets may also include "Just the Facts" links to narrative information such as relevant Wikipedia entries.
The Developer API
At the time I explored the Wolfram|Alpha beta, a 12-page WolframAlphaAPI.pdf document was available for download from the site's 'More -> For Developers' menu. This preliminary documentation, dated March 27, 2009, describes the simple RESTful calls that can be made to the site to selectively filter the kinds of result sets needed by the calling client application.
Two types of APIs are mentioned in this document: the Query API and the Data API. The majority of the document details the parameters of a query and the types and formats of the result sets it returns. The Data API is mentioned only briefly on the last page, as supplying services to extract highly specific data sets from Wolfram's "enormous store of curated data." Since no additional details were disclosed, the remainder of this section focuses on calling the Query API and the formats the return set can contain.
Queries can currently take about a dozen different parameters. These include input, appid, pod, podcount, format (values are text, img, inputform and html), timelimit (how long to wait for results), allowedcached (pull data faster from a possibly previously cached result, or slower to assure the very latest, up-to-date data), async (useful for AJAX-style dynamic rendering of <div> page sections) and others like assumption, moreoutput, width and parseresult. XML result sets are wrapped in a <queryresult> tag containing several attributes, including success, error, numpods, timedout and timing. Many other tags exist as well, including assumptions (which may contain individual assumption tags about the meaning or related categories of words in the query statement), img, markup (if HTML output is requested), and errors if any are generated.
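Since I could not call the live service, the following is only a minimal Python sketch of how a client might assemble a Query API request and read the attributes of the <queryresult> wrapper. The parameter and attribute names come from the documentation described above. Everything else is an assumption made for illustration: the query endpoint URL (modeled on the documented parse URL), the "true"/"false" attribute values, and the sample XML, which is hand-written rather than captured from the service.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# ASSUMPTION: endpoint modeled on the documented parse URL; not confirmed.
QUERY_ENDPOINT = "http://api.wolframalpha.com/v1/query"


def build_query_url(user_input, appid, **params):
    """Return a GET URL carrying input, appid and any optional parameters."""
    query = {"input": user_input, "appid": appid}
    query.update(params)  # e.g. format="text", podcount=2, timelimit=5
    return QUERY_ENDPOINT + "?" + urlencode(query)


def summarize_result(xml_text):
    """Extract the top-level <queryresult> attributes into a dict."""
    root = ET.fromstring(xml_text)
    if root.tag != "queryresult":
        raise ValueError("expected <queryresult>, got <%s>" % root.tag)
    return {
        # ASSUMPTION: boolean attributes are spelled "true"/"false".
        "success": root.get("success") == "true",
        "error": root.get("error") == "true",
        "numpods": int(root.get("numpods", "0")),
        "timedout": root.get("timedout", ""),
        "timing": float(root.get("timing", "0")),
    }


# HYPOTHETICAL response, hand-written for this sketch.
SAMPLE_XML = (
    '<queryresult success="true" error="false" numpods="2" '
    'timedout="" timing="1.25"></queryresult>'
)

url = build_query_url("GDP of China", "myappid", format="text", podcount=2)
summary = summarize_result(SAMPLE_XML)
print(url)
print(summary)
```

Keeping URL construction separate from response parsing means the parsing half can be exercised against saved XML without a developer appid.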
According to the API documentation, the parse function "is a specialized function that performs only the initial parsing phase of Wolfram|Alpha processing. Its purpose is to quickly determine whether Wolfram|Alpha can make sense of a given input, bypassing the more time-consuming stages of fully analyzing the input and preparing results."
An example URL for the parse function would look like http://api.wolframalpha.com/v1/parse?input=chicago&appid=myappid. The results are wrapped in a <parseresult> tag that can also contain assumption tags.
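As a rough sketch, a client could use the parse function as a cheap pre-check before issuing a full query. The endpoint below is taken from the example URL above. The helper names are my own, and the sample <parseresult> document, including how its assumption tags are nested, is a hypothetical stand-in rather than real service output.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Endpoint taken from the example URL in the API documentation.
PARSE_ENDPOINT = "http://api.wolframalpha.com/v1/parse"


def build_parse_url(user_input, appid):
    """Return the GET URL for a parse-only request."""
    return PARSE_ENDPOINT + "?" + urlencode({"input": user_input, "appid": appid})


def count_assumptions(xml_text):
    """Count <assumption> tags anywhere inside a <parseresult> document."""
    root = ET.fromstring(xml_text)
    if root.tag != "parseresult":
        raise ValueError("expected <parseresult>, got <%s>" % root.tag)
    return len(list(root.iter("assumption")))


# HYPOTHETICAL response; the real tag layout and attributes may differ.
SAMPLE_PARSE_XML = (
    "<parseresult>"
    "<assumptions><assumption/><assumption/></assumptions>"
    "</parseresult>"
)

parse_url = build_parse_url("chicago", "myappid")
assumption_count = count_assumptions(SAMPLE_PARSE_XML)
print(parse_url)
print(assumption_count)
```

A non-zero count would suggest the input is ambiguous (a word like "chicago" has more than one possible meaning), which a client could surface to the user before spending time on the full query.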
Unfortunately, I was unable to validate the RESTful Python scripts I wrote to test the service, since every HTTP GET request to the API must carry a Wolfram-provided developer appid. The API documentation did not mention a call limit, though I suspect limits will be imposed on the number of queries per day (similar to Google's API) and/or the size of the result set being returned. There was also no mention of fees for using the data beyond non-commercial use, though it should not come as a great surprise to developers if such charges arise. However, as long as results remain free to web visitors, it's probably safe to assume that the API calls, at least in some limited form, will remain free as well. Otherwise, creative hackers will no doubt construct their own wrapper libraries for their favorite programming languages anyway, diluting the structured control Wolfram needs over the API's future direction. Given the smart choice of a simple RESTful interface, there is no doubt that numerous language examples (with Java, C#, Python, Perl and Ruby code snippets leading the way) will show up on the web shortly after the site goes live.
Obviously, Wolfram will be using this API for their own products, and the RESTful services are what will likely be leveraged to achieve the inline Wolfram|Alpha interactivity within Mathematica that Wolfram mentioned earlier.
Google has been my browser's home page for the past seven years, but with the release of Wolfram|Alpha, I may have finally found a worthy replacement. Better yet, if Wolfram (or even Google, for that matter) could integrate the other's search results with its own, a simple one-line query interface would cover both worlds, truly bridging the web's static, content-rich unstructured pages with dynamic, structured computational data.