Laying the Foundation: Revolution, Math for Databases and Big Data
Databases and American democracy both started with someone applying intellect to lay a foundation. Today I find that the efforts of two such foundation-layers are in need of advocacy, so here we go. One, David Childs, published important computer science research in the 20th century. The other, Thomas Jefferson, was an 18th-century political thought leader, a Founding Father of a new republic, and the author of many important documents.

My previous blog posting, "Sets, Data Models and Data Independence," provided an introduction to David Childs' research on extended set theory. Childs' work has gotten attention from those interested in performance-oriented solutions for processing large data sets, including the gigabyte-sized data sets used for TPC-H benchmarks.
Before discussing that, it's time to launch into a complaint about historical revisionism, which seems to be in play concerning Childs' research as a key contribution to the relational model. Another example of revisionism is the excision of Thomas Jefferson's writings, part of the foundation of American democracy, from history studies, a position recently advocated by the Texas Board of Education.
The Board of Education has been reviewing the Texas social studies and history curriculum with the goal of changing the textbooks purchased for Texas schools. Because Texas buys a large quantity of textbooks, it has considerable influence on the textbook market, and the changes sought by the Texas Board have national implications. A majority of the board members have apparently decided that Thomas Jefferson is not a model they want to use in teaching the history of America's Founding Fathers, and they have decided to exclude Jefferson's writings from the history taught in Texas schools. How they intend to treat authorship of the Declaration of Independence should be interesting!
Unlike the Texas episode, I'm not suggesting that Childs has intentionally been excised from the literature about the relational model and its mathematical underpinnings. The historical revisionism in this case has likely been an error of omission, not commission.
Dr. Edgar Codd cited Childs' research in his seminal 1970 paper "A Relational Model of Data for Large Shared Data Banks". But how often have we seen Childs linked with the concept of a formal mathematical foundation for managing data? How often have relational advocates pointed to Childs as contributing to the relational model and the notion of data independence?
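The foundational idea both Codd and Childs built on, that a relation is simply a set of tuples, so queries become set operations, can be illustrated in a few lines. The following sketch is mine, not from either author's papers; the relation, attribute positions, and data are invented for illustration.

```python
# A relation, in the formal sense, is a set of tuples.
# Each tuple here is (id, name, city); the data is purely illustrative.
employees = {
    (1, "Alice", "Ann Arbor"),
    (2, "Bob", "Detroit"),
    (3, "Carol", "Ann Arbor"),
}

def select(relation, predicate):
    """Selection (restriction): the subset of tuples satisfying a predicate."""
    return {t for t in relation if predicate(t)}

def project(relation, positions):
    """Projection: keep only the attributes at the given positions."""
    return {tuple(t[i] for i in positions) for t in relation}

# Compose the two set operations: names of employees in Ann Arbor.
in_ann_arbor = select(employees, lambda t: t[2] == "Ann Arbor")
names = project(in_ann_arbor, [1])
print(sorted(names))  # [('Alice',), ('Carol',)]
```

Because relations are plain sets, results carry no duplicates and no inherent ordering, which is exactly the data-independence property the set-theoretic foundation buys you.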
Perhaps the oversight is because Childs' research did not get the exposure of Codd's paper. Codd's 1970 paper was published in the Communications of the Association for Computing Machinery (CACM). Childs' papers saw limited dissemination because they were publications of the Research in the Conversational Use of Computers (CONCOMP) project at the University of Michigan.
The CONCOMP project was funded during the 1965-1970 timeframe. The project started when timesharing, interactive, and conversational computing were on the bleeding edge. Besides Childs' research on data structures, CONCOMP included research on data concentrators, audio response, computer graphics, and computer-aided design (CAD). CONCOMP also produced MAD/I, an extensible version of the Michigan Algorithm Decoder (MAD) language that supported user-defined types, a revolutionary concept at the time.
In order to meet its research objectives, CONCOMP also had to do other pioneering work, such as developing a hardware interface to support remote terminals connected to an IBM mainframe. The research on data concentrators and communications protocols fed into development of the ARPAnet, forerunner of the Internet. Childs' work fed into development of the relational model and SQL databases.
In the 1960s, most software was bundled with computer hardware or developed with government funding. The IBM unbundling decision of 1969 created the opportunity for software to become a multi-billion-dollar business. But in the '60s, government funding was a primary source of revenue for innovation by software houses and universities, and guidelines and software development standards from the government carried a lot of weight if you wanted to stay in business.
CONCOMP was sponsored by the Department of Defense Advanced Research Projects Agency (ARPA, known as DARPA today). ARPA funded CONCOMP during a time when automated theorem proving and computational logic were topics of interest, as eventually was mathematical proof of correctness for programs.
The CONCOMP papers were unclassified but not freely available; they carried a restriction that they were available only to a "qualified requester". Although Codd cited one of Childs' papers, I wonder how many people were able to read Childs' research at the time.
Further reading of historical interest: