Channels ▼

Bil Lewis

Dr. Dobb's Bloggers

Teaching Databases

July 14, 2008

As low man on the totem pole at Tufts, it was my honor to teach all
the database courses while I was there. (Don't get me wrong, I also
got to teach some totally cool classes of my own invention. Tufts was
wonderful to be at. I highly recommend it.) I inherited the book used
by the previous guy (who was very happy to leave it). It was a very
fine book in many ways, but I realized that it was so focused on the
details of how to make a database engine run fast, that it missed the
main point about why we want a database in the first place.

We want a database so we can persist information that we need to run
our business or lab or whatever. Anything that does not contribute to
that goal is worthless. No surprise there.

Classic Database courses (like the one my book was written for) focus
almost exclusively on issues of how to make relational database
queries run quckly. And so the major topics are query translation,
indexing, sorting, concurrency optimizations, ACID, etc.

These are all good things and should be understood by anyone getting a
degree in Computer Science. My complaint with this approach is that
the fundamental reason for having a database gets lost and we end up
with systems that store lots of data very efficently, but don't
actually do what we want them to.

So that was my big push in my classes--what do we want?

The answer to that question (as I'm sure you're aware, but that other
guy isn't, so I'm explaining for him) is "We want to persist some of
the data used in our applications." And all that stuff about
correctness, efficency, ACID, etc., is just detail subsumed by our
basic objective.

We want to persist data.

The data we want to persist is the data that our applications
need. And the form of that data is whatever form works best for the
application. 

Applications.

Much of the complication in database design comes from having multiple
applications that all want to share the same data, but they don't use
it in the same way.

You have a customer database that needs to record things like shipping
addresses and invoices. Great.

You have an employee database that needs to record things like health
care plan and home address. Also great.

Now one of your employees starts buying stuff from your
company. Wonderful!

The day comes when you want to send a mailing to all customers and all
employees. No need to send two mailing to the same person. How do you
know if customer Jane and also employee Jane? 

Are there two separate tables? If not, do you include all attributes
for all people, even if most of those attributes won't be used? If you
have two tables, how do you decide on which address or phone number to
use? If a request to change address comes in, do you change both
addresses? And these are the easy questions.

These are the issues that keep DBAs up at night.

Our book noted this issue, described the two options, then left us
without guidance as to how to actually make them work. To this day I
only have a vague idea how I would deal with this in the real world.

As part of my class, I assigned the students to "go forth and
interview a real DBA." Too much book learning ain't good for a
man. Somebody famous said that.

I have no qualms about asking students to distinguish between B-trees
and hashtables, nor to calculate precise performance figures for
both. But I don't expect to keep these details in their heads for the
rest of their lives. I expect them to understand what's important from
the application's point of view, and have a vague idea of the details,
secure in the knowlege that if they ever needed it again, they know
how figure it out.

Related Reading


More Insights






Currently we allow the following HTML tags in comments:

Single tags

These tags can be used alone and don't need an ending tag.

<br> Defines a single line break

<hr> Defines a horizontal line

Matching tags

These require an ending tag - e.g. <i>italic text</i>

<a> Defines an anchor

<b> Defines bold text

<big> Defines big text

<blockquote> Defines a long quotation

<caption> Defines a table caption

<cite> Defines a citation

<code> Defines computer code text

<em> Defines emphasized text

<fieldset> Defines a border around elements in a form

<h1> This is heading 1

<h2> This is heading 2

<h3> This is heading 3

<h4> This is heading 4

<h5> This is heading 5

<h6> This is heading 6

<i> Defines italic text

<p> Defines a paragraph

<pre> Defines preformatted text

<q> Defines a short quotation

<samp> Defines sample computer code text

<small> Defines small text

<span> Defines a section in a document

<s> Defines strikethrough text

<strike> Defines strikethrough text

<strong> Defines strong text

<sub> Defines subscripted text

<sup> Defines superscripted text

<u> Defines underlined text

Dr. Dobb's encourages readers to engage in spirited, healthy debate, including taking us to task. However, Dr. Dobb's moderates all comments posted to our site, and reserves the right to modify or remove any content that it determines to be derogatory, offensive, inflammatory, vulgar, irrelevant/off-topic, racist or obvious marketing or spam. Dr. Dobb's further reserves the right to disable the profile of any commenter participating in said activities.

 
Disqus Tips To upload an avatar photo, first complete your Disqus profile. | View the list of supported HTML tags you can use to style comments. | Please read our commenting policy.
 


Video