Mark Nelson

Dr. Dobb's Bloggers

More on XML

March 22, 2009

Bil Lewis has had a thing or two to say about why he isn't overly gaga about XML. Standing up to the XML orthodoxy can be a bit dangerous, as I'm sure Bil is finding out. When Jeff Atwood had the temerity to beef about XML in this piece on his blog, he had to deal with comments like this:

I expected more from you, Jeff. Sad to see such a talented programmer express something so idiotic. 


Add Me To the List

Despite the possibility that I'm going to get a plaque in the hall of shame, I have to chime in and say I'm not all that happy with XML either.

I recently started automating some of the inventory for my non-profit venture, Reading With Conviction (please donate), by adding a handheld scanner that can read ISBNs from books in order to quickly enter them into a database.

It turns out that I can get my hands on book data pretty quickly using a simple HTTP query to a free service that seems to have a fairly complete compendium of books in print.

When I make a query for information about a book, as you might expect, I get back a response in XML:

<ISBNdb server_time="2009-03-19T02:01:00Z">
<BookList total_results="1" page_size="10" page_number="1" shown_results="1">
<BookData book_id="the_data_compression_book" isbn="1558514341">
<Title>The Data Compression Book</Title>
<AuthorsText>Mark Nelson, Jean-Loup Gailly, </AuthorsText>
<PublisherText publisher_id="m_t_books">M&amp;T Books</PublisherText>

All I really want out of this is the Title element, which should be an easy enough task, right?


Sorry, Sir, This Is the Line for C++ Programmers

There may be easy ways to work with XML documents, but using Visual C++ and MSXML doesn't seem to give me the elegance that I am hoping for. In pseudocode, getting the title seems to take 10 steps, something along these lines:

  • Create an XML Document Object
  • Load the XML data into the Document Object
  • Get a list of nodes with the name "Title"
  • Get the length of that list and make sure it is 1
  • Get the first node from that list
  • Get a list of the child nodes of that Title Node
  • Make sure the list of child nodes has just a single element
  • Get the first node from the list of child nodes
  • Make sure that the first node is of type text
  • Get the text from that child node, which is the book Title

This Herculean set of tasks is complicated by a few additional points, including:

  • The function calls are all in COM, which turns the mundane into the difficult at every step.
  • Every function can return an error, which means backing out of this long path of object creation.
  • The string objects returned by the COM calls are BSTR objects, which are difficult to work with.

All I really want is to make one function call, something like GetValue( "Title" ).
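To make that wish concrete, here is a minimal sketch of what such a call might look like in portable C++. It is not MSXML and not a real XML parser: it is a plain substring search, and both the two-argument signature and the std::optional return type (which needs C++17) are assumptions made for illustration. It does not decode entities, and it would be fooled by a '>' inside an attribute value, by CDATA sections, or by nested elements of the same name.

```cpp
#include <cstddef>
#include <optional>
#include <string>

// Hypothetical helper: returns the raw text between the first <name ...>
// start tag and the next </name> end tag, or std::nullopt if none is found.
// This is a substring search, not an XML parser.
std::optional<std::string> GetValue(const std::string& xml, const std::string& name)
{
    const std::string open = "<" + name;
    const std::string close = "</" + name + ">";

    std::size_t pos = 0;
    while ((pos = xml.find(open, pos)) != std::string::npos) {
        const std::size_t after = pos + open.size();
        if (after >= xml.size())
            return std::nullopt;

        // Accept "<Title>" or "<Title attr=...>", but not "<TitleLong>".
        const char c = xml[after];
        if (c == '>' || c == ' ' || c == '\t' || c == '\n' || c == '\r') {
            // Skip past the end of the start tag.
            const std::size_t start = xml.find('>', after);
            if (start == std::string::npos)
                return std::nullopt;

            // Everything up to the end tag is treated as the value.
            const std::size_t end = xml.find(close, start + 1);
            if (end == std::string::npos)
                return std::nullopt;

            return xml.substr(start + 1, end - start - 1);
        }
        pos = after;  // Different element with the same prefix; keep looking.
    }
    return std::nullopt;
}
```

Called on the response shown earlier with the name "Title", this would hand back The Data Compression Book. Asking for "PublisherText" would return M&amp;T Books with the entity still encoded, which is one small reminder of why a real parser ends up doing so much more work.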

The "Not For Humans" Straw Man

A lot of people try to explain away the difficulties of using XML by saying something along the lines of "It's for machines to use, not people."

Well, this is true, but if machines are going to use it, people generally have to write the code, right? And that's me. So it's not much of a dodge.

I appreciate the precision that XML gives me, and I appreciate its flexibility, somewhat. And maybe it's just that I'm working in C++. But still... for simple configuration information, INI files used to be pretty nice. I miss them.
