What is XML good for, anyway?
March 12, 2009
I've never quite figured out XML.
I mean, it's obviously a serialization format for objects, but there
are lots of serialization formats. Why should I care about this one?
It's semi-human readable/writable.
For very small files, you can read it directly, but it's painful and
very inaccurate (ie, it's unlikely you'd notice a typo). There are
some editor packages that will do a little bit of formatting and error
checking for you, but they don't do much. And that's OK with me,
because I have no intention of writing them by hand.
So, what is it good for?
I don't see it as a persistence format. We have object databases and
object front ends to relational databases (eg, Hibernate) that work.
As a serialization format, it... isn't very impressive. The format is
very bulky and... I don't think there's anything else to be said. What
else do I care about?
In a vague sense it has a small advantage when dealing with different
programming languages. It'll be easier to debug than a binary format I
suppose. But I don't intend to get into the XML debugging business. I
just want to serialize my objects and read 'em back in.
The RMI serialization format works fine for me. I don't know anything
about it and I don't have to. And that is exactly what I want out of
my serialization functions. I never want to see the format. I just
want it to work, so I can spend my time analyzing Ribosomal genes.
With XML, I don't get the automatic generation of reader/writer
either. I have to go in and write it myself. Huh? The XML parsers I've
seen expect you to do string searches for keys, while in RMI I simply
access an instance variable.
String searches: for example, with XML I might write
    Node node1 = doc.findNode("Tank");
    Tank tank1 = Tank.convertXML(node1);
whereas with RMI I simply write
    Tank tank1 = configuration.getBattalion(0).getTank(0);
This does not fill me with warm fuzzies.
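To make the contrast concrete, here is a minimal sketch in Java. The Tank class, its model and count fields, and the little Battalion document are made up for illustration. The serialization half uses plain java.io object serialization (the mechanism RMI relies on to ship objects) rather than RMI itself, and the XML half uses the standard DOM parser, where every lookup takes a string key.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class SerializationDemo {
    // Hypothetical class for illustration; not the author's actual code.
    static class Tank implements Serializable {
        private static final long serialVersionUID = 1L;
        final String model;
        final int count;

        Tank(String model, int count) {
            this.model = model;
            this.count = count;
        }
    }

    // Built-in Java serialization: no hand-written writer.
    static byte[] write(Tank tank) throws Exception {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(tank);
        }
        return bytes.toByteArray();
    }

    // ... and no hand-written reader either.
    static Tank read(byte[] data) throws Exception {
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (Tank) in.readObject();
        }
    }

    // DOM parsing: elements and attributes are looked up by string key,
    // and the conversion into a Tank is written by hand.
    static Tank readXml(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        Element node = (Element) doc.getElementsByTagName("Tank").item(0);
        return new Tank(node.getAttribute("model"),
                        Integer.parseInt(node.getAttribute("count")));
    }

    public static void main(String[] args) throws Exception {
        Tank viaSerialization = read(write(new Tank("Sherman", 6)));
        Tank viaXml = readXml(
            "<Battalion><Tank model=\"Sherman\" count=\"6\"/></Battalion>");
        System.out.println(viaSerialization.model + " " + viaSerialization.count);
        System.out.println(viaXml.model + " " + viaXml.count);
    }
}
```

The serialization path never mentions a field name. The XML path has to name every element and attribute as a string, and a misspelled key only shows up at run time.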
I notice that XML is a big favorite in the configuration file world. I
don't quite understand why it's so popular there. And I don't quite
understand why they have so much configuration anyway.
I wrote an application server for a class I was teaching at Tufts a
couple of years ago. It required exactly ZERO configuration--no
configuration files, no annotations, nothing. You wrote your
application, jar'd it up, put it in the server's application directory,
and it ran.
So I have no idea why Tomcat wants all that redundant (?) information.
But I'm getting off the point.
Let's say we do want a configuration file. Would XML be a good format?
What do I want from a configuration file?
I want it to encode some data. We can think of it as being a single
object. (I actually parse my configuration files directly into
Configuration objects and then pull values from there.)
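A rough sketch of that idea, assuming a line-oriented "Key: value" format like the testbed file shown further down. The Configuration and TankGroup classes, and the reading of a Tank line as model, tactics name, and count, are guesses for illustration, not the actual testbed code.

```java
import java.util.ArrayList;
import java.util.List;

class Configuration {
    // Hypothetical holder for one "Tank:" line.
    // Assumed field order: model, tactics name, count.
    static class TankGroup {
        final String model;
        final String tactics;
        final int count;

        TankGroup(String model, String tactics, int count) {
            this.model = model;
            this.tactics = tactics;
            this.count = count;
        }
    }

    int width;
    int height;
    final List<TankGroup> tanks = new ArrayList<>();

    static Configuration parse(String text) {
        Configuration config = new Configuration();
        for (String rawLine : text.split("\\R")) {
            String line = rawLine.trim();
            // Skip blank lines and '#' comments.
            if (line.isEmpty() || line.startsWith("#")) {
                continue;
            }
            int colon = line.indexOf(':');
            if (colon < 0) {
                throw new IllegalArgumentException("Missing ':' in line: " + line);
            }
            String key = line.substring(0, colon).trim();
            String value = line.substring(colon + 1).trim();
            switch (key) {
                case "Dimensions": {
                    String[] parts = value.split("\\s*,\\s*");
                    config.width = Integer.parseInt(parts[0]);
                    config.height = Integer.parseInt(parts[1]);
                    break;
                }
                case "Tank": {
                    String[] parts = value.split("\\s+");
                    config.tanks.add(new TankGroup(
                        parts[0], parts[1], Integer.parseInt(parts[2])));
                    break;
                }
                default:
                    // An unknown key is reported instead of silently ignored.
                    throw new IllegalArgumentException("Unknown key: " + key);
            }
        }
        return config;
    }

    public static void main(String[] args) {
        String text = String.join("\n",
            "Dimensions: 500, 500",
            "# One King Tiger against 6 Shermans is about even.",
            "Tank: Sherman Tactics1 6",
            "Tank: King_Tiger Tactics6 1");
        Configuration config = Configuration.parse(text);
        System.out.println(config.width + "x" + config.height
            + ", " + config.tanks.size() + " tank groups");
    }
}
```

Once the file is parsed, the rest of the program only touches typed fields such as config.width or config.tanks.get(0).count. The string keys live in one place, the parser.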
Do I want it to be human readable/writable?
If I never looked at it with a text editor, I wouldn't care.
But I do want to edit it directly. That's the whole point. I want to
be able to change values in the file. I may well have a pile of
similar files that I want to change in a uniform fashion and using
EMACS beats the heck out of some specialized configuration editor,
where I might have to edit each of my 600 configuration files one at a time.
So, human readable/writable is valuable here.
Is XML the way to go?
Here's a typical configuration file I used for an AI testbed:
Dimensions: 500, 500
# One King Tiger against 6 Shermans is about even.
Tank: Sherman Tactics1 6
Tank: Grant Tactics2 10
Tank: PKW_4 Tactics1 5
Tank: King_Tiger Tactics6 1
The same information in XML would look about like this:
<?XML header stuff>
<! One King Tiger against 6 Shermans is about even.>
So which one would you rather edit?
(I trust you noticed there was one typo in each file?)
So if XML isn't good for configuration either, what is it good for?
Is there something major that I'm missing??