An Expat TSAXParser Implementation
The TSAXParser component comes in two flavors: one implemented on top of MSXML, and another with the same name, properties, events, methods, and behavior, but implemented on top of James Clark's Expat library (http://www.jclark.com/bio.htm).
To use the Expat-based component on Windows, you have to download the Win32 binary of expat.dll from http://sourceforge.net/projects/expat/ (you can also download the source there and build the DLL yourself with the MSVC6 compiler).
Before you can use the Expat DLL in Delphi, you have to translate the C header file expat.h to Pascal. I first ran Bob Swart's HeadConv 4.0 on expat.h to make a first cut of expat.pas, and then spent some hours hand-editing the result (correcting the translation errors made by HeadConv, and reformatting the code to make it more readable).
I then reimplemented the TSAXParser component using the C functions and callback routines exposed by the Expat library. This was straightforward, with a couple of exceptions.
The "Element Declaration Handler" implementation proved interesting, because here I had to free memory in Delphi that had previously been allocated by Expat. Luckily, James Clark provided for this by letting you specify your own memory allocator for the parser. You do this by creating the parser instance with the XML_ParserCreate_MM() function, one of whose arguments is a structure containing pointers to functions that implement equivalents of malloc(), realloc(), and free(). In my case, these use the Delphi memory allocator routines GetMem(), ReallocMem(), and FreeMem().
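The allocator arrangement described above can be sketched in C, the language of expat.h. This is a minimal illustration under stated assumptions, not the component's actual code: MemSuite is a stand-in that repeats the three members of Expat's XML_Memory_Handling_Suite so the sketch compiles without the Expat headers, and the host_* functions, the live_blocks counter, and roundtrip_live_blocks() are hypothetical names. The C runtime allocator stands in for the Delphi routines.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Stand-in with the same three members as Expat's
   XML_Memory_Handling_Suite (declared in expat.h). It is redeclared
   here only so the sketch compiles without the Expat headers. */
typedef struct {
    void *(*malloc_fcn)(size_t size);
    void *(*realloc_fcn)(void *ptr, size_t size);
    void (*free_fcn)(void *ptr);
} MemSuite;

/* Hypothetical counter: how many blocks handed out are still live. */
static long live_blocks = 0;

static void *host_malloc(size_t size) {
    void *p = malloc(size);
    if (p != NULL) live_blocks++;
    return p;
}

/* Resizing a live block leaves the count unchanged; a NULL input
   behaves like malloc. Zero-size requests are not handled here. */
static void *host_realloc(void *ptr, size_t size) {
    void *p = realloc(ptr, size);
    if (ptr == NULL && p != NULL) live_blocks++;
    return p;
}

static void host_free(void *ptr) {
    if (ptr != NULL) live_blocks--;
    free(ptr);
}

/* With the real library, a suite like this is what gets passed as the
   second argument of XML_ParserCreate_MM(). */
static const MemSuite host_suite = { host_malloc, host_realloc, host_free };

/* Simulates the library allocating a block that the host releases later. */
static void *library_allocates(const MemSuite *ms, size_t size) {
    return ms->malloc_fcn(size);
}

/* Allocates through the suite, resizes, frees on the host side, and
   returns the number of blocks still outstanding. */
long roundtrip_live_blocks(void) {
    void *model = library_allocates(&host_suite, 64);
    assert(model != NULL);
    model = host_suite.realloc_fcn(model, 128);
    assert(model != NULL);
    host_suite.free_fcn(model);
    return live_blocks;
}
```

Because both sides go through the same three functions, a block that the library allocates can safely be released by the host application, which is the property the element declaration handler relies on.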
Also, Expat is a SAX1 parser (with extensions in the current version), not a SAX2 parser. To remain compatible with the MSXML version, I emulated some MSXML behavior in TSAXParser, such as the way namespaces and namespace prefixes are handled, and the reporting of attribute types. I decided to keep it simple and just maintain a couple of lists built by the element declaration and attribute declaration handlers. Later in the parsing process, the element handlers can look up this data and pass it on to the application as needed.
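The list-and-lookup idea for attribute types might look roughly like the following C sketch. It is an assumption about the general shape, not the component's actual data structure: AttrDecl, record_attr_decl(), lookup_attr_type(), and free_attr_decls() are hypothetical names. The sketch keeps the first declaration of an attribute, as XML requires, and falls back to "CDATA" for undeclared attributes, as SAX2 specifies.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* One recorded attribute declaration (hypothetical layout). */
typedef struct AttrDecl {
    char *element;
    char *attribute;
    char *type;
    struct AttrDecl *next;
} AttrDecl;

static char *copy_string(const char *s) {
    size_t n = strlen(s) + 1;
    char *p = malloc(n);
    if (p != NULL) memcpy(p, s, n);
    return p;
}

static const AttrDecl *find_decl(const AttrDecl *list,
                                 const char *el, const char *att) {
    for (; list != NULL; list = list->next)
        if (strcmp(list->element, el) == 0 &&
            strcmp(list->attribute, att) == 0)
            return list;
    return NULL;
}

/* Would be called from an attribute declaration handler.
   Returns 0 on success, -1 if memory runs out. */
int record_attr_decl(AttrDecl **list, const char *el,
                     const char *att, const char *type) {
    AttrDecl *d;
    assert(list != NULL);
    if (find_decl(*list, el, att) != NULL)
        return 0;                      /* first declaration wins */
    d = malloc(sizeof *d);
    if (d == NULL) return -1;
    d->element = copy_string(el);
    d->attribute = copy_string(att);
    d->type = copy_string(type);
    if (d->element == NULL || d->attribute == NULL || d->type == NULL) {
        free(d->element); free(d->attribute); free(d->type); free(d);
        return -1;
    }
    d->next = *list;
    *list = d;
    return 0;
}

/* Would be called from a start element handler for each attribute. */
const char *lookup_attr_type(const AttrDecl *list,
                             const char *el, const char *att) {
    const AttrDecl *d = find_decl(list, el, att);
    return d != NULL ? d->type : "CDATA";
}

void free_attr_decls(AttrDecl *list) {
    while (list != NULL) {
        AttrDecl *next = list->next;
        free(list->element); free(list->attribute); free(list->type);
        free(list);
        list = next;
    }
}
```

A linear list keeps the sketch short. For documents with large DTDs, a hash table keyed on the element and attribute names would be the obvious replacement.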
As I do not (yet) have a copy of Borland's Kylix (described as "Delphi for Linux"), I could not test the component on Linux, but it should run virtually unmodified with Kylix (the references to expat.dll will have to be changed to expat.so, I guess).
Expat not only proved superior in performance to the MSXML parser, it also parsed without a hitch a couple of valid XML documents that caused MSXML to throw an OLE exception.
There is only one SAX2 event that I have not yet implemented in the Expat version: the "Unparsed Entity Handler." Complete source code for the Expat component and the same example applications that come with the MSXML version are available electronically; see "Resource Center," page 5.
Finally, I could also have used the C++ Xerces XML parser (by the Apache XML group, http://xml.Apache.org/), which features COM interfaces to make it compatible with MSXML on Windows, but I doubt that these would be usable in Kylix. Linking C++ code with Object Pascal is also complicated by C++ name mangling, so I preferred to use Expat.
D.H.