Ken North

Dr. Dobb's Bloggers

Legislation, Tracking Applications and XML Plumbing

January 25, 2011

The eXtensible Markup Language (XML), and databases that store XML, play a significant role in systems for authoring and tracking legislation. XML provides the plumbing, and XML servers provide an enabling technology, for applications that lawmaking bodies use and for web sites that report on lawmakers and legislation.

It's been 13 years since XML 1.0 arrived on the scene and caused us to take a closer look at storing and querying unstructured information. Jonathan Robie, Daniela Florescu and Don Chamberlin (co-inventor of SQL) developed the Quilt query language, which soon became the departure point for the W3C XML Query Language Working Group to create XQuery 1.0.

After release of the XML 1.0 specification, there was a wave of XML adoption for data interchange, data integration and messaging. There was also a surge of interest in a new class of data stores, commonly referred to as 'XML servers' and 'native XML databases', that stored an XML document as a unit.

Among the software products that can store XML and query it via XQuery is MarkLogic Server, which is now in release 4.2. MarkLogic has been on my radar for years because the company's Principal Technologist, Jason Hunter, was a popular speaker at a series of conferences I chaired. I've watched the company emerge as a leader in a space that was populated by dozens of open source and commercial XML data stores, and eventually XQuery engines.
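For readers who haven't worked with a document store, here is a minimal sketch of the idea of keeping each XML document as a unit and querying inside it. It is not MarkLogic's API and it is not XQuery; it uses the path expressions in Python's standard library as a stand-in. The bill documents, the element and attribute names, and the helper functions are all hypothetical and exist only for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical store: each value is one whole XML document, kept as a unit.
# Element and attribute names are invented for this sketch.
STORE = {
    "hr-101": "<bill status='pending'><title>Water Quality Act</title>"
              "<section num='1'>Short title.</section>"
              "<section num='2'>Funding.</section></bill>",
    "s-202": "<bill status='enacted'><title>Rural Broadband Act</title>"
             "<section num='1'>Short title.</section></bill>",
}


def titles_with_status(store, status):
    """Return (doc_id, title) pairs for documents whose root has the given status."""
    results = []
    for doc_id, text in sorted(store.items()):
        root = ET.fromstring(text)  # parse the stored document
        if root.get("status") == status:
            results.append((doc_id, root.findtext("title")))
    return results


def section_count(store, doc_id):
    """Count <section> elements anywhere in one stored document."""
    root = ET.fromstring(store[doc_id])
    return len(root.findall(".//section"))


print(titles_with_status(STORE, "pending"))
print(section_count(STORE, "hr-101"))
```

A real XML server would index the documents rather than parse each one per query, and XQuery can express joins and transformations that this sketch does not attempt.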

Among the companies that have successfully ridden the XML wave are those building legislative applications. One such company, Propylon, created one of the first XML-based legislative systems for the Irish Parliament. Propylon CTO Sean McGrath was an invited expert of the W3C group that defined the XML specification. Propylon has adopted the Legislative Open Document Format (LODF), which applies domain-specific document semantics atop the Open Document Format XML standard (ISO/IEC 26300:2006).

Another important player in the legislative space is CQ Roll Call, a news organization that has built several applications using MarkLogic Server. CQ Roll Call is an Economist Group news organization that produces several publications, including Roll Call, Congressional Quarterly and Congress.org. It provides facts and analysis about elections and the legislative process, with about 180 staff members covering the U.S. Congress.

The first Roll Call application that used MarkLogic Server was an integration of the Public Laws database and U.S. Code with the ability to track pending legislation. The second application built on MarkLogic provides closed-captioned video from the House and Senate floor.

An application rolled out this month enables users to go to the CQ Roll Call web site for a variety of unstructured information, including hearing transcripts and videos of Congress in action. This new application supports more than 30 different sources of unstructured information. Besides enabling users to read a daily briefing, news and analysis, it also tracks legislation and the voting records of members of Congress. The web site also offers Knowlegis, an interesting advocacy tool for tracking interactions with Congress.

Knowlegis advocacy tool screen shot

A decade before 'Big Data' and 'NoSQL' entered the lexicon of the software community, there were developers attacking the problem of storing and serving up massive collections of documents and other unstructured information.
