Dr. Dobb's is part of the Informa Tech Division of Informa PLC


Securing XML


Mar02: Securing XML

Amir is CTO of NewGenPay and he can be contacted at [email protected].


The Extensible Markup Language (XML) lets richly structured documents be displayed on the Web in a common format and provides common interoperable markup for document exchange between organizations. XML is widely used for electronic business and commerce protocols over unsecured networks such as the Internet and mobile (wireless) networks. However, security is extremely important for many applications using XML and requires:

  • Confidentiality. Hiding parts or all of the communication from eavesdroppers.
  • Authentication and integrity. Ensuring that messages are received exactly as sent, with proper identification of the sender and without replays or reordering.

  • Authorization. Ensuring that requests/operations are done only by authorized parties.

  • Nonrepudiation. Providing auditable proofs of message transmission, including proof of origin and delivery.

  • Clogging prevention. Ensuring that (computationally intensive) public key operations, and other resource-consuming operations, are limited to identified parties.

In this article, I examine several proposed XML security mechanisms, focusing on the joint IETF/W3C emerging standards of XML Digital Signature (DSIG) (http://www.w3.org/TR/xmldsig-core/) and XML Encryption (http://www.w3.org/TR/2001/WD-xmlenc-core-20010626/). Implementations of both are available at http://www.alphaworks.ibm.com/tech/xmlsecuritysuite/. To illustrate how you can use DSIG and XML Encryption, I use as an example NewGenPay — an open, interoperable payment system based on these two standards — which is produced by the company I work for (http://www.newgenpay.com/).

Secure Hashing of XML Objects

An important cryptographic component used in XML signatures is collision-resistant hashing. A collision-resistant hash function maps a variable-length string to a fixed-length string (often 128 or 160 bits), such that it is difficult to find two strings that hash to the same value (a collision). Cryptographic hash functions are also required to be one-way; that is, it is hard to find the preimage (the input to the hash) from the result of the hash. These properties make hashing useful for many nonrepudiation applications, usually in conjunction with public key signatures. DSIG and some other publications refer to the result of hashing as a message digest.

The Reference object in Listing One (available electronically; see "Resource Center," page 5) has an optional Uniform Resource Identifier (URI) attribute, identifying the hashed resource. A reference binds a specific resource (identified by its URI) to its content. The DSIG Reference element specifies the hash value of the resource in <DigestValue> and the hash algorithm in the Algorithm attribute of <DigestMethod>, with SHA-1 as mandatory to implement.
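As a rough illustration of what a Reference carries, the following Python sketch builds one with the standard library. The helper name make_reference and the example URI are hypothetical and are not taken from Listing One; the namespace and SHA-1 algorithm identifier are the ones defined by the DSIG specification.

```python
import base64
import hashlib
import xml.etree.ElementTree as ET

# XML Signature namespace; the SHA-1 identifier is this namespace plus "sha1".
DSIG_NS = "http://www.w3.org/2000/09/xmldsig#"


def make_reference(uri: str, octets: bytes) -> ET.Element:
    """Build a Reference holding the SHA-1 digest of `octets` (hypothetical helper)."""
    ref = ET.Element(f"{{{DSIG_NS}}}Reference", {"URI": uri})
    ET.SubElement(ref, f"{{{DSIG_NS}}}DigestMethod", {"Algorithm": DSIG_NS + "sha1"})
    value = ET.SubElement(ref, f"{{{DSIG_NS}}}DigestValue")
    # DigestValue holds the base64 encoding of the raw 20-byte (160-bit) digest.
    value.text = base64.b64encode(hashlib.sha1(octets).digest()).decode("ascii")
    return ref


if __name__ == "__main__":
    reference = make_reference("http://example.org/order.xml", b"abc")
    print(ET.tostring(reference, encoding="unicode"))
```

The octets passed in are assumed to be the output of any transforms already applied to the resource.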

Committing to a value by a hash requires the use of a specific octet (byte) stream for evaluation and verification. The optional <Transforms> element specifies one or more transformations that you apply, in sequence, to the resource to create the exact octet stream. The URI defines input to the first transform, and output of each transform is input to the next. DSIG defines several standard transforms: base64 decoding, XPath filtering, Enveloped Signature (discussed later), XSLT mapping, and canonicalization.

The canonicalization transforms, such as the mandatory-to-implement Canonical XML (http://www.w3.org/TR/2001/REC-xml-c14n-20010315), produce the canonical form of an XML object. The canonical form remains the same after any syntactic (but not semantic) modification to the object. By canonicalizing an XML object before hashing, the hash (and any signature applied to it) remains valid even if you change the document syntax; for example, for namespace resolution or to use a different line-ending convention.
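The effect of canonicalizing before hashing can be seen in a small Python sketch. It assumes Python 3.8 or later, whose xml.etree.ElementTree.canonicalize function implements the newer C14N 2.0 rather than the Canonical XML 1.0 recommendation cited above, so it is a stand-in for the idea rather than a DSIG-conformant transform. The sample documents are invented.

```python
import hashlib
import xml.etree.ElementTree as ET


def canonical_digest(xml_text: str) -> str:
    """SHA-1 (hex) of the canonical form of `xml_text`.

    Assumption: C14N 2.0 from the standard library stands in for the
    Canonical XML 1.0 transform that DSIG makes mandatory.
    """
    canonical = ET.canonicalize(xml_text)
    return hashlib.sha1(canonical.encode("utf-8")).hexdigest()


# Same information, different syntax: attribute order, quote style,
# extra whitespace inside the start tag, and empty-element notation.
doc_a = '<order id="7" currency="USD"><item/></order>'
doc_b = "<order currency='USD'   id='7'><item></item></order>"

if __name__ == "__main__":
    print(hashlib.sha1(doc_a.encode()).hexdigest() == hashlib.sha1(doc_b.encode()).hexdigest())
    print(canonical_digest(doc_a) == canonical_digest(doc_b))
```

Hashing the raw strings gives different values, while hashing the canonical forms gives the same value for both documents.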

XML Digital Signatures (DSIG)

The XML Digital Signature provides the security services of data integrity, authentication, and (optionally) nonrepudiation. Figure 1 shows, in a simplified shorthand notation, the structure of DSIG signatures with its four elements. Elements appear zero or more times if followed by "*", zero or once if followed by "?", and once or more if followed by "+". When not followed by a symbol, elements appear only once. I removed attributes and contents in this notation. Listing Two (also available electronically) is an example of a signature object using three of its four elements.

The signature object contains the cryptographic hash (digest) of any signed information, and a reference to the information itself. The signed information may be an arbitrary document (handled as an opaque sequence of bits). However, more often, it will be an XML object (a complete document or fragments of a document). The ability to sign only specific elements of XML documents is one of the most important features of DSIG. It lets the unsigned parts of the XML document be enhanced, modified, or removed (for privacy or efficiency), keeping the signature valid.

DSIG signatures may contain the signed XML object (enveloping), be contained in the signed XML object (enveloped), or be detached from the signed object or document. When the signed XML object envelops the signature, the signature value itself must not be included in the signature calculation and validation. For this, you use the enveloped-signature transform, which removes the whole Signature element in which it is contained from the digest calculation.
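A simplified Python sketch of the idea behind the enveloped-signature transform follows. It is an approximation under stated assumptions: it drops every Signature element in the document, whereas the real transform removes only the Signature that contains the transform, and it uses the standard library's C14N 2.0 (Python 3.8 or later) in place of Canonical XML 1.0. The function name and the sample documents are hypothetical.

```python
import hashlib
import xml.etree.ElementTree as ET

DSIG_NS = "http://www.w3.org/2000/09/xmldsig#"
SIGNATURE_TAG = f"{{{DSIG_NS}}}Signature"


def digest_without_signature(document: str) -> str:
    """Digest a document as if its Signature element(s) were absent."""
    root = ET.fromstring(document)
    # Snapshot the element list first so removals do not disturb iteration.
    for parent in list(root.iter()):
        for child in list(parent):
            if child.tag == SIGNATURE_TAG:
                parent.remove(child)
    stripped = ET.tostring(root, encoding="unicode")
    canonical = ET.canonicalize(stripped)
    return hashlib.sha1(canonical.encode("utf-8")).hexdigest()


unsigned = '<order id="7"><item>book</item></order>'
signed = (
    '<order id="7"><item>book</item>'
    f'<ds:Signature xmlns:ds="{DSIG_NS}">'
    "<ds:SignatureValue>AAAA</ds:SignatureValue>"
    "</ds:Signature></order>"
)

if __name__ == "__main__":
    print(digest_without_signature(signed) == digest_without_signature(unsigned))
```

Because the Signature element is excluded, inserting the signature value into the document after computing the digest does not invalidate that digest.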

Public key digital signatures that provide nonrepudiation, such as RSA, are computationally intensive operations; therefore, DSIG also allows shared-key authentication with a Message Authentication Code (MAC), which provides authentication but not nonrepudiation. Collision-resistant hashing of the signed content is also used to reduce the computational cost. Figure 2 illustrates the process for creating DSIG signatures.
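The shared-key alternative can be sketched in a few lines of Python. The sketch assumes HMAC-SHA1 (identified in DSIG as http://www.w3.org/2000/09/xmldsig#hmac-sha1) and assumes the caller has already produced the canonical octets of <SignedInfo>; the function names and sample values are hypothetical.

```python
import base64
import hashlib
import hmac


def mac_signed_info(shared_key: bytes, signed_info_octets: bytes) -> str:
    """Return a base64 HMAC-SHA1 tag over the canonical SignedInfo octets."""
    tag = hmac.new(shared_key, signed_info_octets, hashlib.sha1).digest()
    return base64.b64encode(tag).decode("ascii")


def verify_mac(shared_key: bytes, signed_info_octets: bytes, signature_value: str) -> bool:
    """Recompute the tag and compare in constant time."""
    expected = mac_signed_info(shared_key, signed_info_octets)
    return hmac.compare_digest(expected, signature_value)


if __name__ == "__main__":
    key = b"example shared key"              # placeholder, not a real key
    octets = b"<SignedInfo>...</SignedInfo>"  # placeholder for canonical octets
    value = mac_signed_info(key, octets)
    print(verify_mac(key, octets, value))
```

Anyone holding the shared key can produce the same tag, which is why a MAC authenticates the message to the other party but cannot serve as proof of origin to a third party.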

An XML DSIG <SignedInfo> aggregate may contain multiple Reference elements in the same document; see Figure 3. This lets the same signature sign (authenticate) multiple resources. The main motivation for this is efficiency: a single public key signature (over the <SignedInfo> element) provides authentication, integrity, and nonrepudiation for all of the Reference elements in the <SignedInfo>. DSIG also supports an optional Manifest object, which <SignedInfo> can include through a Reference. The Manifest object is simply a list of one or more Reference elements, but the validity of those references is the responsibility of the application. The Manifest element thus allows application-controlled validation of signed resources, where the application decides which resources must be available and valid. The Manifest element may be included in the Signature element by placing it in an Object element.

Validation of DSIG signatures involves validating each reference (against its hash, <DigestValue>), and then validating the signature over the entire <SignedInfo> element. Figure 4 illustrates this process. Validating the hashes of the references is inexpensive and detects any resources that were invalidated, even unintentionally. It is therefore preferable to perform this step before the computationally intensive validation of the public key signature, which also helps prevent clogging attacks.
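The ordering described here, cheap digest checks first and the expensive signature check last, can be sketched as follows. The data shapes (a list of URI and digest pairs, a dictionary of fetched resources, and a callable standing in for signature verification) are assumptions made for illustration and are not part of DSIG.

```python
import hashlib
import hmac
from typing import Callable, Dict, List, Tuple


def validate(
    references: List[Tuple[str, bytes]],
    resources: Dict[str, bytes],
    check_signature: Callable[[], bool],
) -> bool:
    """Validate reference digests first; verify the signature only if all match."""
    for uri, expected_digest in references:
        data = resources.get(uri)
        if data is None:
            return False  # resource unavailable
        actual = hashlib.sha1(data).digest()
        if not hmac.compare_digest(actual, expected_digest):
            return False  # resource changed; skip the expensive step entirely
    # Only now pay for the public key (or MAC) operation over SignedInfo.
    return check_signature()


if __name__ == "__main__":
    refs = [("doc1", hashlib.sha1(b"payload").digest())]
    print(validate(refs, {"doc1": b"payload"}, lambda: True))
    print(validate(refs, {"doc1": b"tampered"}, lambda: True))
```

In this arrangement a request with a corrupted resource is rejected without ever reaching the signature check.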

XML Encryption

XML Encryption defines a process for encrypting plaintext data, producing ciphertext, and decrypting the ciphertext to retrieve the plaintext data. It also defines the XML syntax to represent the ciphertext and information needed for the intended recipient to decrypt the ciphertext. The plaintext may be an arbitrary file, an entire XML document, or a fragment of an XML document. Only two forms of fragments of XML documents are allowed — an XML element or XML element content. Figure 5 shows the structure of the <EncryptedData> element in simplified notation. It contains three elements: <EncryptionMethod>, defining the encryption algorithm; <KeyInfo>, providing or identifying the encryption key; and <CipherData>, providing the ciphertext. Both <EncryptionMethod> and <KeyInfo> are optional — the sender and receiver may agree on the encryption method and key in advance. Several elements use the definitions from DSIG, as indicated by use of the "ds:" namespace; for example, <ds:KeyInfo>. I used the wildcard character "*" in <ds:*> to identify any one of several possible additional elements that DSIG allows to appear in KeyInfo, such as <X509Data>, mostly used to provide a public key. Listing Three (available electronically) is an example of an XML Encryption object.
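The shape of <EncryptedData> described above can be sketched with Python's standard library. The sketch only assembles the XML around ciphertext supplied by the caller; it does not encrypt anything. The helper name is hypothetical, and the xmlenc namespace and the Triple DES identifier used in the example are taken from the later W3C Recommendation rather than from the working draft cited in this article.

```python
import base64
import xml.etree.ElementTree as ET
from typing import Optional

# Assumption: namespace as published in the final XML Encryption Recommendation.
XENC_NS = "http://www.w3.org/2001/04/xmlenc#"
DSIG_NS = "http://www.w3.org/2000/09/xmldsig#"


def build_encrypted_data(
    ciphertext: bytes,
    key_name: Optional[str] = None,
    algorithm: Optional[str] = None,
) -> ET.Element:
    """Wrap already-encrypted octets in an EncryptedData element."""
    enc = ET.Element(f"{{{XENC_NS}}}EncryptedData")
    if algorithm is not None:  # optional: may be agreed on in advance
        ET.SubElement(enc, f"{{{XENC_NS}}}EncryptionMethod", {"Algorithm": algorithm})
    if key_name is not None:   # optional: may be agreed on in advance
        key_info = ET.SubElement(enc, f"{{{DSIG_NS}}}KeyInfo")
        ET.SubElement(key_info, f"{{{DSIG_NS}}}KeyName").text = key_name
    cipher_data = ET.SubElement(enc, f"{{{XENC_NS}}}CipherData")
    cipher_value = ET.SubElement(cipher_data, f"{{{XENC_NS}}}CipherValue")
    cipher_value.text = base64.b64encode(ciphertext).decode("ascii")
    return enc


if __name__ == "__main__":
    element = build_encrypted_data(
        b"\x00\x01opaque", key_name="k1", algorithm=XENC_NS + "tripledes-cbc"
    )
    print(ET.tostring(element, encoding="unicode"))
```

Leaving out both optional arguments yields an element that contains only <CipherData>, matching the case where the parties agreed on method and key in advance.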

XML Encryption follows the process illustrated (slightly simplified) in Figure 6. If the recipient does not know the decryption key in advance, then the sender generates and sends it. You protect the key in transit by encryption using the EncryptedKey object or key agreement (via <AgreementMethod>). Either can be sent inside the EncryptedData's <KeyInfo> element or referenced from <KeyInfo> using <RetrievalMethod>.

If the plaintext data (to encrypt) is an XML element or content, you encode it using UTF-8 and perform any necessary transforms on it; otherwise, if it is an external resource, you simply consider it as an octet sequence. You then encrypt the data, creating CipherValue, which you place in (or reference from) EncryptedData.

Care must be taken when signing content that may later be encrypted; clearly, the content must be restored to exactly the original plaintext form for the signature to validate properly. To restore the plaintext (decrypt the ciphertext) in the signed content, use the Decryption Transform for XML Signature defined by the XML Encryption joint W3C and IETF Working Group (http://www.w3.org/TR/xmlenc-decrypt). This transform also allows specification of XML fragments that were encrypted and then signed with the rest of the document and, therefore, are not decrypted to validate the signature. Often, encrypted fragments are removed from the signed information by using the XPath transform in the Reference element, since the meaningful information is the plaintext (nonencrypted) version. We can sign the plaintext version of an encrypted element by including an appropriate Reference element pointing to it, in the SignedInfo element or in a Manifest element.

When the signed document is confidential and encrypted after being signed, you should also protect against surreptitious forwarding, in which the recipient forwards the signed confidential document to a competitor, encrypted by the competitor's public key, trying to make it look as if the sender sent the confidential information. To prevent surreptitious forwarding, the signer should append the recipient identities to the document being signed or use some other countermeasure (see "Defective Sign-and-Encrypt," by Don Davis, DDJ, November 2001).

Secure XML Transport Protocol (SeXTP)

The Secure XML Transport Protocol (SeXTP) is a simple, efficient, and highly secure transport protocol for client-server applications. SeXTP allows stateless servers and has minimal session setup overhead. It may be useful for many client-server e-commerce applications, especially where the TLS/SSL protocol (see http://www.ietf.org/rfc/rfc2246.txt) is too heavy, requiring substantial session setup overhead and complex implementation. Table 1 compares SeXTP and TLS/SSL. Most services are common to both but, being stateless, SeXTP does not provide freshness or prevention of replay for requests. (Prevention of replays when the server maintains state is easy. In a session protocol such as TLS/SSL, the server sends a challenge, which the client has to return with the request. In a noninteractive protocol, the client sends a request counter with each request.)
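The counter-based replay check mentioned in the parenthetical note can be sketched as follows for a server that does keep per-client state. The class and method names are hypothetical, and the sketch assumes the counter arrives inside the authenticated part of the request, so an attacker cannot alter it without breaking the MAC.

```python
from typing import Dict


class CounterReplayGuard:
    """Accept each client's requests only with a strictly increasing counter."""

    def __init__(self) -> None:
        self._last_seen: Dict[str, int] = {}

    def accept(self, client_id: str, counter: int) -> bool:
        # A counter at or below the last accepted value means the request
        # is a replay (or arrived out of order) and is refused.
        if counter <= self._last_seen.get(client_id, -1):
            return False
        self._last_seen[client_id] = counter
        return True


if __name__ == "__main__":
    guard = CounterReplayGuard()
    print(guard.accept("client-a", 1))  # first use of counter 1
    print(guard.accept("client-a", 1))  # same counter again
```

The per-client table is exactly the state that a stateless SeXTP server avoids keeping, which is why SeXTP itself offers only the limited freshness assurances described later.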

SeXTP provides secure transport of application request-response messages with authentication, confidentiality, clogging prevention, and nonrepudiation. The secure transport uses a secret key shared between the parties, used for efficient encryption and authentication (MAC).

SeXTP uses the XML Encryption and DSIG recommendations for providing confidentiality and authentication, including nonrepudiation and integrity.

To encrypt, replace the SeXTP <Protocol> element with the XML Encryption <EncryptedData> element, and identify the key with a <KeyName> identifier. To protect the integrity of the plaintext, namely ensuring that the authentication and signature refer to the plaintext resulting from decryption, we use the Decryption Transform for XML Signature (http://www.w3.org/TR/xmlenc-decrypt). Apart from these minor issues, this is a simple instance of the XML Encryption specification.

To simplify implementations and deployments, SeXTP makes and allows a few simplifications, at least compared to the DSIG specification. Most simplifications are by virtue of SeXTP's simple message structure; see Figure 7.

The <Authenticator> object contains the <MAC> and (optionally) <PKSignature> elements authenticating (and optionally signing) the SeXTPtba object. In the basic SeXTP implementation, authentication (and signature) is always on the complete SeXTPtba object. This substantially reduces the requirements on DSIG implementations. In particular:

  • Implementations can ignore the fact that the SeXTPtba object is in XML and sign it as an octet stream before input into an XML parser. In this case, canonicalization is not necessary.
  • DSIG implementations need only the (simple) "detached" mode and not the (more complex) enveloped/enveloping modes.

  • DSIG implementations always sign a single reference (the SeXTPtba part). In particular, there is no need to support the Manifest element.

Another simplification is syntactic. To properly use SeXTP, the server and client must agree in advance on the algorithms (methods) they use. There is no need to send this information with each encryption and signature (or MAC), as is mandatory in DSIG (but not in XML Encryption). Therefore, SeXTP makes the Method elements (DigestMethod, for example) optional. To distinguish between a public key signature and a MAC, it defines the distinct tags <PKSignature> and <MAC>, both derived from DSIG's <Signature> with the Method elements made optional.

SeXTP Messages

Figure 7 shows SeXTP's message structure. SeXTP messages consist of two distinct XML objects, the SeXTPtba ("to be authenticated") object and the SeXTP Authenticator object.

The <SeXTPtba> object consists of the SeXTP header (SeXTPRQ for requests, SeXTPRS for responses) and <Protocol>, which contains arbitrary, protocol-specific elements. For confidentiality, we encrypt these elements, and <Protocol> then contains the <EncryptedData> element of XML Encryption.

The <SeXTPRQ> and <SeXTPRS> headers contain SeXTP management information, identifying client and server, for nonrepudiation. In addition, both contain a unique request ID, randomly selected by the client, ensuring freshness.

Response headers also contain unique response IDs. The client returns the last response ID in requests. However, the server does not reject a request whose response ID is not the last one sent, since requests are not necessarily in synchronized sequence. We log the response ID and use it for auditing and monitoring. To provide a limited assurance of freshness for requests, <SeXTPRQ> includes the time per the client's clock as well as the last time received from the server; see Listing Four (available electronically).
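One way a stateless server could use the echoed server time is to compare it with its own clock, as in the Python sketch below. The function name and the five-minute window are assumptions chosen for illustration, not values from the SeXTP design, and the sketch assumes the header is covered by the MAC so that the echoed time cannot be forged.

```python
def is_fresh(echoed_server_time: float, now: float, max_age_seconds: float = 300.0) -> bool:
    """Return True if the server time echoed by the client is recent enough.

    Both arguments are seconds on the server's own clock, so no clock
    synchronization with the client is needed for this check.
    """
    age = now - echoed_server_time
    # Reject values from the future as well as values that are too old.
    return 0.0 <= age <= max_age_seconds


if __name__ == "__main__":
    print(is_fresh(echoed_server_time=1000.0, now=1120.0))
    print(is_fresh(echoed_server_time=1000.0, now=2000.0))
```

This gives only the limited assurance the text describes: a request can still be replayed within the accepted window.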

SeXTP Shared Key Initialization and Refresh

SeXTP has a simple request-response to initialize and refresh the shared secret session key (k) and provide the public key of each party to the other. These mechanisms are not specific to XML and may be omitted for simple implementation using a fixed shared secret key.

SeXTP uses one of the following two mechanisms to secure key initialization and refresh:

  • Client and server share a secret initialization key.
  • Client knows or can validate the server's public key (for example, via a certificate). In addition, the server may know or be able to validate the client's public key.

With a shared secret initialization code, this code authenticates the request and response (MAC). This efficient authentication lets you identify bogus requests (clogging attempt) before performing any computationally intensive public key operations. When such a code does not exist a priori, SeXTP begins with an unauthenticated "init code request." If the server allows initialization without a secret code, it replies with a hash of the current time and the IP address of the client to be used as the initialization code. The server can then detect clogging attempts per IP address. This resembles the cookie mechanism to prevent clogging in the Internet Key Exchange (IKE; http://www.ietf.org/rfc/rfc2409.txt). A secret initialization key can be used for improved security against clogging attacks and whenever it is practical.
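The initialization code described above might be computed along the following lines. The article specifies only a hash of the current time and the client's IP address; keying the hash with a server-only secret (as IKE-style cookies do) and rounding time into fixed buckets are assumptions added here so that the server can recheck a code without storing it. All names and the 60-second bucket are hypothetical.

```python
import hashlib
import hmac


def init_code(server_secret: bytes, client_ip: str, now: float, bucket_seconds: int = 60) -> str:
    """Derive an initialization code from a time bucket and the client IP."""
    bucket = int(now // bucket_seconds)
    message = f"{bucket}|{client_ip}".encode("ascii")
    return hmac.new(server_secret, message, hashlib.sha1).hexdigest()


def check_init_code(
    server_secret: bytes, client_ip: str, code: str, now: float, bucket_seconds: int = 60
) -> bool:
    """Accept a code issued in the current or the previous time bucket."""
    for offset in (0, 1):
        candidate = init_code(
            server_secret, client_ip, now - offset * bucket_seconds, bucket_seconds
        )
        if hmac.compare_digest(candidate, code):
            return True
    return False


if __name__ == "__main__":
    secret = b"server-only secret"  # placeholder value
    code = init_code(secret, "192.0.2.1", now=1000.0)
    print(check_init_code(secret, "192.0.2.1", code, now=1030.0))
    print(check_init_code(secret, "192.0.2.99", code, now=1030.0))
```

Because the code is bound to the requesting address, a flood of requests that reuse one code from many addresses, or forge the source address without seeing the reply, fails this cheap check before any public key operation runs.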

After the initialization key is available, SeXTP establishes a session key, as in Figure 8. SeXTP derives the session key from a key encrypted by the client's public key, combined with a Diffie-Hellman key exchange. For simpler implementations, only one of these mechanisms can be used: the encrypted key or the Diffie-Hellman exchange. Diffie-Hellman is subject to a man-in-the-middle attack when the initialization key is weakly protected or may be exposed. On the other hand, using Diffie-Hellman ensures forward secrecy: even if attackers expose all keys at some later time, they still cannot decrypt previously recorded communication that was protected by earlier session keys.

The session key is not used directly, but to derive two other keys, one for encryption and another for authentication. Two keys are used to prevent weaknesses. The keys should be cryptographically independent; that is, exposure of the key used for one purpose should not help in exposing the other. This can be achieved by deriving each of these keys by a pseudorandom permutation, which attackers cannot tell from a randomly chosen permutation. By default, SeXTP uses the HMAC function for this purpose (this is a common heuristic).
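A minimal Python sketch of this kind of derivation follows. It assumes HMAC-SHA1 keyed with the session key and applied to two fixed labels; the label strings, the function name, and the choice of SHA-1 are illustrative assumptions and not the values used by SeXTP.

```python
import hashlib
import hmac
from typing import Tuple


def derive_keys(session_key: bytes) -> Tuple[bytes, bytes]:
    """Derive separate encryption and authentication keys from one session key."""
    # Distinct labels yield outputs that should look unrelated to anyone
    # who does not know the session key.
    encryption_key = hmac.new(session_key, b"encryption", hashlib.sha1).digest()
    authentication_key = hmac.new(session_key, b"authentication", hashlib.sha1).digest()
    return encryption_key, authentication_key


if __name__ == "__main__":
    enc_key, auth_key = derive_keys(b"example session key")  # placeholder key
    print(len(enc_key), len(auth_key), enc_key != auth_key)
```

Each derived key is 20 bytes with SHA-1; a cipher that needs a different key length would require truncation or a different hash, which this sketch leaves out.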

DDJ

