Using XML Schema for Validating User Input

Capturing and validating user input


January 02, 2008
URL:http://www.drdobbs.com/database/using-xml-schema-for-validating-user-inp/205207007

Alex is a technical lead for the Nokia Siemens Network. He can be contacted at [email protected].


One of the most often repeated (and mundane) tasks in software development is capturing and validating user input. These tasks are nevertheless important, and the most common way of accomplishing them is with if-else statements in the code. This approach, however, leads to bloated, inflexible code that is hard to unit test.

Another approach is to use a library such as Apache Commons CLI (Command-Line Interface). Here, though, the fields and parameters still have to be defined in the code, so whenever the parameters or their acceptable ranges or values change, the code has to be updated and recompiled.

However, using an XML Schema for validation decouples the code from the validation task completely. For example:

1. The user enters a command via the GUI/CLI to create a "route" between two "servers":

CR_ROUTE prim_addr="132.186.69.61" dest="132.186.69.61" vrf_bit="0" name="Server1" source2="132.186.69.61" prim_mask="132.186.69.61"

2. This is converted to XML (with a reference to the schema embedded):

<CLISYNTAX xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:noNamespaceSchemaLocation="CLISyntaxData.xsd">
  <CR_ROUTE  prim_addr="132.186.69.61" dest="132.186.69.61"
vrf_bit="0" name="Server1" source2="132.186.69.61" 
prim_mask="132.186.69.61"/>
</CLISYNTAX>

3. This XML is validated against the schema CLISyntaxData.xsd using, for instance, Xerces. Listing One is an excerpt of the schema.

<?xml version="1.0" encoding="UTF-8"?>
<!--  Command database for user syntax validation -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
elementFormDefault="qualified">
    <xs:element name="CLISYNTAX">
        <xs:complexType>
            <xs:choice>
                <xs:element name="CR_ROUTE">
                    <xs:complexType>
                        <xs:attributeGroup ref="CR_ROUTE_Attrs"/>
                    </xs:complexType>
                </xs:element>
                <xs:element name="DISP_ROUTE">
                    <xs:complexType>
                        <xs:attributeGroup ref="DISP_ROUTE_Attrs"/>
                    </xs:complexType>
                </xs:element>
            </xs:choice>
        </xs:complexType>
    </xs:element>

<!-- Command Line attributes for CR_ROUTE command -->
   
    <xs:attributeGroup name="CR_ROUTE_Attrs">
        <xs:attribute name="dest" type="IPaddr" use="required"/>
        <xs:attribute name="name" type="xs:string" use="required"/>
        <xs:attribute name="prim_addr" type="IPaddr" use="required"/>
        <xs:attribute name="prim_mask" type="IPaddr" use="required"/>
        <xs:attribute name="source2" type="IPaddr" use="required"/>
        <xs:attribute name="vrf_bit" type="xs:integer" use="required"/>
    </xs:attributeGroup>
   
    <!-- Command Line attributes for DISP_ROUTE command -->
   
    <xs:attributeGroup name="DISP_ROUTE_Attrs">
        <xs:attribute name="vrf" type="xs:integer"/>
    </xs:attributeGroup>

    <!--  Derived types for syntax validation -->

    <xs:simpleType name="IPaddr">
        <xs:restriction base="xs:string">
            <xs:pattern
value="((1?[0-9]?[0-9]|2[0-4][0-9]|25[0-5])\.){3}(1?[0-9]?[0-9]|2[0-4][0-9]|25[0-5])"/>
        </xs:restriction>
    </xs:simpleType>

</xs:schema>
Listing One
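
The IPaddr pattern in Listing One can be tried out in isolation before it goes into a schema. The following is a minimal sketch, assuming the expression behaves the same way in java.util.regex as in XML Schema (it uses only constructs common to both); the class and method names are made up for illustration. XML Schema patterns always match the whole value, which corresponds to Matcher.matches() in Java.

```java
import java.util.regex.Pattern;

// Hypothetical helper: checks strings against the IPaddr expression from Listing One.
final class IpAddrCheck {
    // One octet, copied from the schema: 0-199, 200-249, or 250-255
    private static final String OCTET = "(1?[0-9]?[0-9]|2[0-4][0-9]|25[0-5])";

    // Three "octet." groups followed by a final octet; the backslash is
    // doubled only because this is a Java string literal.
    private static final Pattern IP_ADDR =
        Pattern.compile("(" + OCTET + "\\.){3}" + OCTET);

    static boolean isValid(String candidate) {
        // matches() requires the whole string to match, as an XSD pattern does
        return IP_ADDR.matcher(candidate).matches();
    }

    public static void main(String[] args) {
        String[] samples = {"132.186.69.61", "256.1.1.1", "10.0.0"};
        for (String s : samples) {
            System.out.println(s + " -> " + isValid(s));
        }
    }
}
```

Of the three samples, only the first is accepted: 256 is out of range for an octet, and 10.0.0 has too few octets.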

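The advantage of this approach is that the validation rules live in the schema rather than in the code: when the parameters or their acceptable values change, you edit the XSD instead of modifying and recompiling the program.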

Now let's see how you can do this using Java. Here, I illustrate the same approach by getting the input from the command line. The command-line input follows the pattern fieldname<space>value, and I use pattern matching via regular expressions to extract the fields. Listing Two is an excerpt of the main method:

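public static void main(String[] args)
{
    // Join the command-line arguments into a single string
    String input = "";
    for (int i = 0; i < args.length; i++) {
        input = input + " " + args[i];
    }
    input = input.trim();
    // Split the input into the command, the flags, and their values
    commandLineTokenizer regToken = new commandLineTokenizer();
    regToken.parseCmdline(input);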
Listing Two

You can see how the code works in the parseCmdline method below. Since I am using regular expressions, I import these classes from the java.util.regex package:

import java.util.regex.Matcher;
import java.util.regex.Pattern;
public void parseCmdline(String inputStr) {

You first declare a string with the correct regular expression:

          
String patternStr1 = "-(\\S++)\\s*(\\S+)\\s*";
String patternStr2 = "(\\S+)\\s*=\\s*(\\S+)\\s*";


Predefined Character Classes

.     Any character (may or may not match line terminators)
\d    A digit: [0-9]
\D    A non-digit: [^0-9]
\s    A whitespace character: [ \t\n\x0B\f\r]
\S    A non-whitespace character: [^\s]
\w    A word character: [a-zA-Z_0-9]
\W    A non-word character: [^\w]

Quantifiers

Greedy    Reluctant   Possessive   Meaning
X?        X??         X?+          X, once or not at all
X*        X*?         X*+          X, zero or more times
X+        X+?         X++          X, one or more times
X{n}      X{n}?       X{n}+        X, exactly n times
X{n,}     X{n,}?      X{n,}+       X, at least n times
X{n,m}    X{n,m}?     X{n,m}+      X, at least n but not more than m times

Also note that both expressions use capturing groups (the parenthesized parts), so that the name and the value of each field can be retrieved separately.

Once you create the pattern, you compile it:

// Compile the regular expressions
Pattern pattern1 = Pattern.compile(patternStr1);
Pattern pattern2 = Pattern.compile(patternStr2);
// pattern1 picks up all the flags (without the leading -),
// including -i, which carries the CLI command
Matcher matcher = pattern1.matcher(inputStr);

The flags and values will be in two groups that are then printed out:

while (matcher.find())
{
    // The -i flag carries the CLI command itself
    if (matcher.group(1).equals("i"))
    {
        System.out.println("Command=" + matcher.group(2));
        szCommand = matcher.group(2);
        continue;
    }
    // Every other flag is stored together with its value
    System.out.println("flag=" + matcher.group(1));
    System.out.println("flagValue=" + matcher.group(2));
    cmdlineParamMap.put(matcher.group(1), matcher.group(2));
}
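
The loop above only handles the dash-prefixed flags matched by pattern1. The name=value fields can presumably be collected the same way with pattern2. The following is a minimal, self-contained sketch of that step; the class name PairExtractor and the method extractPairs are hypothetical and are not part of the SchemaParser source.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: pulls name=value fields out of a command line.
final class PairExtractor {
    // Same expression as patternStr2: name, optional spaces, '=', optional spaces, value
    private static final Pattern PAIR =
        Pattern.compile("(\\S+)\\s*=\\s*(\\S+)\\s*");

    static Map<String, String> extractPairs(String input) {
        // LinkedHashMap keeps the fields in the order the user typed them
        Map<String, String> pairs = new LinkedHashMap<String, String>();
        Matcher m = PAIR.matcher(input);
        while (m.find()) {
            pairs.put(m.group(1), m.group(2)); // group 1 = name, group 2 = value
        }
        return pairs;
    }

    public static void main(String[] args) {
        String input = "-i CR_ROUTE dest=132.186.69.61 vrf_bit=0 name = Server1";
        System.out.println(extractPairs(input));
    }
}
```

Because \S+ is greedy, the regex engine first swallows "dest=132.186.69.61" as a single token and then backtracks until the = sign lines up. Spaces around the = sign, as in "name = Server1", are tolerated by the \s* parts of the expression.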

Once the flags and values are separated, it is easy to create XML from them:

public boolean createXMLFromInput(String filename)
{
    try {
        BufferedWriter out = new BufferedWriter(new FileWriter(filename));
        out.write("<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n");
        out.write("<CLISYNTAX xmlns=\"http://www.scs.org\">\n");
        out.write("<" + szCommand + " ");
        // e.g. CR_ROUTE prim_addr="132.186.69.61" dest="132.186.69.61" vrf_bit="0"
        //      name="Server1" source2="132.186.69.61" prim_mask="132.186.69.61"
        // Now write the parameters and their values
        szCommandLineString = szCommand + " ";
        // Iterate over the keys in the map
        Iterator itParam = cmdlineParamMap.keySet().iterator();
        while (itParam.hasNext())
        {
            // Get key
            String param = itParam.next().toString();
            out.write(param + "=");
            szCommandLineString += param + "=";
            String value = cmdlineParamMap.get(param).toString();
            out.write("\"" + value + "\"" + " ");
            szCommandLineString += "\"" + value + "\"" + " ";
        }
        out.write("/>\n");
        out.write("</CLISYNTAX>\n");
        out.close();
        return true;
    } catch (IOException e) {
        // The temporary XML file could not be written
        return false;
    }
}
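
One thing to watch when building XML by string concatenation is that a value containing a quote, an ampersand, or an angle bracket makes the document ill-formed before the schema is ever consulted. The following is a minimal sketch of the same XML-building step with attribute values escaped. It builds a string rather than writing a file, and the names CommandXmlBuilder, escapeAttribute, and toXml are hypothetical, not taken from the SchemaParser source.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: turns a command and its parameters into the CLISYNTAX document.
final class CommandXmlBuilder {
    // Escape the characters that are not allowed inside a double-quoted attribute value.
    // The ampersand must be replaced first so later replacements are not re-escaped.
    static String escapeAttribute(String value) {
        return value.replace("&", "&amp;")
                    .replace("<", "&lt;")
                    .replace("\"", "&quot;");
    }

    static String toXml(String command, Map<String, String> params) {
        StringBuilder xml = new StringBuilder();
        xml.append("<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n");
        xml.append("<CLISYNTAX xmlns=\"http://www.scs.org\">\n");
        xml.append("<").append(command);
        for (Map.Entry<String, String> e : params.entrySet()) {
            xml.append(' ').append(e.getKey()).append("=\"")
               .append(escapeAttribute(e.getValue())).append('"');
        }
        xml.append("/>\n</CLISYNTAX>\n");
        return xml.toString();
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<String, String>();
        params.put("dest", "132.186.69.61");
        params.put("name", "Server \"1\" & Co");
        System.out.print(toXml("CR_ROUTE", params));
    }
}
```

The sketch does not check that the command and parameter names are legal XML names; a malformed name would still surface as a parse error at validation time.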

Once the XML has been created from the input, it can be validated against the schema using the Apache Xerces DOMParser. I import org.apache.xerces.parsers.DOMParser for this functionality:

DOMParser domParser = new DOMParser();
regToken.createXMLFromInput("tempfile.xml");
CustomErrorHandler handler = new CustomErrorHandler();
try {
    domParser.setFeature("http://xml.org/sax/features/namespaces", true);
    domParser.setFeature("http://xml.org/sax/features/validation", true);
    domParser.setFeature("http://apache.org/xml/features/validation/schema", true);
    //domParser.setFeature("http://apache.org/xml/features/validation/schema-full-checking", true);
    domParser.setFeature("http://apache.org/xml/features/continue-after-fatal-error", true);
    // The property value is a namespace/location pair separated by whitespace
    domParser.setProperty("http://apache.org/xml/properties/schema/external-schemaLocation",
                          "http://www.scs.org CLISyntaxData.xsd");
    domParser.setErrorHandler(handler);

Now the all-important method call:

domParser.parse("tempfile.xml");
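
For readers who want to experiment without the Xerces JARs, the same validate-and-collect-errors idea can be sketched with the JDK's standard javax.xml.validation API. This is a different technique from the DOMParser setup used in this article, not a description of it. The schema below is a cut-down, no-namespace version of Listing One, and the class name SchemaCheck and its validate method are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical helper: validates a CLISYNTAX document and returns the error messages.
final class SchemaCheck {
    // Cut-down version of Listing One: one command with two required attributes
    private static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='CLISYNTAX'><xs:complexType><xs:choice>"
      + "<xs:element name='CR_ROUTE'><xs:complexType>"
      + "<xs:attribute name='dest' type='IPaddr' use='required'/>"
      + "<xs:attribute name='vrf_bit' type='xs:integer' use='required'/>"
      + "</xs:complexType></xs:element>"
      + "</xs:choice></xs:complexType></xs:element>"
      + "<xs:simpleType name='IPaddr'><xs:restriction base='xs:string'>"
      + "<xs:pattern value='((1?[0-9]?[0-9]|2[0-4][0-9]|25[0-5])\\.){3}"
      + "(1?[0-9]?[0-9]|2[0-4][0-9]|25[0-5])'/>"
      + "</xs:restriction></xs:simpleType>"
      + "</xs:schema>";

    static List<String> validate(String xml) {
        final List<String> errors = new ArrayList<String>();
        try {
            SchemaFactory factory =
                SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = factory.newSchema(new StreamSource(new StringReader(XSD)));
            Validator validator = schema.newValidator();
            // Collect validation errors instead of stopping at the first one
            validator.setErrorHandler(new ErrorHandler() {
                public void warning(SAXParseException e) { /* ignored in this sketch */ }
                public void error(SAXParseException e) { errors.add(e.getMessage()); }
                public void fatalError(SAXParseException e) throws SAXException { throw e; }
            });
            validator.validate(new StreamSource(new StringReader(xml)));
        } catch (SAXException e) {
            // Ill-formed input (or an unusable schema) is reported as an error too
            errors.add(e.getMessage());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return errors;
    }

    public static void main(String[] args) {
        String input = "<CLISYNTAX><CR_ROUTE dest='132.186.69.61'/></CLISYNTAX>";
        for (String message : validate(input)) {
            System.out.println("Error= " + message);
        }
    }
}
```

An empty list means the input is valid; the sample in main omits the required vrf_bit attribute, so it produces an error.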

In case of an error, this is caught in the error-handler object of the CustomErrorHandler class that was passed to DOMParser:

if(handler.getError() > 0)
{
  System.out.println("Error in Parsing-Invalid Input" );
  System.out.println("---------------------------------------------");

Using pattern matching, you can get a user-friendly error message out of the cryptic error message of the SAXParseException object:

handler.getUserfriendlyErrorMsg();
if (handler.getColumNumber() > 0)
{
    // Print the command line with a marker at the reported column
    String cmdline = regToken.getParsedCommandLineString();
    int len = handler.getColumNumber();
    System.out.print(cmdline.substring(0, len - 2));
    System.out.print("**-->");
    System.out.println(cmdline.substring(len - 2));
}
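
The CustomErrorHandler class itself is not shown in these excerpts. As a guess at what the pattern-matching step might look like, the following sketch strips the constraint code that Xerces-style validators typically put in front of a message (for example "cvc-complex-type.4: "). The class name FriendlyMessage, the method simplify, and the exact prefix format are assumptions, not taken from the SchemaParser source.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: makes a validator message easier to read.
final class FriendlyMessage {
    // Assumed message shape: "cvc-<constraint code>: <readable text>"
    private static final Pattern CVC_PREFIX =
        Pattern.compile("^cvc-[^:\\s]+:\\s*(.*)$", Pattern.DOTALL);

    static String simplify(String parserMessage) {
        Matcher m = CVC_PREFIX.matcher(parserMessage);
        // Messages without the assumed prefix are passed through unchanged
        return m.matches() ? m.group(1) : parserMessage;
    }

    public static void main(String[] args) {
        String raw = "cvc-complex-type.4: Attribute 'prim_addr' must appear on element 'CR_TUNNEL'.";
        System.out.println("Error= " + simplify(raw));
    }
}
```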

That's it. You can find the complete SchemaParser source code here.

Assuming you're on Windows, the environment variable to set is:

set CLASSPATH=E:\Binaries\JARS\xerces-2_9_0_jar\xercesSamples.jar;E:\Binaries\JARS\xerces-2_9_0_jar\xercesImpl.jar;E:\Binaries\JARS\xerces-2_9_0_jar\serializer.jar;E:\Binaries\JARS\xerces-2_9_0_jar\resolver.jar;E:\Binaries\JARS\xerces-2_9_0_jar\xml-apis.jar;"E:\Program Files\Xerces-J-bin.2.9.0.tar\xerces-2_9_0";"E:\Program Files\Java\jdk1.5.0_03"\lib;"E:\Program Files\Java\jdk1.5.0_03"\jre\lib;"E:\Program Files\Java\jdk1.5.0_03";"E:\Program Files\JacORB-2.2.4"\lib;".";;xml-apis.jar;;E:\Binaries\JARS\xerces-2_9_0_jar;

The invocation of the Java class is:

\JavaXML\SchemaParser\bin>java -classpath %CLASSPATH% java_xml/schemaparser

and the sample input and corresponding output are:

\JavaXML\SchemaParser\bin>java -classpath %CLASSPATH% java_xml/schemaparser "-n neName -g ugname -l fsdf,fdsfsd,fsdfsdf -i CR_TUNNEL prsim_addr= 132.186.69.255 dest=132.186.69.61 vrf=1 ggsn=String source=132.186.69.61 prim_mask = 132.186.69.99 dsasd=0"
Error in Parsing-Invalid Output
---------------------------------------------
Error= Attribute 'prim_addr' must appear on element 'CR_TUNNEL'.
Error= Attribute 'prsim_addr' is not allowed to appear in element 'CR_TUNNEL'.
Error= Attribute 'dsasd' is not allowed to appear in element 'CR_TUNNEL'.
