The Easy Way to Read XML in Java

Feb 18, 2014

In a recent post I showed you how our servers generate XML messages.  This is how we send alerts and other data to you.  Let me continue your tour behind the scenes at Trade-Ideas.  This article shows how our client software reads those messages from the server so it can display your results for you.  As in the previous article, I’m not showing you anything that other people don’t know how to do.  Instead, I’m showing you the easiest way to do it.

Let’s start with the header.  We’ll get to the interesting stuff in a moment.  But let’s make this a complete solution.  You can copy this file and use it as is.
/*
 * Copyright (c) 2014 Trade Ideas LLC – All Rights Reserved.
 * This source code is proprietary and confidential.
 * Unauthorized copying of this file, via any medium is strictly prohibited.
 * 
 * $RCSfile: XmlHelper.java,v $
 * $Date: 2014/02/16 19:36:48 $
 * $Revision: 1.15 $
 */
package com.tradeideas.util;

import java.awt.Color;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.StringWriter;
import java.nio.charset.Charset;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerConfigurationException;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Attr;
import org.w3c.dom.DOMException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.SAXException;

Here’s the description of what this code does.  This code is, of course, a perfect match for the code that our server uses to generate XML documents.  The server side is always succinct, never writing any more than it has to.  The client side, shown here, fills in any missing details with default values.  Think about how much more data we’d have to send if every stock was labeled “country-of-listing=USA” and every number was labeled “currency=us-dollars”.  I’m amazed at how many people design systems that way!
/**
 * Our standard XML helper class translated to Java.
 *
 * The main point of this is to consolidate the error handling and to make the
 * code shorter and more readable. Most errors return a null or a default value.
 * This is consistent with the way we typically create and parse XML at
 * Trade-Ideas. We will often leave off a complete section of the XML to say use
 * the defaults. This also makes it easier for different versions of clients and
 * servers to communicate.
 *

An “Element” is something like <HTML>…</HTML>.  Our code can treat these all the same way, because they have the same structure, even though the content can vary.  The standard XML tools try to treat other things, like PRICE="4.20" or <!-- this is a comment -->, the same way.  A “Node” is the generic term for all these different types of items.  That was an attempt at an elegant solution that didn’t work out so well in practice.  That’s why my library avoids Nodes and prefers Elements.
 * This library deals almost exclusively with XML Elements, never with XML

Notice the reference to previous versions of this library.  We’ve written a lot of client software.  It all talks to the same servers.  That’s one of the nice things about XML: it works well with a lot of programming languages.  Notice that this version is slightly better than the previous version.  With experience we’ve really honed our tools.  What’s the opposite of “critical mess”?  Ah, I know.  Our development team is “in the zone.”
 * Nodes. Some previous versions of this library worked with Nodes, but that
 * became sloppy at best and never helped.
 *
 * Examples, differences between Java and C#:
 *
 * <API><!-- do it -->
 *   <COLORS>
 *     <BACKGROUND MODE="RGB">fljdsfljds</BACKGROUND>
 *   </COLORS>
 * </API>
 *
 * C#
 * Node root = XmlHelper.Get(body);
 * Node child = root.Node(0).Node(1).Node("colors").Node(2).Node("BACKGROUND");
 *
 * Java
 * Element root = XmlHelper.get(body);
 * Element child = XmlHelper.findElement(root, 1, "colors", 2, "BACKGROUND");
 *
 * Notice "Node" -> "Element", and "0" disappears completely.
 *
 * C#
 * string mode = root.Node(0).Node(1).Node("colors").Node(2).Node("BACKGROUND").Property("MODE");  // or
 * string mode1 = child.Property("MODE");
 *
 * Java
 * String mode = getStringAttribute(root, "MODE", "", 1, "colors", 2, "BACKGROUND"); // or
 * String mode1 = getStringAttribute(child, "MODE", "");
 *
 * Notice "Property" -> "Attribute", and the default is no longer optional.
 *
 * @author phil
 */
public class XmlHelper {

    /**
     * Do not try to instantiate this class.
     */
    private XmlHelper() {
    }

Even starting the process is harder than it should be with the standard tools.  This routine is how we initially digest a message from the network.  This is the common case for us.  We use XML for different things, but mostly to send messages from the server to the client.
    /**
     * Parse an XML document.
     *
     * Note that some similar libraries would return the XML document. Instead,
     * we return the document element. We never used the document except to grab
     * the document element. And this allows the other functions in this library
     * to work exclusively with Element objects.
     *
     * @param raw the input. This typically comes from the server over the
     * network. This can be null.
     * @return The top level element from the document. This will be null on
     * error.
     */
    public static Element get(byte[] raw) {
        if (null == raw) {
            return null;
        }
        Document document;
        try {
            document = documentBuilder.parse(new ByteArrayInputStream(raw));
        } catch (SAXException ex) {
            // Invalid input
            return null;
        } catch (IOException ex) {
            // Should we log something?  We don’t expect this to ever happen.
            return null;
        }
        return document.getDocumentElement();
    }
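
To make the parse-or-null idea concrete, here is a minimal, self-contained sketch of what a caller sees.  It is not XmlHelper itself.  The class name ParseDemo and the helper rootName() are made up for this example, and it builds a new parser on every call only to keep the sketch short.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Element;
import org.xml.sax.SAXException;

class ParseDemo {
    /** Returns the document element, or null if the input is null or cannot be parsed. */
    static Element parse(byte[] raw) {
        if (raw == null) {
            return null;
        }
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(raw))
                    .getDocumentElement();
        } catch (ParserConfigurationException | SAXException | IOException ex) {
            // Every failure looks the same to the caller: no element.
            return null;
        }
    }

    /** Hypothetical convenience for the demo: the root tag name, or null on any error. */
    static String rootName(String xml) {
        Element root = parse(xml == null ? null : xml.getBytes(StandardCharsets.UTF_8));
        return root == null ? null : root.getTagName();
    }

    public static void main(String[] args) {
        System.out.println(rootName("<API><ROW/></API>")); // well-formed input
        System.out.println(rootName("<API><ROW></API>"));  // mismatched tags
    }
}
```

The caller never sees an exception.  A null input, a network hiccup and a malformed message all arrive as the same null.  (The JDK parser may still print its own complaint about the malformed input to standard error.)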

The next function is the main part of parsing an XML document.  The document might have a lot of pieces in it.  We need to find the one piece that we care about right now.

There are a lot of references to null.  What does “null” mean?  It means that there is no value.  This can happen for a lot of reasons.  Think about a top list window.  What if you ask for the 14 day ATR and one stock in your list hasn’t been around for 14 days?  What if you are asking for fundamental data and one of the items in your list is an ETF, rather than a stock?  What if you write a custom formula, and that formula includes a syntax error?  In all three cases, there is no meaningful value.  We could make up a value, like 0.  Or we could completely give up, and send you an error message rather than a top list.  Instead, we record “null” for the value and keep working.  On your screen the null values will get converted to blanks.  These blanks will be surrounded by the values that we could compute.

Notice that it doesn’t really matter why you fail.  What if the client software asks for the background color of the “ATR” field of the 4th row of the 2nd top list?  This isn’t exactly what a request looks like, but it’s pretty close.  Each request is piled on top of the previous one.  This could fail in so many ways.  What if we find the value, but it doesn’t make sense?  I.e. the color was “blew” instead of “blue”.  What if there was only one top list, or the top list had only 2 rows?  What if there wasn’t a field labeled “ATR”?  What if we never received this document at all?  We don’t want to think about each of these separately.  At any step along the way the software can return null to say it has no answer.  The null can get passed along in an orderly fashion to the next step in the process.
    /**
     * Find a specific descendant of the given element.
     *
     * @param start Start looking here. This can be null.
     * @param path This is a series of strings and integers. A string means to
     * look for a child with that name. An integer means to look for the Nth
     * child.
     * @return The resulting element, or null if no such element was found.
     */
    public static Element findElement(Element start, Object... path) {
        for (Object o : path) {
            if (null == start) {
                return null;
            }
            if (o instanceof String) {
                start = getChildElement(start, (String) o);
            } else if (o instanceof Integer) {
                start = getChildElement(start, (Integer) o);
            } else {
                throw new IllegalArgumentException(o + " is not a String or an int.");
            }
        }
        return start;
    }
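
Here is a rough sketch of how those failure cases collapse into one.  To stay self-contained it uses condensed stand-ins (child() and find()) rather than the real XmlHelper methods.  The class name PathDemo, the document layout, the COLOR attribute and the "(blank)" marker are all invented for the example.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

class PathDemo {
    // Stand-in for getChildElement(): a String step matches by tag name,
    // an Integer step matches by position among the child elements.
    static Element child(Element parent, Object step) {
        int index = 0;
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element) {
                Element e = (Element) n;
                boolean byName = step instanceof String && step.equals(e.getTagName());
                boolean byIndex = step instanceof Integer && ((Integer) step) == index;
                if (byName || byIndex) {
                    return e;
                }
                index++;
            }
        }
        return null;
    }

    // Stand-in for findElement(): the first null ends the walk.
    static Element find(Element start, Object... path) {
        for (Object step : path) {
            if (start == null) {
                return null;
            }
            start = child(start, step);
        }
        return start;
    }

    // Looks up the COLOR attribute of the element at the given path.
    // Any failure to reach that element is reported as "(blank)".
    static String color(String xml, Object... path) {
        Element root;
        try {
            root = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                    .getDocumentElement();
        } catch (Exception ex) {
            root = null; // unreadable document: treated like any other missing piece
        }
        Element cell = find(root, path);
        return cell == null ? "(blank)" : cell.getAttribute("COLOR");
    }

    public static void main(String[] args) {
        String xml = "<API><TOPLIST/><TOPLIST><ROW/><ROW><ATR COLOR=\"blue\"/></ROW></TOPLIST></API>";
        System.out.println(color(xml, 1, 1, "ATR"));       // second top list, second row: found
        System.out.println(color(xml, 2, 1, "ATR"));       // there is no third top list
        System.out.println(color(xml, 1, 5, "ATR"));       // there is no sixth row
        System.out.println(color(xml, 1, 1, "RSI"));       // there is no such field
        System.out.println(color("not xml", 1, 1, "ATR")); // there is no document at all
    }
}
```

Only the first call finds a value.  The other four fail at different steps, but the calling code is identical in every case.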

Here’s a smaller piece of the previous function.  This answers requests like “give me the 5th sub-element within this element” or “give me the 5th row in this top list.”

You’d think this would be easy.  But there are issues.  Remember what we discussed in the previous article.  The standard XML libraries are meant to cover a lot of cases for a lot of people.  What if the server wrote the first two rows of my top list, then wrote a comment, then wrote the next two rows?  Does that comment count as a row?  If your goal is to read a document, make one very small change in the document, and then write the document back with as few changes as possible, you need to remember that comment.  My code doesn’t care about that.  So my library will throw away that comment without bothering me.  This is a perfect example where a customized solution is much easier to use than a general purpose solution.
    /**
     * Find a child of the given element. This will automatically skip CDATA,
     * comments, etc. These do not affect the offset.
     *
     * @param start The parent element. This can be null.
     * @param offset Which child to find. 0 means the first.
     * @return The requested child, or null if that child does not exist.
     */
    public static Element getChildElement(Element start, int offset) {
        if (null == start) {
            return null;
        }
        if (offset < 0) {
            return null;
        }
        for (Node child = start.getFirstChild();
                null != child;
                child = child.getNextSibling()) {
            if (child instanceof Element) {
                if (offset == 0) {
                    return (Element) child;
                }
                offset--;
            }
        }
        return null;
    }
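
The difference between counting nodes and counting elements is easy to see with a tiny document.  The following sketch uses only the standard DOM classes.  OffsetDemo, its method names and the tag names are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

class OffsetDemo {
    static Element parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                    .getDocumentElement();
        } catch (Exception ex) {
            throw new IllegalArgumentException("demo input must be well-formed XML", ex);
        }
    }

    /** The general purpose view: every kind of node counts toward the index. */
    static String rawChildName(String xml, int index) {
        Node n = parse(xml).getChildNodes().item(index);
        return n == null ? null : n.getNodeName();
    }

    /** The customized view: only elements count toward the index. */
    static String elementChildName(String xml, int index) {
        for (Node n = parse(xml).getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element) {
                if (index == 0) {
                    return ((Element) n).getTagName();
                }
                index--;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        String xml = "<LIST><FIRST/><SECOND/><!-- a note --><THIRD/></LIST>";
        System.out.println(rawChildName(xml, 2));     // the comment node, "#comment"
        System.out.println(elementChildName(xml, 2)); // the third element, "THIRD"
    }
}
```

Whitespace between tags behaves like the comment here: the raw DOM reports it as a text node, and the element-only loop steps over it.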

Here’s a variation on the last function.  This lets you look up a value by name, rather than position.  That’s one of the things that makes XML so flexible.  That’s one of the reasons why older versions of our client still work with newer versions of our server.
    /**
     * Find a child of the given element.
     *
     * @param start The parent element. This can be null.
     * @param childName Which child to find.
     * @return The requested child, or null if that child does not exist.
     */
    public static Element getChildElement(Element start, String childName) {
        if (null == start) {
            return null;
        }
        for (Node child = start.getFirstChild();
                null != child;
                child = child.getNextSibling()) {
            if (child instanceof Element) {
                Element possible = (Element) child;
                if (childName.equals(possible.getTagName())) {
                    return possible;
                }
            }
        }
        return null;
    }

The previous couple of functions moved from one Element to the next.  This next function looks for an attribute.  This is often the last step as we move out of the XML world.  If an Element looks like <ROW SYMBOL="ERBB" PRICE="… />, an attribute looks like SYMBOL="ERBB".  At this point we normally get rid of the nulls and replace them with a default value.  As I mentioned before, we use null inside the computer programs to represent a missing value, but we often display that as a blank on the screen.  This is the function where you’d write "" for the default value.

This idea of a default value is important, and we will see more of it soon.  One thing to understand is that the server knows that’s how we’re reading things.  If the server doesn’t know the name of a company, it doesn’t have to say NAME="" or NAME="unknown".  It just doesn’t say anything, and the client knows how to handle that.  The most efficient way to send an unknown value is not to say anything at all.  And in my mind, this is the most elegant solution, too.  Not everyone agrees, though, and a lot of people would list out every value they don’t know!

Remember, the main point of my library is to make things easier.  There are a lot of ways to handle an unknown value.  I almost always use nulls to represent these internally, and use a default value at the end.  So the library makes these cases easier for me.
    /**
     * Looks up an attribute of the given element.
     *
     * @param start The initial element. This can be null.
     * @param name The name of the attribute. This cannot be null.
     * @param def The value to return if there is a problem. This can be null.
     * @param path A series of strings and numbers as described in
     * findElement(). The empty list means to look for the attribute in the
     * start element.
     * @return The specified attribute value. If that couldn’t be found, then
     * the value of def.
     */
    public static String getStringAttribute(Element start, String name, String def, Object... path) {
        start = findElement(start, path);
        if (null == start) {
            return def;
        }
        String attribute = start.getAttribute(name);
        // Interesting.  getAttribute() returns "" both for a missing attribute
        // and for an attribute explicitly set to "", so this code treats the
        // two cases alike.  (Element.hasAttribute() could tell them apart.)  I
        // don’t think that’s a problem, but this is different from our other
        // libraries.
        if (attribute.equals("")) {
            return def;
        } else {
            return attribute;
        }
    }
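
As a small illustration of the missing-versus-empty question, the sketch below shows what the standard DOM calls report for each case.  AttributeDemo and its two helper methods are hypothetical names used only here.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;

class AttributeDemo {
    static Element parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                    .getDocumentElement();
        } catch (Exception ex) {
            throw new IllegalArgumentException("demo input must be well-formed XML", ex);
        }
    }

    /** Same policy as the library: an empty or missing attribute becomes the default. */
    static String valueOrDefault(String xml, String name, String def) {
        String value = parse(xml).getAttribute(name);
        return value.isEmpty() ? def : value;
    }

    /** What you could learn if you did care about the difference. */
    static String describe(String xml, String name) {
        Element e = parse(xml);
        if (!e.hasAttribute(name)) {
            return "missing";
        }
        return e.getAttribute(name).isEmpty() ? "empty" : "set";
    }

    public static void main(String[] args) {
        System.out.println(describe("<ROW/>", "NAME"));                    // attribute not sent
        System.out.println(describe("<ROW NAME=\"\"/>", "NAME"));          // attribute sent, but empty
        System.out.println(valueOrDefault("<ROW/>", "NAME", "?"));         // default
        System.out.println(valueOrDefault("<ROW NAME=\"\"/>", "NAME", "?")); // default again
    }
}
```

Treating both cases as "use the default" fits a server that simply leaves out whatever it does not know.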

The next routine goes a little further.  XML always stores things as strings, but often we really want a number.  I’ve found it very convenient to look for the value in the document, and convert it into its final form, all in one function.  This next function will find a value and convert it into a “double”, i.e. a number with a decimal point.

Notice that we are continuing the theme of converting an error into a default value.  If you can’t find a value, or if you can’t understand the value, you treat those two errors in the exact same way.  Remember all the types of errors we just discussed?  This is one more for the list.

What’s a good default value?  That’s an input to this function.  The right value depends on context.  A recent blog article discussed the best way to look at a company’s cash / debt ratio.  What if a company has no debt?  The default is to treat that as a null and display it as a blank.  But there are alternatives.  Sometimes, if there’s no debt, you want to pretend that the debt is “0.0000001”.  “0.0000001” would be the default in the case where there was no real value.
    /**
     * Looks up an attribute of the given element and converts it to a double.
     *
     * @param start The initial element. This can be null.
     * @param name The name of the attribute. This cannot be null.
     * @param def The value to return if there is a problem.
     * @param path A series of strings and numbers as described in
     * findElement(). The empty list means to look for the attribute in the
     * start element.
     * @return The specified attribute, or the value of def if there is a
     * problem.
     */
    public static double getDoubleAttribute(Element start, String name, double def, Object... path) {
        Double possible = getDoubleObjectAttribute(start, name, path);
        if (null == possible) {
            return def;
        } else {
            return possible;
        }
    }

The previous function returns a default value if we couldn’t find or understand the number we were looking for.  This function returns a null, instead.  Mostly we use nulls for the intermediate steps, and mostly it’s convenient to have an actual number at the end.  But it’s good to have options.
    /**
     * Looks up an attribute of the given element and converts it to a Double.
     *
     * @param start The initial element. This can be null.
     * @param name The name of the attribute. This cannot be null.
     * @param path A series of strings and numbers as described in
     * findElement(). The empty list means to look for the attribute in the
     * start element.
     * @return The specified attribute, or null if there is a problem.
     */
    public static Double getDoubleObjectAttribute(Element start, String name, Object... path) {
        String attribute = getStringAttribute(start, name, null, path);
        if (null == attribute) {
            return null;
        }
        try {
            return Double.parseDouble(attribute);
        } catch (NumberFormatException ex) {
            return null;
        }
    }

These next functions do the same thing, but they work with integers.  “1.2” would be considered an error in these functions.
    /**
     * Looks up an attribute of the given element and converts it to an int.
     *
     * @param start The initial element. This can be null.
     * @param name The name of the attribute. This cannot be null.
     * @param def The value to return if there is a problem.
     * @param path A series of strings and numbers as described in
     * findElement(). The empty list means to look for the attribute in the
     * start element.
     * @return The specified attribute, or the value of def if there is a
     * problem.
     */
    public static int getIntAttribute(Element start, String name, int def, Object... path) {
        Integer possible = getIntegerAttribute(start, name, path);
        if (null == possible) {
            return def;
        } else {
            return possible;
        }
    }

    /**
     * Looks up an attribute of the given element and converts it to an Integer.
     *
     * @param start The initial element. This can be null.
     * @param name The name of the attribute. This cannot be null.
     * @param path A series of strings and numbers as described in
     * findElement(). The empty list means to look for the attribute in the
     * start element.
     * @return The specified attribute, or null if there is a problem.
     */
    public static Integer getIntegerAttribute(Element start, String name, Object... path) {
        String attribute = getStringAttribute(start, name, null, path);
        if (null == attribute) {
            return null;
        }
        try {
            return Integer.parseInt(attribute);
        } catch (NumberFormatException ex) {
            return null;
        }
    }
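
Stripped of the XML lookup, the null-then-default pattern in these number functions is quite small.  The following is only a sketch of that pattern.  NumberDefaultDemo and its method names are invented for this example.

```java
class NumberDefaultDemo {
    /** Intermediate step: null means "no usable value", whatever the reason. */
    static Integer parseIntOrNull(String text) {
        if (text == null) {
            return null;
        }
        try {
            return Integer.parseInt(text);
        } catch (NumberFormatException ex) {
            return null;
        }
    }

    /** Final step: the caller picks the default that makes sense in context. */
    static int parseInt(String text, int def) {
        Integer possible = parseIntOrNull(text);
        return possible == null ? def : possible;
    }

    public static void main(String[] args) {
        System.out.println(parseInt("42", -1));   // a good value
        System.out.println(parseInt("1.2", -1));  // not an integer, so the default
        System.out.println(parseInt(null, -1));   // nothing was sent, so the default
    }
}
```

Missing input and unreadable input both come back as the default, so the caller writes one line instead of two error checks.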

These next functions do the same thing, but they work with Boolean values, i.e. true or false.
    /**
     * Looks up an attribute of the given element and converts it to a boolean.
     *
     * @param start The initial element. This can be null.
     * @param name The name of the attribute. This cannot be null.
     * @param def The value to return if there is a problem.
     * @param path A series of strings and numbers as described in
     * findElement(). The empty list means to look for the attribute in the
     * start element.
     * @return The specified attribute, or the value of def if there is a
     * problem.
     */
    public static boolean getBooleanAttribute(Element start, String name, boolean def, Object... path) {
        Boolean possible = getBooleanObjectAttribute(start, name, path);
        if (null == possible) {
            return def;
        } else {
            return possible;
        }
    }

    /**
     * Looks up an attribute of the given element and converts it to a Boolean.
     *
     * @param start The initial element. This can be null.
     * @param name The name of the attribute. This cannot be null.
     * @param path A series of strings and numbers as described in
     * findElement(). The empty list means to look for the attribute in the
     * start element.
     * @return The specified attribute, or null if there is a problem.
     */
    public static Boolean getBooleanObjectAttribute(Element start, String name, Object... path) {
        String attribute = getStringAttribute(start, name, null, path);
        if ("1".equals(attribute) || "true".equalsIgnoreCase(attribute)) {
            return true;
        }
        if ("0".equals(attribute) || "false".equalsIgnoreCase(attribute)) {
            return false;
        }
        return null;
    }

The previous routines all navigated through the document looking for one specific item.  This routine allows you to iterate through the items one at a time.  For example, when the client receives all the data in the top list, it wants to iterate through the data one row at a time.

This handles errors in the normal way.  If there is any sort of problem, this returns an empty list.  The main program can always say “for each item in the list, do this.”  It doesn’t have to ask whether there was a problem.  As with all of these functions, there was already a way to do this.  This function just provides an easier way.
    /**
     * Find all of the children of the given element which are also elements.
     * Throw out any comments, CDATA, etc.
     *
     * @param start The initial element. This can be null.
     * @param path A series of strings and numbers as described in
     * findElement(). The empty list means to list the children of the start
     * element.
     * @return A list of elements. If the parent element can’t be found, this
     * will be an empty list. This will never be null.
     */
    public static List<Element> getChildElements(Element start, Object... path) {
        start = findElement(start, path);
        List<Element> result = new ArrayList<Element>();
        if (null != start) {
            for (Node child = start.getFirstChild();
                    null != child;
                    child = child.getNextSibling()) {
                if (child instanceof Element) {
                    result.add((Element) child);
                }
            }
        }
        return result;
    }
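
A short sketch of the "for each item in the list" style follows.  It is self-contained, so it carries its own condensed copy of the child-listing loop.  RowsDemo, the TOPLIST layout and the symbol XYZ are made up for the example.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

class RowsDemo {
    /** Child elements only; an empty list if the parent is null. Never null. */
    static List<Element> childElements(Element parent) {
        List<Element> result = new ArrayList<Element>();
        if (parent != null) {
            for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
                if (n instanceof Element) {
                    result.add((Element) n);
                }
            }
        }
        return result;
    }

    /** Collects the SYMBOL of every row. A bad document just yields no rows. */
    static List<String> symbols(String xml) {
        Element root;
        try {
            root = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                    .getDocumentElement();
        } catch (Exception ex) {
            root = null;
        }
        List<String> result = new ArrayList<String>();
        for (Element row : childElements(root)) { // no "did it work?" check needed
            result.add(row.getAttribute("SYMBOL"));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(symbols(
                "<TOPLIST><ROW SYMBOL=\"ERBB\"/><!-- note --><ROW SYMBOL=\"XYZ\"/></TOPLIST>"));
        System.out.println(symbols("garbage")); // an empty list, not an exception
    }
}
```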

When I look at the details of this next comment, I can see that it was copied directly from the C# code.  Smart programmers are good at stealing a working solution from somewhere else.

While some of the details have changed, the basic idea gets reused a lot.  Since it is easy to return a default value, sometimes we use that mechanism to report an error.  We find some value that we can return as the default, and recognize later as an error.  For example, I often specify -1 as the default user id, since I know that all valid user ids are greater than 0.
    /**
     * Because an attribute returns a Color, not a Color?, there is no obvious
     * way to see if the color exists or not. You can set the default to this
     * then you can see if you got this back. This is not perfect, but it seems
     * reasonable. How many shades of transparent do you need?
     * @return Our standard invalid color.
     */
    public static Color getInvalidColor() {
        return new Color(0x00badbad);
    }
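
Here is a minimal sketch of using a sentinel default to detect a missing color.  SentinelDemo, parseColor() and hasColor() are hypothetical names, and the parsing is reduced to a plain integer string.  One detail worth knowing in Java: the one-argument Color(int) constructor always produces an opaque color, so the sentinel here is an unusual opaque shade rather than a transparent one.

```java
import java.awt.Color;

class SentinelDemo {
    /** An unlikely color used to mean "nothing was found". */
    static final Color INVALID = new Color(0x00badbad);

    /** Parses a packed RGB integer; returns def if the text is missing or unreadable. */
    static Color parseColor(String text, Color def) {
        if (text == null) {
            return def;
        }
        try {
            return new Color(Integer.parseInt(text));
        } catch (NumberFormatException ex) {
            return def;
        }
    }

    /** True if a usable color was supplied, judged by whether the sentinel came back. */
    static boolean hasColor(String text) {
        return !INVALID.equals(parseColor(text, INVALID));
    }

    public static void main(String[] args) {
        System.out.println(hasColor(Integer.toString(Color.BLUE.getRGB()))); // a real color
        System.out.println(hasColor("blew"));                                // unreadable
        System.out.println(hasColor(null));                                  // missing
        // The known weakness: a real color equal to the sentinel looks missing.
        System.out.println(hasColor(Integer.toString(INVALID.getRGB())));
    }
}
```

The last call shows why this is "not perfect": the trick only works as long as nobody sends that exact shade.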

A couple more functions for reading XML:
    /**
     * Looks up an attribute of the given element and converts it to a Color
     * object.
     *
     * @param start The initial element. This can be null.
     * @param name The name of the attribute. This cannot be null.
     * @param def The Color object to return if there is a problem.
     * @param path A series of strings and numbers as described in
     * findElement(). The empty list means to look for the attribute in the
     * start element.
     * @return The specified attribute, or the value of def if there is a
     * problem.
     */
    public static Color getColorAttribute(Element start, String name, Color def, Object... path) {
        return new Color(getIntAttribute(start, name, def.getRGB(), path));
    }

    /**
     * Get all text from within an element. This will be recursive if the
     * element contains other elements. This is basically our standard wrapper
     * around Java’s Node.getTextContent().
     *
     * @param start The initial element. This can be null.
     * @param def The default value. This will be returned in case of any error.
     * @param path A series of strings and numbers as described in
     * findElement(). The empty list means to take the text from the start
     * element.
     * @return The text from within the given element.
     */
    public static String getText(Element start, String def, Object... path) {
        start = findElement(start, path);
        if (null == start) {
            return def;
        }
        try {
            return start.getTextContent();
        } catch (DOMException ex) {
            // This is a strange one.  According to the documentation
            // getTextContent() can fail if the result is too long.  There isn’t
            // much we can do here, so I treat it like any other error.  We
            // really don’t expect to see this.
            return def;
        }
    }

And now some functions for creating and updating XML documents.  Mostly the server creates and writes while the client reads and interprets.  But not always.

The problems are the same for writing as for reading.  The standard libraries give you all the pieces you need, but they are so hard to use.  This library makes the common cases easier.
    /**
     * Create and append a new child.
     *
     * @param parent The newly created element becomes a child of this element.
     * @param name This is the name of the new element.
     * @return The new element.
     */
    public static Element newElement(Element parent, String name) {
        if (null == parent) {
            // For the sake of consistency with the bulk of the methods in
            // this class, we will do nothing if the input is null.
            return null;
        }
        Element result = parent.getOwnerDocument().createElement(name);
        parent.appendChild(result);
        return result;
    }

    /**
     * Set an attribute of an XML element.
     *
     * @param element The element we want to change.
     * @param name The name of the attribute we want to change or add.
     * @param value The value of the attribute.
     */
    public static void setAttribute(Element element, String name, String value) {
        Attr attribute = element.getOwnerDocument().createAttribute(name);
        attribute.setValue(value);
        element.getAttributes().setNamedItem(attribute);
        // The C# version returns the original element.  That was helpful for
        // chaining.  I.e. element.setAttribute("2", "ii").setAttribute("4", "iv").
        // That syntax doesn’t exist in the Java version, so there’s no point
        // in returning anything.
    }

    /**
     * Set an attribute of an XML element.
     *
     * @param element The element we want to change.
     * @param name The name of the attribute we want to change or add.
     * @param value The value of the attribute. setAttribute() will
     * automatically convert this into a string in the appropriate way.
     */
    public static void setAttribute(Element element, String name, double value) {
        // This is not very exciting in Java.  The C# version had to do the
        // conversion to string the right way.
        setAttribute(element, name, Double.toString(value));
    }

    /**
     * Set an attribute of an XML element.
     *
     * @param element The element we want to change.
     * @param name The name of the attribute we want to change or add.
     * @param value The value of the attribute. setAttribute() will
     * automatically convert this into a string in the appropriate way.
     */
    public static void setAttribute(Element element, String name, int value) {
        setAttribute(element, name, Integer.toString(value));
    }

    /**
     * Set an attribute of an XML element.
     *
     * @param element The element we want to change.
     * @param name The name of the attribute we want to change or add.
     * @param value The value of the attribute. setAttribute() will
     * automatically convert this into a string in the appropriate way.
     */
    public static void setAttribute(Element element, String name, boolean value) {
Notice the references to C#.  Our software exists on a lot of clients and servers on a lot of different platforms.  We need to make sure they can all talk with one another.  You might save your layout on a Java platform and reload it in TI Pro, which was written in C#.
        // Multiple values are accepted in getAttribute().  We use the preferred
        // C# values, so this code will act the same way as the C# version of
        // XmlHelper.
        setAttribute(element, name, value ? "True" : "False");
    }

    /**
     * Set an attribute of an XML element.
     *
     * @param element The element we want to change.
     * @param name The name of the attribute we want to change or add.
     * @param value The value of the attribute. setAttribute() will
     * automatically convert this into a string in the appropriate way.
     */
    public static void setAttribute(Element element, String name, Color value) {
        setAttribute(element, name, value.getRGB());
    }

Here we are initializing some code needed to make a new XML document.  Java makes you create a factory which creates another factory which creates the actual document.  That seems like a long and convoluted process, even for Java!  This next item is marked private.  It’s all internal to this library.  You don’t have to worry about these details.  You can just ask this library to give you a blank document.  This file is long so your code doesn’t have to be!
    private static final DocumentBuilder documentBuilder;

    static {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder;
        try {
            builder = factory.newDocumentBuilder();
        } catch (ParserConfigurationException ex) {
            // Should we log something?  We don’t expect an error here.
            builder = null;
        }
        documentBuilder = builder;
    }

    /**
     * Create a document containing one element and nothing else. (A document
     * without any elements is not legal or interesting.)
     *
     * @param documentElementName The name of the root element, found with
     * getDocumentElement().
     * @return The newly created document.
     */
    public static Document createEmptyDocument(String documentElementName) {
        Document result = documentBuilder.newDocument();
        result.appendChild(result.createElement(documentElementName));
        return result;
    }

    /**
     * Convert an XML document to a string.
     * @param document The document to convert.  This cannot be null.
     * @return The result.  This will be null if there are any errors.
     */
    public static String toString(Document document) {
        TransformerFactory tf = TransformerFactory.newInstance();
        Transformer transformer;
        try {
            transformer = tf.newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter writer = new StringWriter();
            transformer.transform(new DOMSource(document), new StreamResult(writer));
            String output = writer.getBuffer().toString();
            return output;
        } catch (TransformerConfigurationException ex) {
            return null;
        } catch (TransformerException ex) {
            return null;
        }
    }
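
For a sense of what the writing side looks like end to end, the sketch below builds a one-row document and turns it into a string using only the standard classes.  WriteDemo is a made-up name, the element and attribute names are illustrative, and it calls Element.setAttribute() directly rather than going through the helpers above.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class WriteDemo {
    /** Builds a small document and serializes it; returns null on any error. */
    static String build() {
        try {
            // Factory -> builder -> document, as described in the text.
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element root = doc.createElement("API");
            doc.appendChild(root);

            Element row = doc.createElement("ROW");
            root.appendChild(row);
            row.setAttribute("SYMBOL", "ERBB");
            row.setAttribute("PRICE", Double.toString(4.2));

            // Serialize without the <?xml ...?> declaration.
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter writer = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(writer));
            return writer.toString();
        } catch (ParserConfigurationException | TransformerException ex) {
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(build());
    }
}
```

The result is a single line of XML with one ROW element inside API.  The order in which the two attributes are written is up to the serializer, so a reader should look attributes up by name rather than by position.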

    // In Java 7 see java.nio.charset.StandardCharsets.
    private static final Charset utf8 = Charset.forName("UTF-8");

Finally, we have a way to read a document that was given to us by the main program.  The standard Java libraries give you two types of options for reading an XML document.  There are low level routines, where you create one type of stream only to convert it to another type of stream, etc., in a long and convoluted process.  And there are high level routines, like “load this document from an HTTP server and parse it for me.”  But there’s nothing in the middle.

That’s what I created here, the middle layer.  All of those long and complicated stream functions give you options that almost no one ever uses.  They are complicated and you always have to look them up to get your code right.  And the high level routines are only good if you want exactly what they are offering.  Sure, if I wanted to load a message from a normal web server or FTP server, that would be great.  But Java doesn’t know about my proprietary high performance server protocol, so that doesn’t work for me.

This function takes in a string, which is a very common way to store this type of data.  The string could come from anywhere.  A file.  A network connection.  I don’t care.  This is a good stopping point, where one part of the program can hand the data to the next.
    /**
     * Parse a string to create XML.
     *
     * @param body The string to be parsed.
     * @return The top level element of the XML document. null on error.
     */
    public static Element get(String body) {
        return get(body.getBytes(utf8));
    }

}

Let me end with a few more thoughts on “the middle layer.”  That’s a phrase that has come to define my personal speciality.  I’ve worked on a lot of teams over the years.  (Long gone are the days when Steve Wozniak could build and program the computer all by himself!)  I do a lot of things.  But the part that I do better than anyone is this middle layer.  There is a wide variety of tools out there, and some are easier to use than others.  There are a lot of people out there who need a specialized product, and a lot of programmers building those products for them.  I often find myself cleaning up the raw pieces of the solution so it’s easier for other programmers to put them together.

I often write about the easiest way to do something.  Now you know the secrets to making these things easy.

It’s our goal at Trade-Ideas to make it easy for you to get the data you need.  We start by making our own lives easier.