Lesson 8 - Reading XML via the SAX approach in Java

In the previous lesson, Writing XML Files via the SAX Approach in Java, we introduced the XML format and created a simple XML file using the SAX approach. In today's tutorial, we'll continue with the demo and implement the opposite process: loading the XML file with users and building the corresponding object structure (the user list) from it.

For completeness' sake, we'll show what our file.xml XML file looks like at the moment:

<?xml version="1.0" encoding="utf-8"?>
<users>
  <user age="22">
    <name>John Smith</name>
    <registered>3/21/2000</registered>
  </user>
  <user age="31">
    <name>James Brown</name>
    <registered>10/30/2016</registered>
  </user>
  <user age="16">
    <name>Tom Hanks</name>
    <registered>1/12/2011</registered>
  </user>
</users>

Here's what our User.java class looks like:

import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

public class User {

    private String name;
    private int age;
    private LocalDate registered;
    public static final DateTimeFormatter dateTimeFormatter = DateTimeFormatter.ofPattern("M/d/yyyy");

    public User(String name, int age, LocalDate registered) {
        this.name = name;
        this.age = age;
        this.registered = registered;
    }

    @Override
    public String toString() {
        return String.format("%s, %d, %s", name, age, dateTimeFormatter.format(registered));
    }

    public String getName() {
        return name;
    }

    public int getAge() {
        return age;
    }

    public LocalDate getRegistered() {
        return registered;
    }

}

Now, let's create a new project. Once again, it'll be a console application. We'll name it XmlSaxReading and copy our XML file to the project folder.

We'll inherit the main class, XmlSaxReading, from org.xml.sax.helpers.DefaultHandler. This gives us access to the methods we'll need later when parsing the file. We'll also add the User class to the project. We want to load the users into a collection, so we'll create an empty ArrayList named users.

private List<User> users = new ArrayList<>();

Constants

Before we move on to reading, we'll create a helper class to store constants with the names of the elements and attributes in the XML file:

public final class Constants {

    public static final String USERS = "users";

    public static final String USER = "user";

    public static final String AGE = "age";
    public static final String NAME = "name";
    public static final String REGISTERED = "registered";

}

Reading XML via SAX

In the main class we'll create a private parse(String file) method, which will accept the path to the XML file as a parameter:

private void parse(String file) throws SAXException, IOException, ParserConfigurationException {
    // TODO implement the method body
}

In the body of this method, we'll start parsing. Java provides the abstract SAXParser class for reading XML via SAX. We obtain an instance of it through a factory by calling SAXParserFactory.newInstance().newSAXParser(). Then we simply call parse() on the parser instance, passing the file we want to parse and the handler that takes care of the parsing (here, our own class) as parameters. The method body will look like this:

private void parse(String file) throws SAXException, IOException, ParserConfigurationException {
    // Create a parser instance
    SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
    // Start parsing
    parser.parse(new File(file), this);
    // Print the users to the console
    users.forEach(System.out::println);
}
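
Besides a File, parse() also has an overload that accepts an InputStream, which makes it easy to try a handler on a short piece of XML held in memory. The following is only a minimal sketch of that idea. The ElementCounter class and its countElements() method are hypothetical helpers that aren't part of the lesson's project, and the checked exceptions are wrapped in an unchecked one just to keep the example short:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class, not part of the lesson's project
class ElementCounter extends DefaultHandler {

    private int count = 0;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) {
        // Called once for every opening tag the parser encounters
        count++;
    }

    // Parses XML held in a string and returns the number of elements found
    static int countElements(String xml) {
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            ElementCounter handler = new ElementCounter();
            byte[] bytes = xml.getBytes(StandardCharsets.UTF_8);
            parser.parse(new ByteArrayInputStream(bytes), handler);
            return handler.count;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Parsing failed", e);
        }
    }

    public static void main(String[] args) {
        // One <users> element and two <user> elements: prints 3
        System.out.println(countElements("<users><user age=\"22\"/><user age=\"31\"/></users>"));
    }
}
```

The handler is passed to parse() in the same way as `this` in our parse() method. Only the source of the data differs.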

Let's prepare the necessary variables for the user fields. We can't assign values directly to a User instance since it has no setters. We could add setters, but then we'd lose part of the encapsulation. We'll initialize the fields with default values, which will remain there in case the value isn't present in the XML file. Then we'll create variables indicating that we're processing the name or the registration date:

private String name = "";
private int age = 0;
private LocalDate registered = LocalDate.now();

private boolean processingName = false;
private boolean processingRegistered = false;

Now is the time to override the methods the DefaultHandler class provides us with. We'll override three methods: startElement(), endElement(), and characters():

@Override
public void startElement(String uri, String localName, String qName, Attributes attributes) throws SAXException {
    // This method is called every time the parser finds a new element
}

@Override
public void endElement(String uri, String localName, String qName) throws SAXException {
    // This method is called every time the parser finds a closing tag
}

@Override
public void characters(char[] ch, int start, int length) throws SAXException {
    // This method is called every time the parser reports text found inside an element
}

startElement()

In the startElement() method, two parameters are of particular interest: qName and attributes. The qName parameter contains the name of the element that is currently being processed, and attributes contains the attributes of this element. To find out which element is currently being processed, we use a simple switch:

public void startElement(String uri, String localName, String qName, Attributes attributes) throws SAXException {
    switch (qName) {
        case Constants.USER:
            // We'll get the user's age from the attribute
            age = Integer.parseInt(attributes.getValue(Constants.AGE));
            break;
        case Constants.NAME:
            // To process the name, we need to remember that we're inside it, since we'll read the value elsewhere
            processingName = true;
            break;
        case Constants.REGISTERED:
            // For the registration date, we need to remember the same thing
            processingRegistered = true;
            break;
    }
}
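
Note that attributes.getValue() returns null when the attribute is missing, and Integer.parseInt() then throws a NumberFormatException. If the age attribute might be absent or malformed, a small helper can fall back to a default value. The following is a sketch under that assumption. The AgeAttribute class and the readAge() method are hypothetical names, and AttributesImpl is used only to build sample attributes without running a parser:

```java
import org.xml.sax.Attributes;
import org.xml.sax.helpers.AttributesImpl;

// Hypothetical helper, not part of the lesson's project
class AgeAttribute {

    // Returns the value of the "age" attribute, or defaultAge if it's missing or not a number
    static int readAge(Attributes attributes, int defaultAge) {
        String raw = attributes.getValue("age");
        if (raw == null) {
            return defaultAge; // the attribute isn't present at all
        }
        try {
            return Integer.parseInt(raw.trim());
        } catch (NumberFormatException e) {
            return defaultAge; // the attribute is present but isn't a valid integer
        }
    }

    public static void main(String[] args) {
        AttributesImpl withAge = new AttributesImpl();
        // Arguments: namespace URI, local name, qualified name, type, value
        withAge.addAttribute("", "age", "age", "CDATA", "22");
        System.out.println(readAge(withAge, 0));              // prints 22
        System.out.println(readAge(new AttributesImpl(), 0)); // prints 0
    }
}
```

Whether a silent default or a thrown exception is the better reaction depends on how strictly the file format should be enforced.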

endElement()

In the endElement() method, which is called when a closing tag is being processed, we'll simply set the corresponding indicator back to false:

public void endElement(String uri, String localName, String qName) throws SAXException {
    switch (qName) {
        case Constants.NAME:
            // If we were processing the name, we'll set this indicator to false
            processingName = false;
            break;
        case Constants.REGISTERED:
            // If we were processing the registration date, we'll set this indicator to false
            processingRegistered = false;
            break;
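        case Constants.USER:
            // Once we've read all the user data, we can create a new user instance and add it to the collection
            User user = new User(name, age, registered);
            users.add(user);
            // Reset the fields so a missing value doesn't inherit the previous user's data
            name = "";
            age = 0;
            registered = LocalDate.now();
            break;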
    }
}

characters()

The last method we need to override is the characters() method, reading the value between the element tags. We'll use our indicators to find out what we're going to read. So the method will look like this:

public void characters(char[] ch, int start, int length) throws SAXException {
    // We build a string from the reported part of the character array
    String text = new String(ch, start, length);
    if (processingName) { // When processing the name, we simply assign it
        name = text;
    } else if (processingRegistered) { // When processing the registration date, we parse it
        registered = LocalDate.parse(text, User.dateTimeFormatter);
    }
}
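
One thing to keep in mind: the SAX API allows a parser to deliver the text of a single element in several characters() calls, so a single assignment may capture only part of the value. A common precaution is to append the chunks to a StringBuilder and use the result in endElement(). Here's a minimal sketch of that approach, which reads only the names. The NameCollector class and the readNames() method are hypothetical and the XML is parsed from a string to keep the example self-contained:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class, not part of the lesson's project
class NameCollector extends DefaultHandler {

    private final List<String> names = new ArrayList<>();
    private final StringBuilder buffer = new StringBuilder();
    private boolean processingName = false;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) {
        if ("name".equals(qName)) {
            processingName = true;
            buffer.setLength(0); // start with an empty buffer for each name
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        if (processingName) {
            buffer.append(ch, start, length); // may be called more than once per element
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if ("name".equals(qName)) {
            names.add(buffer.toString().trim()); // the whole text is available only now
            processingName = false;
        }
    }

    // Parses XML held in a string and returns the text of all <name> elements
    static List<String> readNames(String xml) {
        try {
            NameCollector handler = new NameCollector();
            byte[] bytes = xml.getBytes(StandardCharsets.UTF_8);
            SAXParserFactory.newInstance().newSAXParser().parse(new ByteArrayInputStream(bytes), handler);
            return handler.names;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Parsing failed", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<users><user><name>John Smith</name></user>"
                + "<user><name>Tom &amp; Jerry</name></user></users>";
        System.out.println(readNames(xml)); // prints [John Smith, Tom & Jerry]
    }
}
```

The same buffering could be applied to the registration date by parsing the buffered text in endElement() instead of in characters().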

If we have a lot of fields to assign, the characters() method will start to grow out of control. An alternative is to use a HashMap: we create a lambda function that processes each field and store it in the HashMap under the field name as the key. You can read more about the implementation in the lesson on zip files.
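
To give a rough idea of the HashMap approach, here's a minimal sketch. It isn't the implementation from the lesson mentioned above. The FieldDispatcher class and its assign() method are hypothetical, and the parser is left out so that only the dispatching is shown:

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical demo class, not part of the lesson's project
class FieldDispatcher {

    private static final DateTimeFormatter FORMATTER = DateTimeFormatter.ofPattern("M/d/yyyy");

    private String name = "";
    private LocalDate registered = LocalDate.now();

    // Element name -> lambda that stores the element's text in the right field
    private final Map<String, Consumer<String>> setters = new HashMap<>();

    FieldDispatcher() {
        setters.put("name", text -> name = text);
        setters.put("registered", text -> registered = LocalDate.parse(text, FORMATTER));
    }

    // A handler would call this with the name of the current element and its text
    void assign(String element, String text) {
        Consumer<String> setter = setters.get(element);
        if (setter != null) { // elements without a registered lambda are ignored
            setter.accept(text);
        }
    }

    String getName() {
        return name;
    }

    LocalDate getRegistered() {
        return registered;
    }

    public static void main(String[] args) {
        FieldDispatcher dispatcher = new FieldDispatcher();
        dispatcher.assign("name", "John Smith");
        dispatcher.assign("registered", "3/21/2000");
        System.out.println(dispatcher.getName() + ", " + dispatcher.getRegistered()); // prints John Smith, 2000-03-21
    }
}
```

With this layout, adding another field means adding one more entry to the map rather than another branch in characters(). The handler then only needs to remember the name of the element it's currently inside.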

And the parsing is done. Finally, we'll add the main() method to create a new instance and start parsing:

public static void main(String[] args) {
    try {
        new XmlSaxReading().parse("file.xml");
    } catch (SAXException | IOException | ParserConfigurationException e) {
        e.printStackTrace();
    }
}

Running the code prints the three users read from the file:

Console application
John Smith, 22, 3/21/2000
James Brown, 31, 10/30/2016
Tom Hanks, 16, 1/12/2011

If you didn't enjoy the reading part too much, I'll be honest with you. While generating a new XML file is very simple and natural using SAX, reading is quite confusing. Next time, in the lesson Working with XML files using the DOM approach in Java, we'll look at DOM, i.e. the object-oriented approach to XML documents.


 

Article has been written for you by David Capka Hartinger