Lesson 7 - Reading XML via the SAX approach in C# .NET


In the previous lesson, Introduction to XML and writing via the SAX approach, we introduced the XML format and created a simple XML file using the SAX approach. In today's tutorial, we'll continue with the demo and write the opposite process: loading the XML file with users and building the appropriate object structure (the user list) from it.

For completeness' sake, we'll show what our file.xml XML file looks like at the moment:

<?xml version="1.0" encoding="utf-8"?>
<users>
  <user age="22">
    <name>John Smith</name>
    <registered>3/21/2000</registered>
  </user>
  <user age="31">
    <name>James Brown</name>
    <registered>10/30/2016</registered>
  </user>
  <user age="16">
    <name>Tom Hanks</name>
    <registered>1/12/2011</registered>
  </user>
</users>

Here's what our User.cs class looks like:

class User
{
        public string Name { get; private set; }
        public int Age { get; private set; }
        public DateTime Registered { get; private set; }

        public User(string name, int age, DateTime registered)
        {
                Name = name;
                Age = age;
                Registered = registered;
        }

        public override string ToString()
        {
                return Name;
        }

}

Now, create a new project. Once again, it'll be a console application. Name it XmlSaxReading and copy our XML file to its bin/Debug folder. We'll also need to add the User class to the project. We'll load the users into a collection, so let's create an empty users List. We'll write the code right into the Main() method for the sake of simplicity (you already know how to design it properly in an object-oriented way).

List<User> users = new List<User>();

Reading XML via SAX

The .NET framework provides the XmlReader class for reading XML via SAX. (Strictly speaking, XmlReader is a forward-only pull parser rather than an event-driven SAX parser, but the streaming idea is the same.) Let's create an instance of it. Just like with the XmlWriter class, we'll use the Create() factory method, whose parameter is the filename. Don't forget to add using System.Xml; at the top of the file. Everything will be in a using block, which will take care of closing the file:

using (XmlReader xr = XmlReader.Create(@"file.xml"))
{
}

Let's prepare the necessary variables for the user properties. We can't assign values directly to the instance since its properties can only be set inside the class. Another option would be to allow them to be modified externally, but then we'd lose part of the encapsulation. We'll initialize the variables with default values, which will remain there if a value isn't present in the XML file. Note that in this simple version the variables aren't reset between users, so a value missing for a later user would keep the previous user's value. The current element's name needs to be stored somewhere, so we'll declare a string variable element for it. We'll write the code inside the using block.

string name = "";
int age = 0;
DateTime registered = DateTime.Now;
string element = "";

Let's start parsing the file. The XmlReader class reads a document node by node, from top to bottom. We call the Read() method on its instance, and each call moves the reader to the next node. A node can be an element, an element's text value, or a closing element (we'll mainly focus on Element, Text, and EndElement). Another node type is a comment, which isn't important for us at the moment. Attributes aren't returned as separate nodes by Read(); we read them from the element they belong to. Once the reader reaches the end of the file, Read() returns false; otherwise, it returns true. We'll go through the document nodes one by one using a while loop:

while (xr.Read())
{
}
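To get a feel for what Read() steps through, here's a minimal sketch that records the type, name, and value of every node. The NodeDump class and its Describe() method are hypothetical names used only for this illustration, and the XML is read from a string via StringReader instead of from file.xml so the snippet runs on its own:

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

static class NodeDump
{
    // Returns one "NodeType:Name:Value" entry for every successful Read() call.
    public static List<string> Describe(string xml)
    {
        var result = new List<string>();
        using (XmlReader xr = XmlReader.Create(new StringReader(xml)))
        {
            while (xr.Read())
            {
                result.Add(xr.NodeType + ":" + xr.Name + ":" + xr.Value);
            }
        }
        return result;
    }

    static void Main()
    {
        string xml = "<user age=\"22\"><name>John Smith</name></user>";
        foreach (string line in Describe(xml))
        {
            Console.WriteLine(line);
        }
    }
}
```

For this input the loop visits five nodes: the user element, the name element, the text John Smith, and the two closing elements. The age attribute doesn't show up as a node of its own.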

There are several useful properties on the XmlReader instance. We'll be using NodeType, which stores the type of the node the reader is currently positioned on. We'll also use the Name and Value properties, which store the name of the current node and its value (if it has any).

We'll mainly focus on two types of nodes: Element and Text. Let's react to them. We'll add in empty condition bodies for now:

// reads the element
if (xr.NodeType == XmlNodeType.Element)
{
}
// reads the element value
else if (xr.NodeType == XmlNodeType.Text)
{
}

Now, let's add code to the first condition. This is the branch that reacts to reading an element, and we'll need to perform two actions in it.

The key action will be to save the element name to the element variable. This will enable us to determine which element's text we're reading in the second condition.

Every time we encounter a user element, we load the age attribute using the GetAttribute() method, whose parameter is the attribute's name. The current element's attribute can be read easily. However, reading the element's value isn't that simple. Although there are methods like ReadContentAs...() and ReadElementContentAs...(), beware: they advance the reader past the content they consume, so the next Read() call in the while loop can skip a node. Whether reading values like this works depends on the structure of the file. I tried to find a workaround for this, but the solutions were so awkward that I ended up not using these methods at all. Here's what the first condition looks like:

element = xr.Name; // the name of the current element
if (element == "user")
{
        age = int.Parse(xr.GetAttribute("age"));
}
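The side effect mentioned above can be reproduced in a few lines. The following is only a sketch: SkipDemo and SeenElements() are hypothetical names, and it uses ReadElementContentAsString() as one representative of that method family. It records which elements the while loop actually gets to see:

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

static class SkipDemo
{
    // Returns the names of the elements that the while loop visits.
    public static List<string> SeenElements(string xml)
    {
        var seen = new List<string>();
        using (XmlReader xr = XmlReader.Create(new StringReader(xml)))
        {
            while (xr.Read())
            {
                if (xr.NodeType == XmlNodeType.Element)
                {
                    seen.Add(xr.Name);
                    if (xr.Name == "name")
                    {
                        // Consumes <name>...</name> and leaves the reader
                        // on whatever node follows the closing tag.
                        xr.ReadElementContentAsString();
                    }
                }
            }
        }
        return seen;
    }

    static void Main()
    {
        string compact = "<user><name>John</name><registered>3/21/2000</registered></user>";
        Console.WriteLine(string.Join(",", SeenElements(compact)));
    }
}
```

With the compact input, the reader already sits on registered after the name has been consumed, so the following Read() steps over it and the loop only sees user and name. When the elements are separated by line breaks and indentation, a whitespace node absorbs the extra step and all three elements are seen, which is why the problem only appears in some files.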

Let's move on to the next branch, i.e. processing the element's value. We'll switch on the element variable and assign the value to the corresponding variable according to the element name:

switch (element)
{
        case "name":
                name = xr.Value;
                break;
        case "registered":
                registered = DateTime.Parse(xr.Value);
                break;
}
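One thing to keep in mind is that DateTime.Parse() interprets the text according to the current culture of the machine, so a date like 3/21/2000 may fail to parse on a system that expects the day first. Assuming the file always stores dates in the month/day/year form shown in our sample, a more predictable variant could look like this sketch (RegisteredDate is a hypothetical helper name):

```csharp
using System;
using System.Globalization;

static class RegisteredDate
{
    // Parses dates written as month/day/year, e.g. "3/21/2000",
    // independently of the machine's regional settings.
    public static DateTime Parse(string text)
    {
        return DateTime.ParseExact(text, "M/d/yyyy", CultureInfo.InvariantCulture);
    }

    static void Main()
    {
        DateTime registered = Parse("10/30/2016");
        Console.WriteLine(registered.Year + "-" + registered.Month + "-" + registered.Day);
    }
}
```

In the switch, the registered case would then call RegisteredDate.Parse(xr.Value) instead of DateTime.Parse(xr.Value). For this to be reliable, the writing side should produce the same fixed format.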

We're very close to finishing. The brighter ones among you surely noticed that we aren't adding the users to the list anywhere yet. We'll do so once we reach the closing user element. Let's add one last condition:

// reads the closing element
else if ((xr.NodeType == XmlNodeType.EndElement) && (xr.Name == "user"))
        users.Add(new User(name, age, registered));

We're done :)

For completeness' sake, here's all of the code needed to load the file:

using (XmlReader xr = XmlReader.Create(@"file.xml"))
{
        string name = "";
        int age = 0;
        DateTime registered = DateTime.Now;
        string element = "";
        while (xr.Read())
        {
                // reads the element
                if (xr.NodeType == XmlNodeType.Element)
                {
                        element = xr.Name; // the name of the current element
                        if (element == "user")
                        {
                                age = int.Parse(xr.GetAttribute("age"));
                        }
                }
                // reads the element value
                else if (xr.NodeType == XmlNodeType.Text)
                {
                        switch (element)
                        {
                                case "name":
                                        name = xr.Value;
                                        break;
                                case "registered":
                                        registered = DateTime.Parse(xr.Value);
                                        break;
                        }
                }
                // reads the closing element
                else if ((xr.NodeType == XmlNodeType.EndElement) && (xr.Name == "user"))
                        users.Add(new User(name, age, registered));
        }
}
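One detail of the loop above is worth checking: name, age, and registered are declared once, outside the loop, so a user without a name element would inherit the name of the previous user. A possible variation, shown here only as a sketch, resets the values whenever an opening user element is read. To keep it short, the hypothetical UserLoader.Load() method reads from a string, skips the registered date, and returns plain "name, age" strings instead of User instances:

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

static class UserLoader
{
    // Returns one "name, age" string per <user> element.
    public static List<string> Load(string xml)
    {
        var users = new List<string>();
        string name = "";
        int age = 0;
        string element = "";
        using (XmlReader xr = XmlReader.Create(new StringReader(xml)))
        {
            while (xr.Read())
            {
                if (xr.NodeType == XmlNodeType.Element)
                {
                    element = xr.Name;
                    if (element == "user")
                    {
                        // Start every user from the defaults.
                        name = "";
                        // TryParse leaves age at 0 when the attribute is missing or invalid.
                        int.TryParse(xr.GetAttribute("age"), out age);
                    }
                }
                else if (xr.NodeType == XmlNodeType.Text)
                {
                    if (element == "name")
                    {
                        name = xr.Value;
                    }
                }
                else if (xr.NodeType == XmlNodeType.EndElement && xr.Name == "user")
                {
                    users.Add(name + ", " + age);
                }
            }
        }
        return users;
    }

    static void Main()
    {
        string xml = "<users><user age=\"22\"><name>John Smith</name></user>"
                   + "<user age=\"31\"></user></users>";
        foreach (string u in Load(xml))
        {
            Console.WriteLine(u);
        }
    }
}
```

Here the second user has no name, and thanks to the reset it ends up with an empty name rather than John Smith. Keep in mind that a self-closing element such as <user age="31" /> produces no EndElement node at all, so it would need separate handling (for example via the IsEmptyElement property).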

We still have to print the users out so we know that we loaded them properly. We'll modify the ToString() method in the User class so that it returns all of the values:

public override string ToString()
{
        return String.Format("{0}, {1}, {2}", Name, Age, Registered.ToShortDateString());
}

Then, we'll simply print the users:

// printing the loaded objects
foreach (User u in users)
{
        Console.WriteLine(u);
}
Console.ReadKey();

The result:

Console application
John Smith, 22, 3/21/2000
James Brown, 31, 10/30/2016
Tom Hanks, 16, 1/12/2011

If you didn't like the loading process much, I'm right there with you. Generating a new XML file via SAX is simple and natural, but loading one this way is awkward. In the next lesson, Working with XML files using the DOM approach in C# .NET, we'll look at DOM, the object-based approach to working with XML documents.


 


Article has been written for you by David Capka