An XML Adventure: Part 2 - Go

Honestly, I probably didn't need to look into Go. Python gave me the answers and results I needed, and I probably should have stopped there. But I felt an inescapable draw to the new(ish) shiny language spawned from the greats at Google.

What I like about the Go XML package, and Go in general, is how it seems to make the complex simple without sacrificing power and flexibility. The tradeoff is that, to get that power and flexibility, we have to write a little more code. In Python, all we had to do was set a root for our tree and watch it grow. In Go, we have to tell the unmarshaller how the tree should grow. In other words, we have to build its structures ourselves.

I'll use the same example XML as I did in Part 1, reproduced below:

<?xml version="1.0"?>  
<data>  
    <country name="Liechtenstein">
        <rank>1</rank>
        <year>2008</year>
        <gdppc>141100</gdppc>
        <neighbor name="Austria" direction="E"/>
        <neighbor name="Switzerland" direction="W"/>
    </country>
    <country name="Singapore">
        <rank>4</rank>
        <year>2011</year>
        <gdppc>59900</gdppc>
        <neighbor name="Malaysia" direction="N"/>
    </country>
    <country name="United States of America">
        <rank>3</rank>
        <year>2017</year>
        <gdppc>6000000</gdppc>
        <neighbor name="Canada" direction="N"/>
        <neighbor name="Mexico" direction="S"/>
        <states>
            <state name="Texas">
                <capital name="Austin"/>
                <flower name="Bluebonnet"/>
            </state>
            <state name="Virginia">
                <capital name="Richmond"/>
                <flower name="Dogwood"/>
            </state>
            <state name="Florida">
                <capital name="Tallahassee"/>
                <flower name="Orange Blossom"/>
            </state>
        </states>
    </country>
</data> 

We'll use structs to represent the elements of our XML tree. Go's XML package, the unmarshaller specifically, makes use of another Go package called "reflect". The unmarshaller can only assign to "exported" struct fields, so every field we want filled in has to be exported. An exported field is a field that can be accessed directly by other packages. Unexported fields, on the other hand, can only be accessed directly from within the same package. We capitalize the first letter of a field to export it, so we will capitalize all of our fields. If we wanted unexported fields, we'd leave the first letter lowercase.

Next, we'll guide the unmarshaller through the tree with tags on our fields: xml:"name-of-element" to select the text value of a child element and xml:"name-of-attribute,attr" to select the value of an attribute. Tag names are matched case-sensitively against the names in the XML.

type CountryData struct {
	Countries []Country `xml:"country"`
}

type Country struct {
	XMLName   xml.Name   `xml:"country"`
	Name      string     `xml:"name,attr"`
	Rank      string     `xml:"rank"`
	Gdppc     string     `xml:"gdppc"`
	Neighbors []Neighbor `xml:"neighbor"`
	Capitals  []Capital  `xml:"states>state>capital"`
}

type Neighbor struct {
	Name      string `xml:"name,attr"`
	Direction string `xml:"direction,attr"`
}

type Capital struct {
	Name string `xml:"name,attr"`
}
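
To see why the capital letter matters, here is a minimal sketch. The exportDemo struct and the parseExportDemo helper are hypothetical names made up for this illustration; they are not part of the program in this post. The lowercase rank field has exactly the same name as the <rank> element, yet the unmarshaller skips it because it is unexported.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// exportDemo is a hypothetical struct used only for this illustration.
type exportDemo struct {
	// Exported and tagged: Unmarshal can fill this in from <year>.
	Year string `xml:"year"`
	// Unexported: Unmarshal skips it, even though the name matches <rank>.
	rank string
}

// parseExportDemo unmarshals a small XML fragment into an exportDemo.
func parseExportDemo(doc string) (exportDemo, error) {
	var d exportDemo
	err := xml.Unmarshal([]byte(doc), &d)
	return d, err
}

func main() {
	d, err := parseExportDemo(`<country><rank>1</rank><year>2008</year></country>`)
	if err != nil {
		fmt.Println("Error:", err)
		return
	}
	// Year is populated; rank is left as the empty string.
	fmt.Printf("Year=%q rank=%q\n", d.Year, d.rank)
}
```

Note that no error is reported for the skipped field. The value is silently dropped, which is an easy mistake to miss.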

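Now that we've got our structure, we can unmarshal the XML. Just like with Python, we'll pass the XML file in as an argument at the command line. We'll use Go's "io/ioutil" and "os" packages to read it. Since os.Args[0] holds the program name, the file path is the first command-line argument, which lands in os.Args[1]. (As of Go 1.16, "io/ioutil" is deprecated and os.ReadFile is the recommended replacement for ioutil.ReadFile; the older call still works.)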

package main

import (
	"encoding/xml"
	"fmt"
	"io/ioutil"
	"os"
)

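func main() {
	// os.Args[0] is the program name, so the XML file path is os.Args[1].
	if len(os.Args) < 2 {
		fmt.Println("usage: provide the path to an XML file")
		os.Exit(1)
	}

	xmlFile, err := ioutil.ReadFile(os.Args[1])
	if err != nil {
		panic(err)
	}

	XMLTree := CountryData{}

	// ReadFile already returns a []byte, so no conversion is needed.
	err = xml.Unmarshal(xmlFile, &XMLTree)
	if err != nil {
		fmt.Printf("Error: %v\n", err)
		os.Exit(1)
	}
}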

And that's all, folks. Once we've built the structs, not much else is needed to unmarshal our XML. We read the XML file, declare a variable to hold the root element, and then unmarshal into it.
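
The program above unmarshals the file but never looks at the result, so here is a rough sketch of walking the tree afterwards. It repeats the structs from earlier so that it runs on its own, and it reads from an inline, trimmed copy of the example XML instead of a file. The summarize helper and the sampleXML constant are names I made up for the illustration.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

type CountryData struct {
	Countries []Country `xml:"country"`
}

type Country struct {
	XMLName   xml.Name   `xml:"country"`
	Name      string     `xml:"name,attr"`
	Rank      string     `xml:"rank"`
	Gdppc     string     `xml:"gdppc"`
	Neighbors []Neighbor `xml:"neighbor"`
	Capitals  []Capital  `xml:"states>state>capital"`
}

type Neighbor struct {
	Name      string `xml:"name,attr"`
	Direction string `xml:"direction,attr"`
}

type Capital struct {
	Name string `xml:"name,attr"`
}

// sampleXML is a trimmed copy of the example document (assumption: inline
// data stands in for the file passed on the command line).
const sampleXML = `<?xml version="1.0"?>
<data>
    <country name="Liechtenstein">
        <rank>1</rank>
        <year>2008</year>
        <gdppc>141100</gdppc>
        <neighbor name="Austria" direction="E"/>
        <neighbor name="Switzerland" direction="W"/>
    </country>
    <country name="United States of America">
        <rank>3</rank>
        <year>2017</year>
        <gdppc>6000000</gdppc>
        <neighbor name="Canada" direction="N"/>
        <neighbor name="Mexico" direction="S"/>
        <states>
            <state name="Texas"><capital name="Austin"/></state>
            <state name="Virginia"><capital name="Richmond"/></state>
        </states>
    </country>
</data>`

// summarize (hypothetical helper) unmarshals doc and returns one line per country.
func summarize(doc []byte) ([]string, error) {
	var tree CountryData
	if err := xml.Unmarshal(doc, &tree); err != nil {
		return nil, err
	}
	var lines []string
	for _, c := range tree.Countries {
		lines = append(lines, fmt.Sprintf("%s: rank %s, gdppc %s, %d neighbor(s), %d capital(s)",
			c.Name, c.Rank, c.Gdppc, len(c.Neighbors), len(c.Capitals)))
	}
	return lines, nil
}

func main() {
	lines, err := summarize([]byte(sampleXML))
	if err != nil {
		fmt.Println("Error:", err)
		return
	}
	for _, line := range lines {
		fmt.Println(line)
	}
}
```

Once the data is in structs, it is ordinary Go: slices to range over and fields to read, with no XML-specific API left in sight.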

I have to write a lot more code here than I did with Python, but I get to control the design of the tree. Unmarshal takes my XML and assigns the data to the corresponding struct fields. If Unmarshal doesn't find a corresponding struct field for some of the data, that data gets discarded. This way of doing things results in more code, but much more flexibility. For example, if I decide I just want information about state capitals without the rest of the state information, I have that option. You can see an example of this in my code in the "Country" struct where I use a field tag and the ">" character to instruct Unmarshal to descend from "states" to "state" to "capital".
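
Both behaviors can be seen in a small sketch. The capitalsOnly struct and the capitalNames helper are hypothetical names for this illustration. The struct deliberately has no fields for rank, year, gdppc, neighbor or flower, so those elements are dropped, and the ">" path pulls the capitals straight out of the nested states.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// capital holds the name attribute of a <capital> element.
type capital struct {
	Name string `xml:"name,attr"`
}

// capitalsOnly is a hypothetical, stripped-down view of a <country>.
// Anything without a matching field here is discarded by Unmarshal.
type capitalsOnly struct {
	Name     string    `xml:"name,attr"`
	Capitals []capital `xml:"states>state>capital"`
}

// capitalNames (hypothetical helper) returns the capital names found in doc.
func capitalNames(doc string) ([]string, error) {
	var c capitalsOnly
	if err := xml.Unmarshal([]byte(doc), &c); err != nil {
		return nil, err
	}
	names := make([]string, 0, len(c.Capitals))
	for _, cp := range c.Capitals {
		names = append(names, cp.Name)
	}
	return names, nil
}

const usaXML = `<country name="United States of America">
    <rank>3</rank>
    <year>2017</year>
    <gdppc>6000000</gdppc>
    <neighbor name="Canada" direction="N"/>
    <states>
        <state name="Texas">
            <capital name="Austin"/>
            <flower name="Bluebonnet"/>
        </state>
        <state name="Virginia">
            <capital name="Richmond"/>
            <flower name="Dogwood"/>
        </state>
        <state name="Florida">
            <capital name="Tallahassee"/>
            <flower name="Orange Blossom"/>
        </state>
    </states>
</country>`

func main() {
	names, err := capitalNames(usaXML)
	if err != nil {
		fmt.Println("Error:", err)
		return
	}
	fmt.Println(names)
}
```

The trade-off is that the state names are gone: the path tag flattens the capitals into one slice, so this shape only suits cases where the intermediate elements really don't matter.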

The thing I like best about using the Go XML package is that I can add additional fields and methods to my structs if I want additional functionality. For example, we could add a gdppc comparison method to the Country struct. The result might look something like this:

type Country struct {
	XMLName   xml.Name   `xml:"country"`
	Name      string     `xml:"name,attr"`
	Rank      string     `xml:"rank"`
	Gdppc     string     `xml:"gdppc"`
	Neighbors []Neighbor `xml:"neighbor"`
	Capitals  []Capital  `xml:"states>state>capital"`
}

// gdppcDifference returns this country's gdppc minus otherGdppc.
// It needs "strconv" added to the import list.
func (c Country) gdppcDifference(otherGdppc int) (int, error) {
	thisGdppc, err := strconv.Atoi(c.Gdppc)
	if err != nil {
		return 0, fmt.Errorf("converting %v gdppc to int: %w", c.Name, err)
	}
	return thisGdppc - otherGdppc, nil
}
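
Calling the method might look something like the sketch below. To keep it self-contained I trimmed Country down to the two fields the method touches and built the values by hand instead of unmarshalling them. The numbers are the gdppc values for Liechtenstein and Singapore from the example XML.

```go
package main

import (
	"fmt"
	"strconv"
)

// Country is trimmed to the fields gdppcDifference needs (assumption: the
// full struct from the post would work the same way).
type Country struct {
	Name  string `xml:"name,attr"`
	Gdppc string `xml:"gdppc"`
}

// gdppcDifference returns this country's gdppc minus otherGdppc.
func (c Country) gdppcDifference(otherGdppc int) (int, error) {
	thisGdppc, err := strconv.Atoi(c.Gdppc)
	if err != nil {
		return 0, fmt.Errorf("converting %v gdppc to int: %w", c.Name, err)
	}
	return thisGdppc - otherGdppc, nil
}

func main() {
	liechtenstein := Country{Name: "Liechtenstein", Gdppc: "141100"}

	// Compare against Singapore's gdppc of 59900.
	diff, err := liechtenstein.gdppcDifference(59900)
	if err != nil {
		fmt.Println("Error:", err)
		return
	}
	fmt.Printf("%s is ahead by %d\n", liechtenstein.Name, diff)

	// A non-numeric value surfaces as an error rather than a bogus number.
	broken := Country{Name: "Nowhere", Gdppc: "n/a"}
	if _, err := broken.gdppcDifference(0); err != nil {
		fmt.Println("Error:", err)
	}
}
```

Because Gdppc is stored as a string, every comparison pays for a conversion and an error check. Declaring the field as an int and letting Unmarshal do the parsing would be another option, at the cost of Unmarshal failing outright on bad data.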

I'm still working out the kinks in my Go code, but I've really enjoyed using it. The flexibility afforded by building my own structs makes me feel like I can build a truly robust system. Where Python was great for quick and simple searches, Go's XML package seems better for building a more complete tool. Cheers!

Fancy Gopher by Renee French

Reference:
Go XML Package, Python argparse, Python XML Module