Simple XML parsing with Python

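There are many ways to parse XML with Python. The standard library offers at least two: DOM and SAX. The Document Object Model (DOM) is a cross-language API from the W3C for accessing and modifying XML documents; it loads the whole document into a tree in memory. SAX stands for Simple API for XML; it is event-based and calls your handler as the parser reads each element.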
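
To make the SAX side concrete, here is a minimal sketch using the standard library's xml.sax module. The PersonHandler class and the SAMPLE bytes are names made up for this illustration; the handler simply collects the name attribute of every person element it encounters.

```python
import xml.sax


class PersonHandler(xml.sax.ContentHandler):
    """Hypothetical handler: collects the name attribute of each <person>."""

    def __init__(self):
        super().__init__()
        self.names = []

    def startElement(self, tag, attrs):
        # Called once for every opening tag the parser encounters.
        if tag == "person":
            self.names.append(attrs.get("name"))


# Illustrative input; any small XML document would do.
SAMPLE = (
    b'<root>'
    b'<person name="somebody"></person>'
    b'<person name="otherguy"></person>'
    b'</root>'
)

handler = PersonHandler()
xml.sax.parseString(SAMPLE, handler)
print(handler.names)
```

SAX never builds a tree, so it suits large files, but you have to track state yourself in the handler.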

Most of the time we don't need to understand an entire XML vocabulary; we just want to parse something simple like:
<root>
    <person name="somebody"></person>
    <person name="otherguy"></person>
</root>

I think the simplest way to go is Python's minidom implementation, which looks like this:
from xml.dom import minidom

# parse the xml
theXml = minidom.parse('data.xml')

# iterate through the root
rootList = theXml.getElementsByTagName('root')
for root in rootList:
    # you can get the element name by: root.localName

    # iterate through person
    personList = root.getElementsByTagName('person')
    for person in personList:

        # get the attribute
        nameAttribute = person.attributes["name"]
        print(nameAttribute.name)
        print(nameAttribute.value)
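
Here is a self-contained variant of the same idea, as a sketch rather than the one right way to do it. It parses the sample from a string with minidom.parseString instead of reading data.xml, and reads the attribute with getAttribute, which returns the value directly. SAMPLE_XML and person_names are names I made up for illustration.

```python
from xml.dom import minidom

# Illustrative copy of the sample document shown earlier.
SAMPLE_XML = """<root>
    <person name="somebody"></person>
    <person name="otherguy"></person>
</root>"""


def person_names(xml_text):
    """Return the name attribute of every <person> element in xml_text."""
    doc = minidom.parseString(xml_text)
    # getElementsByTagName searches the whole document, so there is no
    # need to walk through <root> first for a document this simple.
    return [
        person.getAttribute("name")
        for person in doc.getElementsByTagName("person")
    ]


print(person_names(SAMPLE_XML))
```

Note that getAttribute returns an empty string when the attribute is missing, whereas person.attributes["name"] raises a KeyError, so pick whichever behavior fits your data.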
