Featured resource
2025 Tech Upskilling Playbook
Tech Upskilling Playbook

Build future-ready tech teams and hit key business milestones with seven proven plays from industry leaders.

Check it out
  • Lab
    • Libraries: If you want this lab, consider one of these libraries.
    • Cloud
Google Cloud Platform icon
Labs

Parsing XML Documents with Python

XML is the return format for many APIs. It is likely you will work with it in your career. Python is an excellent language for parsing and writing XML to interact with these APIs and other circumstances you may encounter. In this lab we will use Python's `defusedxml` package to write an XML file, allow the intended user to make use of the XML and make changes, and then parse the XML for the changes. You will need basic Python programming and SQL skills for this lab: - [Certified Associate in Python Programming Certification](https://linuxacademy.com/cp/modules/view/id/470)

Google Cloud Platform icon
Lab platform
Lab Info
Level
Beginner
Last updated
Sep 14, 2025
Duration
1h 0m

Contact sales

By filling out this form and clicking submit, you acknowledge our privacy policy.
Table of Contents
  1. Challenge

    Write an XML File

    create_catalog.py contains a skeleton of what we need for this objective. Run it to see that it returns an error:

    python create_catalog.py
    

    The example code showing how to make this work is shown below:

    import sqlite3
    import xml.etree.ElementTree as ET
    from xml.etree.ElementTree import Element, ElementTree
    
    DB_NAME = "author_contracts.db"
    
    def get_db_data():
        """
        write a code to execute the sql script and return the results
        """
    
        sql_query = """SELECT author, title, genre FROM authors"""
        
        con = sqlite3.connect(DB_NAME)
        cur = con.cursor()
    
        cur.execute(sql_query)
        results = cur.fetchall()
    
        cur.close()
        con.close()
    
        return results
    
    def create_book_entry(author, title, genre):
        """
        create the book entry as defined and return book
        """
    
        book = Element('book')
    
        author_tag = ET.SubElement(book, 'author')
        author_tag.text = author
    
        title_tag = ET.SubElement(book, 'title')
        title_tag.text = title
    
        genre_tag = ET.SubElement(book, 'genre')
        genre_tag.text = genre
    
        ET.SubElement(book, 'isbn')
    
        return book
    
    
    # using the information from get_db_data()
    # write code to create a root and then
    # add each book to it, finally write
    # data to "catalog.xml"
    
    root = Element("catalog")
    
    book_info = get_db_data()
    
    for author, title, genre in book_info:
        book = create_book_entry(author, title, genre)
        root.append(book)
    
    tree = ElementTree(element=root)
    tree.write("catalog.xml", encoding="UTF-8", xml_declaration=True)
    
    # test code
    expected_catalog = b"<?xml version='1.0' encoding='UTF-8'?>\n<catalog><book><author>Thompson, Keith</author><title>Oh Python! My Python!</title><genre>biography</genre><isbn /></book><book><author>Fritts, Larry</author><title>Fun with Django</title><genre>satire</genre><isbn /></book><book><author>Applegate, John</author><title>When Bees Attack! The Horror!</title><genre>horror</genre><isbn /></book><book><author>Brown, James</author><title>Martin Buber's Philosophies</title><genre>guide</genre><isbn /></book><book><author>Smith, Jackson</author><title>The Sun Also Orbits</title><genre>mystery</genre><isbn /></book></catalog>"
    
    try:
        with open("catalog.xml", "rb") as f:
            catalog = f.read()
    except FileNotFoundError:
        catalog = ""
    
    assert catalog == expected_catalog
    

    Run python create_catalog.py.

    Congratulations! The data is in a format the ISBN company can use.

  2. Challenge

    Parse XML Changes

    First install defusedxml:

    pip3 install defusedxml
    

    Then we can work on parse_catalog.py. It contains a skeleton of what we need for this next objective. Run python parse_catalog.py before editing, to see that it returns an error.

    The example code showing how to accomplish this second objective is shown below:

    import sqlite3
    import sys
    import defusedxml.ElementTree as ET
    
    DB_NAME = "author_contracts.db"
    
    def update_db(isbn_data_list):
        """ 
        add code to execute each sql_stmt in the order given
        results from sql_query_3 should be assigned to results
        """
        
        sql_query_1 = ''' ALTER TABLE authors ADD COLUMN isbn CHAR(20); '''
    
        sql_query_2 = "UPDATE authors SET isbn = ? WHERE title = ?;"
    
        sql_query_3 = ''' SELECT isbn FROM authors;'''
    
        con = sqlite3.connect("DB_NAME")
        cur = con.cursor()
    
        con.execute(sql_query_1)
        con.commit()
    
        con.executemany(sql_query_2, isbn_data_list)
        con.commit()
    
        cur.execute(sql_query_3)
        results = cur.fetchall()
    
        cur.close()
        con.close()  
    
    
        # test code
        expected_results = [('000-1-000000-00-1',), ('000-2-000000-00-2',), ('000-3-000000-00-3',), ('000-4-000000-00-4',), ('000-5-000000-00-5',)]
    
        assert results == expected_results
    
    # using 'isbn.xml'
    # loop through "book" in file and append isbn and title as a list object to isbn_data_list
    # send isbn_data_list to function update_db
    
    file_name = "isbn.xml"
    
    try:
        tree = ET.parse(file_name)
    except:
        print("File not found")
        sys.exit(1)
    
    isbn_data_list = []
    for book in tree.findall('book'):
        title = book.findtext('title')
        isbn = book.findtext('isbn')
        isbn_data_list.append([isbn, title])
    
    update_db(isbn_data_list)
    

    Now run python parse_catalog.py.

    Awesome! You have mastered XML file parsing.

About the author

Pluralsight Skills gives leaders confidence they have the skills needed to execute technology strategy. Technology teams can benchmark expertise across roles, speed up release cycles and build reliable, secure products. By leveraging our expert content, skill assessments and one-of-a-kind analytics, keep up with the pace of change, put the right people on the right projects and boost productivity. It's the most effective path to developing tech skills at scale.

Real skill practice before real-world application

Hands-on Labs are real environments created by industry experts to help you learn. These environments help you gain knowledge and experience, practice without compromising your system, test without risk, destroy without fear, and let you learn from your mistakes. Hands-on Labs: practice your skills before delivering in the real world.

Learn by doing

Engage hands-on with the tools and technologies you’re learning. You pick the skill, we provide the credentials and environment.

Follow your guide

All labs have detailed instructions and objectives, guiding you through the learning process and ensuring you understand every step.

Turn time into mastery

On average, you retain 75% more of your learning if you take time to practice. Hands-on labs set you up for success to make those skills stick.

Get started with Pluralsight