
How to Update XML Elements and Attribute Values Using Python ElementTree (Step-by-Step Guide with CSV Test Data)

XML (eXtensible Markup Language) is widely used for storing and exchanging structured data, from configuration files to product catalogs. Often, you’ll need to update XML content, whether fixing element text (e.g., prices, stock levels) or modifying attributes (e.g., categories, timestamps). Python’s xml.etree.ElementTree (ElementTree) library simplifies XML parsing and manipulation with its lightweight, intuitive API.

In this guide, we’ll walk through step-by-step how to update XML elements and attributes using ElementTree. To make it practical, we’ll use CSV test data to bulk-update product information (e.g., prices and stock levels) in an XML catalog. By the end, you’ll be able to automate XML updates using Python, even with large datasets.

Table of Contents#

  1. Prerequisites
  2. Understanding the XML Structure
  3. Preparing CSV Test Data
  4. Step 1: Import Required Libraries
  5. Step 2: Parse the XML File
  6. Step 3: Update Element Text
  7. Step 4: Update Attribute Values
  8. Step 5: Merge CSV Data into XML (Bulk Update)
  9. Step 6: Write Updated XML Back to File
  10. Testing the Solution
  11. Troubleshooting Common Issues
  12. Conclusion
  13. References

Prerequisites#

Before starting, ensure you have:

  • Python 3.x installed (ElementTree is included in Python’s standard library, so no extra installations are needed).
  • Basic knowledge of:
    • XML structure (elements, attributes, nested tags).
    • CSV files (rows, columns).
    • Python fundamentals (loops, dictionaries, file handling).

Understanding the XML Structure#

We’ll work with a sample XML file (products.xml) representing a product catalog. Here’s its structure:

<!-- products.xml -->
<products>
    <product id="101" category="electronics">
        <name>Wireless Headphones</name>
        <price>199.99</price>
        <stock>50</stock>
        <last_updated>2023-01-15</last_updated>
    </product>
    <product id="102" category="clothing">
        <name>Cotton T-Shirt</name>
        <price>19.99</price>
        <stock>200</stock>
        <last_updated>2023-01-15</last_updated>
    </product>
    <product id="103" category="home">
        <name>Blender</name>
        <price>89.99</price>
        <stock>30</stock>
        <last_updated>2023-01-15</last_updated>
    </product>
</products>

Key components to note:

  • Root element: <products>.
  • Child elements: <product>, each with an id attribute (unique identifier) and category attribute.
  • Nested elements: <name>, <price>, <stock>, <last_updated> (these will be updated).

Preparing CSV Test Data#

We’ll use a CSV file (updates.csv) to store the changes we want to apply to the XML. Each row in the CSV specifies a product (via product_id) and new values for new_price and new_stock.

Here’s updates.csv:

product_id,new_price,new_stock
101,179.99,45
102,24.99,180
104,49.99,100

  • Goal: For each row in the CSV, find the XML product whose id attribute matches product_id and update its price and stock elements in products.xml. Product 104 has no match in products.xml, so it exercises the "not found" path.
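As a quick check of what the update script will see, the sketch below reads the same rows from an in-memory string. Using io.StringIO in place of the real file is an assumption made only to keep the example self-contained. Note that csv.DictReader yields every value as a string, which is convenient here because XML element text is also a string.

```python
import csv
import io

# In-memory copy of updates.csv (a stand-in for the real file)
CSV_TEXT = """product_id,new_price,new_stock
101,179.99,45
102,24.99,180
104,49.99,100
"""

# Each row becomes a dict keyed by the header names
rows = list(csv.DictReader(io.StringIO(CSV_TEXT)))

for row in rows:
    print(row["product_id"], row["new_price"], row["new_stock"])
```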

Step 1: Import Required Libraries#

We’ll use two built-in Python libraries:

  • xml.etree.ElementTree: To parse, manipulate, and write XML.
  • csv: To read the CSV update data.

Import them at the start of your script:

import xml.etree.ElementTree as ET
import csv
from datetime import datetime  # For updating the 'last_updated' timestamp

Step 2: Parse the XML File#

First, load and parse the XML file into an ElementTree object. This gives us a hierarchical tree structure to work with.

# Parse the XML file
tree = ET.parse('products.xml')
root = tree.getroot()  # Get the root element (<products>)
 
# Verify parsing by printing the root tag
print(f"Root element: {root.tag}")  # Output: Root element: products
  • ET.parse('products.xml'): Loads the XML file into an ElementTree object.
  • root = tree.getroot(): Accesses the root element (<products>), which is the starting point for navigating the XML.
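To get a feel for navigating from the root, the following sketch walks the direct children of <products> and collects each product's id and name. It parses a shortened copy of the catalog from a string with ET.fromstring (an assumption for self-containment; with a file you would keep using ET.parse as above).

```python
import xml.etree.ElementTree as ET

# Shortened copy of products.xml, inlined for the example
XML_TEXT = """<products>
    <product id="101" category="electronics">
        <name>Wireless Headphones</name>
        <price>199.99</price>
    </product>
    <product id="102" category="clothing">
        <name>Cotton T-Shirt</name>
        <price>19.99</price>
    </product>
</products>"""

root = ET.fromstring(XML_TEXT)  # returns the root Element directly

summary = []
for product in root:  # iterating an element yields its direct children
    summary.append((product.get("id"), product.findtext("name")))

print(summary)
```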

Step 3: Update Element Text#

Elements like <price> or <stock> have text content (e.g., 199.99). To update this, follow these steps:

Example: Update Price for a Specific Product#

Let’s manually update the price of the product with id="103" (Blender) from 89.99 to 79.99.

# Find the product with id="103"
product = root.find(".//product[@id='103']")  # XPath syntax to search for product by id
 
if product is not None:
    # Update the 'price' element's text (guard against a missing <price>)
    price_element = product.find('price')
    if price_element is not None:
        price_element.text = "79.99"  # New price; element text must be a string
        print(f"Updated price for product 103 to {price_element.text}")
else:
    print("Product 103 not found.")

Key Notes:#

  • root.find(".//product[@id='103']"): Uses ElementTree's limited XPath subset to find the first <product> element with id="103". The leading .// means "search all descendants" of the root; find() returns None when nothing matches.
  • product.find('price'): Finds the first <price> child element of the product.
  • price_element.text = "79.99": Updates the text content of the <price> element.
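Because element text is always a string, a computed update (rather than a fixed replacement) needs a conversion in both directions. Here is a minimal sketch, assuming a hypothetical apply_discount helper and a trimmed-down catalog. Decimal is used to avoid binary floating-point rounding on prices.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal, ROUND_HALF_UP

# Trimmed-down catalog, inlined for the example
root = ET.fromstring(
    "<products>"
    "<product id='101'><price>199.99</price></product>"
    "<product id='103'><price>89.99</price></product>"
    "</products>"
)

def apply_discount(root, percent):
    """Reduce every <price> by `percent` (hypothetical helper, not part of ElementTree)."""
    factor = (Decimal(100) - Decimal(percent)) / Decimal(100)
    for price in root.iter("price"):  # iter() visits all matching descendants
        discounted = Decimal(price.text) * factor
        rounded = discounted.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
        price.text = str(rounded)  # convert back: .text must be a string

apply_discount(root, 10)
prices = [p.text for p in root.iter("price")]
print(prices)
```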

Step 4: Update Attribute Values#

Attributes (e.g., category in <product id="101" category="electronics">) are stored in a dictionary-like attrib property of elements.

Example: Update Category Attribute#

Let’s update the category of product 102 (T-Shirt) from clothing to apparel.

# Find product with id="102"
product = root.find(".//product[@id='102']")
 
if product is not None:
    # Update the 'category' attribute
    product.attrib['category'] = 'apparel'
    print(f"Updated category for product 102 to {product.attrib['category']}")  # Prints: Updated category for product 102 to apparel
else:
    print("Product 102 not found.")
  • product.attrib: A dictionary containing all attributes of the product element (e.g., {'id': '102', 'category': 'clothing'}).
  • product.attrib['category'] = 'apparel': Directly modifies the category attribute.
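Elements also provide get() and set() methods for attributes, which some readers may find clearer than indexing attrib directly. The sketch below shows both; the currency and supplier attribute names are hypothetical and do not appear in the sample catalog.

```python
import xml.etree.ElementTree as ET

product = ET.fromstring('<product id="102" category="clothing"/>')

# set() overwrites an existing attribute, same effect as product.attrib['category'] = ...
product.set("category", "apparel")

# set() also creates an attribute that does not exist yet (hypothetical name)
product.set("currency", "USD")

category = product.get("category")

# get() accepts a default, so a missing attribute does not raise KeyError
missing = product.get("supplier", "unknown")

print(category, missing, product.attrib)
```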

Step 5: Merge CSV Data into XML (Bulk Update)#

Now, let’s automate updates using the updates.csv file. We’ll loop through each CSV row, find the corresponding product in XML, and update price, stock, and last_updated (with today’s date).

Full Code for CSV Merge#

def update_product_from_csv(xml_root, csv_file):
    # Read CSV and update XML
    with open(csv_file, mode='r', newline='') as file:  # newline='' is what the csv module recommends
        csv_reader = csv.DictReader(file)  # Read CSV as dictionaries (keys = headers)
        
        for row in csv_reader:
            product_id = row['product_id']
            new_price = row['new_price']
            new_stock = row['new_stock']
            
            # Find product in XML by id
            product = xml_root.find(f".//product[@id='{product_id}']")
            
            if product is not None:  # An Element's truth value depends on its children, so compare to None
                # Update price element
                price_elem = product.find('price')
                if price_elem is not None:
                    price_elem.text = new_price
                
                # Update stock element
                stock_elem = product.find('stock')
                if stock_elem is not None:
                    stock_elem.text = new_stock
                
                # Update 'last_updated' with today's date (YYYY-MM-DD)
                last_updated_elem = product.find('last_updated')
                if last_updated_elem is not None:
                    last_updated_elem.text = datetime.today().strftime('%Y-%m-%d')
                
                print(f"Updated product {product_id}: Price={new_price}, Stock={new_stock}")
            else:
                print(f"Warning: Product {product_id} not found in XML.")
 
# Run the function with our root and CSV file
update_product_from_csv(root, 'updates.csv')

How It Works:#

  1. Read CSV: csv.DictReader reads the CSV into rows of dictionaries (e.g., row['product_id'] gives the ID).
  2. Find Product: For each row, xml_root.find(f".//product[@id='{product_id}']") searches XML for the product with the matching id.
  3. Update Elements: If found, update price, stock, and last_updated (using datetime for the current date).
  4. Error Handling: Prints a warning if a product in CSV doesn’t exist in XML (e.g., product 104 in our CSV).
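The function above calls find() once per CSV row, and each call scans the tree. For larger catalogs, one possible variation is to build a dictionary from id to element once and look products up in it. The sketch below is an illustration under that assumption, not a replacement for the function above; build_index and apply_updates are hypothetical names, and the data is inlined so the example runs on its own.

```python
import csv
import io
import xml.etree.ElementTree as ET

XML_TEXT = """<products>
    <product id="101"><price>199.99</price><stock>50</stock></product>
    <product id="102"><price>19.99</price><stock>200</stock></product>
</products>"""

CSV_TEXT = """product_id,new_price,new_stock
101,179.99,45
102,24.99,180
104,49.99,100
"""

def build_index(root):
    """Map each product's id attribute to its element (one pass over the tree)."""
    return {product.get("id"): product for product in root.iter("product")}

def apply_updates(root, csv_stream):
    """Apply CSV rows to the tree; return the ids that had no matching product."""
    index = build_index(root)
    missing = []
    for row in csv.DictReader(csv_stream):
        product = index.get(row["product_id"])
        if product is None:
            missing.append(row["product_id"])
            continue
        # Pair each XML tag with the CSV column that holds its new value
        for tag, column in (("price", "new_price"), ("stock", "new_stock")):
            elem = product.find(tag)
            if elem is not None:
                elem.text = row[column]
    return missing

root = ET.fromstring(XML_TEXT)
missing_ids = apply_updates(root, io.StringIO(CSV_TEXT))
print(missing_ids)
```

Returning the unmatched ids instead of printing them lets the caller decide whether a missing product is a warning or an error.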

Step 6: Write Updated XML Back to File#

After making changes, save the modified XML tree back to a file. Use tree.write() for this:

# Write updated XML to a new file (or overwrite the original)
tree.write(
    'updated_products.xml',
    encoding='utf-8',
    xml_declaration=True  # Includes <?xml version='1.0' encoding='utf-8'?> at the top
)
print("Updated XML saved to 'updated_products.xml'")

Note: Formatting#

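ElementTree writes whitespace exactly as it is stored in the tree and never re-indents the output, so a tree that was built or extended in code can end up with everything on one line. To pretty-print the XML (for readability), use xml.dom.minidom as a bonus step: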

from xml.dom import minidom
 
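# Convert ElementTree to a pretty-printed string
rough_string = ET.tostring(root, 'utf-8')
reparsed = minidom.parseString(rough_string)
pretty_xml = reparsed.toprettyxml(indent="  ")

# The tree still holds the original indentation as whitespace text nodes,
# which toprettyxml() emits as blank lines; drop them
pretty_xml = "\n".join(line for line in pretty_xml.splitlines() if line.strip())

# Write the pretty XML to file
with open('updated_products_pretty.xml', 'w', encoding='utf-8') as f:
    f.write(pretty_xml)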
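On Python 3.9 and later, ET.indent() is an alternative to the minidom approach: it re-indents the tree in place, so no second parse is needed. A minimal sketch, assuming Python 3.9+ and a one-product tree inlined for the example:

```python
import xml.etree.ElementTree as ET

# Compact input with no whitespace between elements
root = ET.fromstring(
    "<products><product id='101'><price>179.99</price></product></products>"
)

# Requires Python 3.9+; rewrites whitespace in the tree itself
ET.indent(root, space="    ")

# encoding="unicode" returns a str instead of bytes
pretty = ET.tostring(root, encoding="unicode")
print(pretty)
```

After calling ET.indent(), tree.write() from Step 6 produces the indented layout as well.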

Testing the Solution#

Run the full script and check updated_products.xml to verify changes:

Expected Results:#

  • Product 101 (Wireless Headphones):
    • price updated from 199.99 to 179.99.
    • stock updated from 50 to 45.
    • last_updated set to today’s date.
  • Product 102 (Cotton T-Shirt):
    • price updated from 19.99 to 24.99.
    • stock updated from 200 to 180.
    • category updated from clothing to apparel.
    • last_updated set to today’s date.
  • Product 103 (Blender):
    • price manually updated to 79.99 (last_updated is unchanged, because 103 is not in the CSV).
  • Product 104: Warning printed ("Product 104 not found in XML").
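Rather than checking the file by eye, the expected results can be verified by re-parsing the saved XML and asserting on the values. The sketch below shows the idea for product 101 only. It writes to a temporary directory so it does not touch your real files, and the inlined one-product tree is an assumption for self-containment.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

# One-product tree standing in for the parsed catalog
root = ET.fromstring(
    "<products><product id='101' category='electronics'>"
    "<price>199.99</price><stock>50</stock>"
    "</product></products>"
)

# Apply the same changes the CSV row for product 101 would make
root.find("product/price").text = "179.99"
root.find("product/stock").text = "45"

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "updated_products.xml")
    ET.ElementTree(root).write(path, encoding="utf-8", xml_declaration=True)
    reloaded = ET.parse(path).getroot()  # read back what was actually saved

product = reloaded.find(".//product[@id='101']")
actual = {"price": product.findtext("price"), "stock": product.findtext("stock")}

assert actual == {"price": "179.99", "stock": "45"}, actual
print("Product 101 verified:", actual)
```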

Troubleshooting Common Issues#

| Issue | Solution |
| --- | --- |
| XML file not found | Ensure products.xml is in the same directory as the script, or use an absolute path (e.g., C:/data/products.xml). |
| Product not found in XML | Check that product_id in CSV matches the id attribute in XML exactly; the comparison is on strings, so 101 and 0101 do not match. |
| Element/attribute update not saving | Verify you’re calling tree.write() after modifying the tree, and that you’re writing to the correct file path. |
| XML formatting is messy | Use xml.dom.minidom (as shown in Step 6) to pretty-print the output. |
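Two of these issues can be handled defensively in code. The sketch below uses two hypothetical helpers: load_root catches a missing or malformed file, and normalize_id strips surrounding whitespace and leading zeros before comparing ids. Treating 0101 and 101 as the same product is an assumption; skip that step if leading zeros are significant in your data.

```python
import xml.etree.ElementTree as ET

def load_root(path):
    """Return the root element, or None if the file is missing or not well-formed."""
    try:
        return ET.parse(path).getroot()
    except FileNotFoundError:
        print(f"XML file not found: {path}")
    except ET.ParseError as exc:
        print(f"Could not parse {path}: {exc}")
    return None

def normalize_id(raw):
    """Strip whitespace and leading zeros (assumes '0101' and '101' mean the same product)."""
    return raw.strip().lstrip("0") or "0"

root = load_root("no_such_file_for_this_example.xml")
print(root)
print(normalize_id(" 0101 "))
```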

Conclusion#

In this guide, you learned how to:

  • Parse XML files with ElementTree.
  • Update element text (e.g., <price>) and attribute values (e.g., category).
  • Bulk-update XML using CSV data (e.g., updating prices/stock for multiple products).
  • Save changes back to an XML file (with optional pretty-printing).

ElementTree is a powerful tool for XML manipulation in Python, and combining it with CSV data makes bulk updates efficient and scalable. Try adapting this workflow to your own XML/CSV datasets!

References#