3.3.1 Lists

Lists are normalized slightly more than the rest of the document. Like sections, they are allowed to contain only a minimal set of child node types. In fact, a list can contain only one type of child node: the list item. As a consequence, any content that appears before the first item in a list is discarded. List items, in turn, contain only paragraph nodes. Every list therefore has the structure shown in Figure 3.4.

\includegraphics[width=3in]{liststruct}
Figure 3.4: Normalized structure of all lists

This structure allows you to easily traverse a list with code like the following.

# Iterate through the items in the list node
for item in listnode:

    # Iterate through the paragraphs in each item
    for par in item:

        # Print the text content of each paragraph
        print(par.textContent)

    # Print a blank line to separate each item
    print()
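
To try the traversal without processing a real document, the normalized list, item, paragraph structure can be imitated with a few stand-in classes. This is only a minimal sketch: Paragraph, Item, ListNode, and list_text are hypothetical names invented for illustration and are not the node classes of the actual document object. The only things assumed from the text above are that a list iterates over its items, an item iterates over its paragraphs, and each paragraph exposes textContent.

```python
# Hypothetical stand-ins for the normalized list structure.
# These are NOT the real node classes; they only mimic the shape
# list -> list item -> paragraph described above.

class Paragraph:
    def __init__(self, text):
        # Mirrors the textContent attribute used in the traversal
        self.textContent = text


class Item(list):
    """A list item: contains only Paragraph objects."""


class ListNode(list):
    """A list: contains only Item objects."""


def list_text(listnode):
    """Return one string per item, with its paragraphs joined by newlines."""
    return ["\n".join(par.textContent for par in item) for item in listnode]


# Build a two-item list; the second item holds two paragraphs
listnode = ListNode([
    Item([Paragraph("First item.")]),
    Item([Paragraph("Second item, first paragraph."),
          Paragraph("Second item, second paragraph.")]),
])

# Same nested traversal as above, collected by the helper
for text in list_text(listnode):
    print(text)
    print()
```

Because every list is guaranteed to have exactly these two levels, the helper needs no type checks or special cases for stray content between items.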