Friday, September 9, 2011

lxml

Some of the best documentation on using lxml in Python is located here:

infohost.nmt.edu/tcc/help/pubs/pylxml/pylxml.pdf

One interesting tidbit:

In the DOM, trees are built out of nodes represented as Node instances. Some nodes are Element instances, representing whole elements. Each Element has an assortment of child nodes of various types: Element nodes for its element children; Attribute nodes for its attributes; and Text nodes for textual content.

The lxml view of an XML document, by contrast, builds a tree of only one node type: the Element.
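
Here is a small sketch of what that looks like in practice. The sample XML and variable names are made up for illustration, and the fallback import assumes the standard library's ElementTree behaves the same way for these basic calls:

```python
try:
    from lxml import etree
except ImportError:
    # Assumption: the stdlib ElementTree shares this part of the API.
    import xml.etree.ElementTree as etree

# Hypothetical sample document, not taken from the tutorial.
root = etree.fromstring(
    '<book id="42"><title>Dune</title><author>Herbert</author></book>'
)

# Attributes are not separate nodes; they are looked up on the element itself.
book_id = root.get("id")

# Children are Elements too; iterating an element yields its element children.
child_tags = [child.tag for child in root]

# Text content is a plain string attribute on the element, not a Text node.
title_text = root[0].text

print(book_id, child_tags, title_text)
```

Everything hangs off the Element: attributes, children, and text are all reached through the same object.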

The most unusual departure is how the text following an element is handled. In the DOM model, any text following an element E is associated with the parent of E; in lxml, that text is considered the “tail” of E.
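
To see the tail in action, here is a minimal sketch using mixed content. The sample markup and the `flatten` helper are my own illustration, not something from the tutorial, and the fallback import assumes the standard library's ElementTree handles `.text` and `.tail` the same way:

```python
try:
    from lxml import etree
except ImportError:
    # Assumption: the stdlib ElementTree exposes the same .text/.tail attributes.
    import xml.etree.ElementTree as etree

# Hypothetical sample: a paragraph with text before and after a child element.
p = etree.fromstring("<p>Hello <b>bold</b> world</p>")
b = p[0]

p_text = p.text   # text before the first child belongs to the parent
b_text = b.text   # text inside <b>
b_tail = b.tail   # text after </b> is stored on <b>, not on <p>


def flatten(elem):
    """Rebuild an element's full text from .text and the children's .tail."""
    parts = [elem.text or ""]
    for child in elem:
        parts.append(flatten(child))
        parts.append(child.tail or "")
    return "".join(parts)


print(repr(p_text), repr(b_text), repr(b_tail))
print(flatten(p))
```

The practical consequence is that reading only `p.text` silently drops everything after the first child, so any code that collects text has to walk the tails as well.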
