author | cinap_lenrek <cinap_lenrek@localhost> | 2011-05-03 11:25:13 +0000 |
---|---|---|
committer | cinap_lenrek <cinap_lenrek@localhost> | 2011-05-03 11:25:13 +0000 |
commit | 458120dd40db6b4df55a4e96b650e16798ef06a0 (patch) | |
tree | 8f82685be24fef97e715c6f5ca4c68d34d5074ee /sys/src/cmd/python/Doc/lib/xmldom.tex | |
parent | 3a742c699f6806c1145aea5149bf15de15a0afd7 (diff) |
add hg and python
Diffstat (limited to 'sys/src/cmd/python/Doc/lib/xmldom.tex')
-rw-r--r-- | sys/src/cmd/python/Doc/lib/xmldom.tex | 928 |
1 files changed, 928 insertions, 0 deletions
diff --git a/sys/src/cmd/python/Doc/lib/xmldom.tex b/sys/src/cmd/python/Doc/lib/xmldom.tex new file mode 100644 index 000000000..d651bf089 --- /dev/null +++ b/sys/src/cmd/python/Doc/lib/xmldom.tex @@ -0,0 +1,928 @@ +\section{\module{xml.dom} --- + The Document Object Model API} + +\declaremodule{standard}{xml.dom} +\modulesynopsis{Document Object Model API for Python.} +\sectionauthor{Paul Prescod}{paul@prescod.net} +\sectionauthor{Martin v. L\"owis}{martin@v.loewis.de} + +\versionadded{2.0} + +The Document Object Model, or ``DOM,'' is a cross-language API from +the World Wide Web Consortium (W3C) for accessing and modifying XML +documents. A DOM implementation presents an XML document as a tree +structure, or allows client code to build such a structure from +scratch. It then gives access to the structure through a set of +objects which provide well-known interfaces. + +The DOM is extremely useful for random-access applications. SAX only +allows you a view of one bit of the document at a time. If you are +looking at one SAX element, you have no access to another. If you are +looking at a text node, you have no access to a containing element. +When you write a SAX application, you need to keep track of your +program's position in the document somewhere in your own code. SAX +does not do it for you. Also, if you need to look ahead in the XML +document, you are just out of luck. + +Some applications are simply impossible in an event-driven model with +no access to a tree. Of course you could build some sort of tree +yourself in SAX events, but the DOM allows you to avoid writing that +code. The DOM is a standard tree representation for XML data. + +%What if your needs are somewhere between SAX and the DOM? Perhaps +%you cannot afford to load the entire tree in memory but you find the +%SAX model somewhat cumbersome and low-level. 
There is also a module +%called xml.dom.pulldom that allows you to build trees of only the +%parts of a document that you need structured access to. It also has +%features that allow you to find your way around the DOM. +% See http://www.prescod.net/python/pulldom + +The Document Object Model is being defined by the W3C in stages, or +``levels'' in their terminology. The Python mapping of the API is +substantially based on the DOM Level~2 recommendation. The mapping of +the Level~3 specification, currently only available in draft form, is +being developed by the \ulink{Python XML Special Interest +Group}{http://www.python.org/sigs/xml-sig/} as part of the +\ulink{PyXML package}{http://pyxml.sourceforge.net/}. Refer to the +documentation bundled with that package for information on the current +state of DOM Level~3 support. + +DOM applications typically start by parsing some XML into a DOM. How +this is accomplished is not covered at all by DOM Level~1, and Level~2 +provides only limited improvements: There is a +\class{DOMImplementation} object class which provides access to +\class{Document} creation methods, but no way to access an XML +reader/parser/Document builder in an implementation-independent way. +There is also no well-defined way to access these methods without an +existing \class{Document} object. In Python, each DOM implementation +will provide a function \function{getDOMImplementation()}. DOM Level~3 +adds a Load/Store specification, which defines an interface to the +reader, but this is not yet available in the Python standard library. + +Once you have a DOM document object, you can access the parts of your +XML document through its properties and methods. These properties are +defined in the DOM specification; this portion of the reference manual +describes the interpretation of the specification in Python. + +The specification provided by the W3C defines the DOM API for Java, +ECMAScript, and OMG IDL. 
The Python mapping defined here is based in +large part on the IDL version of the specification, but strict +compliance is not required (though implementations are free to support +the strict mapping from IDL). See section \ref{dom-conformance}, +``Conformance,'' for a detailed discussion of mapping requirements. + + +\begin{seealso} + \seetitle[http://www.w3.org/TR/DOM-Level-2-Core/]{Document Object + Model (DOM) Level~2 Specification} + {The W3C recommendation upon which the Python DOM API is + based.} + \seetitle[http://www.w3.org/TR/REC-DOM-Level-1/]{Document Object + Model (DOM) Level~1 Specification} + {The W3C recommendation for the + DOM supported by \module{xml.dom.minidom}.} + \seetitle[http://pyxml.sourceforge.net]{PyXML}{Users that require a + full-featured implementation of DOM should use the PyXML + package.} + \seetitle[http://www.omg.org/docs/formal/02-11-05.pdf]{Python + Language Mapping Specification} + {This specifies the mapping from OMG IDL to Python.} +\end{seealso} + +\subsection{Module Contents} + +The \module{xml.dom} contains the following functions: + +\begin{funcdesc}{registerDOMImplementation}{name, factory} +Register the \var{factory} function with the name \var{name}. The +factory function should return an object which implements the +\class{DOMImplementation} interface. The factory function can return +the same object every time, or a new one for each call, as appropriate +for the specific implementation (e.g. if that implementation supports +some customization). +\end{funcdesc} + +\begin{funcdesc}{getDOMImplementation}{\optional{name\optional{, features}}} +Return a suitable DOM implementation. The \var{name} is either +well-known, the module name of a DOM implementation, or +\code{None}. If it is not \code{None}, imports the corresponding +module and returns a \class{DOMImplementation} object if the import +succeeds. 
If no name is given, and if the environment variable +\envvar{PYTHON_DOM} is set, this variable is used to find the +implementation. + +If name is not given, this examines the available implementations to +find one with the required feature set. If no implementation can be +found, raise an \exception{ImportError}. The features list must be a +sequence of \code{(\var{feature}, \var{version})} pairs which are +passed to the \method{hasFeature()} method on available +\class{DOMImplementation} objects. +\end{funcdesc} + + +Some convenience constants are also provided: + +\begin{datadesc}{EMPTY_NAMESPACE} + The value used to indicate that no namespace is associated with a + node in the DOM. This is typically found as the + \member{namespaceURI} of a node, or used as the \var{namespaceURI} + parameter to a namespaces-specific method. + \versionadded{2.2} +\end{datadesc} + +\begin{datadesc}{XML_NAMESPACE} + The namespace URI associated with the reserved prefix \code{xml}, as + defined by + \citetitle[http://www.w3.org/TR/REC-xml-names/]{Namespaces in XML} + (section~4). + \versionadded{2.2} +\end{datadesc} + +\begin{datadesc}{XMLNS_NAMESPACE} + The namespace URI for namespace declarations, as defined by + \citetitle[http://www.w3.org/TR/DOM-Level-2-Core/core.html]{Document + Object Model (DOM) Level~2 Core Specification} (section~1.1.8). + \versionadded{2.2} +\end{datadesc} + +\begin{datadesc}{XHTML_NAMESPACE} + The URI of the XHTML namespace as defined by + \citetitle[http://www.w3.org/TR/xhtml1/]{XHTML 1.0: The Extensible + HyperText Markup Language} (section~3.1.1). + \versionadded{2.2} +\end{datadesc} + + +% Should the Node documentation go here? + +In addition, \module{xml.dom} contains a base \class{Node} class and +the DOM exception classes. The \class{Node} class provided by this +module does not implement any of the methods or attributes defined by +the DOM specification; concrete DOM implementations must provide +those. 
The \class{Node} class provided as part of this module does +provide the constants used for the \member{nodeType} attribute on +concrete \class{Node} objects; they are located within the class +rather than at the module level to conform with the DOM +specifications. + + +\subsection{Objects in the DOM \label{dom-objects}} + +The definitive documentation for the DOM is the DOM specification from +the W3C. + +Note that DOM attributes may also be manipulated as nodes instead of +as simple strings. It is fairly rare that you must do this, however, +so this usage is not yet documented. + + +\begin{tableiii}{l|l|l}{class}{Interface}{Section}{Purpose} + \lineiii{DOMImplementation}{\ref{dom-implementation-objects}} + {Interface to the underlying implementation.} + \lineiii{Node}{\ref{dom-node-objects}} + {Base interface for most objects in a document.} + \lineiii{NodeList}{\ref{dom-nodelist-objects}} + {Interface for a sequence of nodes.} + \lineiii{DocumentType}{\ref{dom-documenttype-objects}} + {Information about the declarations needed to process a document.} + \lineiii{Document}{\ref{dom-document-objects}} + {Object which represents an entire document.} + \lineiii{Element}{\ref{dom-element-objects}} + {Element nodes in the document hierarchy.} + \lineiii{Attr}{\ref{dom-attr-objects}} + {Attribute value nodes on element nodes.} + \lineiii{Comment}{\ref{dom-comment-objects}} + {Representation of comments in the source document.} + \lineiii{Text}{\ref{dom-text-objects}} + {Nodes containing textual content from the document.} + \lineiii{ProcessingInstruction}{\ref{dom-pi-objects}} + {Processing instruction representation.} +\end{tableiii} + +An additional section describes the exceptions defined for working +with the DOM in Python. + + +\subsubsection{DOMImplementation Objects + \label{dom-implementation-objects}} + +The \class{DOMImplementation} interface provides a way for +applications to determine the availability of particular features in +the DOM they are using. 
DOM Level~2 added the ability to create new +\class{Document} and \class{DocumentType} objects using the +\class{DOMImplementation} as well. + +\begin{methoddesc}[DOMImplementation]{hasFeature}{feature, version} +Return true if the feature identified by the pair of strings +\var{feature} and \var{version} is implemented. +\end{methoddesc} + +\begin{methoddesc}[DOMImplementation]{createDocument}{namespaceUri, qualifiedName, doctype} +Return a new \class{Document} object (the root of the DOM), with a +child \class{Element} object having the given \var{namespaceUri} and +\var{qualifiedName}. The \var{doctype} must be a \class{DocumentType} +object created by \method{createDocumentType()}, or \code{None}. +In the Python DOM API, the first two arguments can also be \code{None} +in order to indicate that no \class{Element} child is to be created. +\end{methoddesc} + +\begin{methoddesc}[DOMImplementation]{createDocumentType}{qualifiedName, publicId, systemId} +Return a new \class{DocumentType} object that encapsulates the given +\var{qualifiedName}, \var{publicId}, and \var{systemId} strings, +representing the information contained in an XML document type +declaration. +\end{methoddesc} + + +\subsubsection{Node Objects \label{dom-node-objects}} + +All of the components of an XML document are subclasses of +\class{Node}. + +\begin{memberdesc}[Node]{nodeType} +An integer representing the node type. Symbolic constants for the +types are on the \class{Node} object: +\constant{ELEMENT_NODE}, \constant{ATTRIBUTE_NODE}, +\constant{TEXT_NODE}, \constant{CDATA_SECTION_NODE}, +\constant{ENTITY_NODE}, \constant{PROCESSING_INSTRUCTION_NODE}, +\constant{COMMENT_NODE}, \constant{DOCUMENT_NODE}, +\constant{DOCUMENT_TYPE_NODE}, \constant{NOTATION_NODE}. +This is a read-only attribute. +\end{memberdesc} + +\begin{memberdesc}[Node]{parentNode} +The parent of the current node, or \code{None} for the document node. +The value is always a \class{Node} object or \code{None}. 
For +\class{Element} nodes, this will be the parent element, except for the +root element, in which case it will be the \class{Document} object. +For \class{Attr} nodes, this is always \code{None}. +This is a read-only attribute. +\end{memberdesc} + +\begin{memberdesc}[Node]{attributes} +A \class{NamedNodeMap} of attribute objects. Only elements have +actual values for this; others provide \code{None} for this attribute. +This is a read-only attribute. +\end{memberdesc} + +\begin{memberdesc}[Node]{previousSibling} +The node that immediately precedes this one with the same parent. For +instance the element with an end-tag that comes just before the +\var{self} element's start-tag. Of course, XML documents are made +up of more than just elements so the previous sibling could be text, a +comment, or something else. If this node is the first child of the +parent, this attribute will be \code{None}. +This is a read-only attribute. +\end{memberdesc} + +\begin{memberdesc}[Node]{nextSibling} +The node that immediately follows this one with the same parent. See +also \member{previousSibling}. If this is the last child of the +parent, this attribute will be \code{None}. +This is a read-only attribute. +\end{memberdesc} + +\begin{memberdesc}[Node]{childNodes} +A list of nodes contained within this node. +This is a read-only attribute. +\end{memberdesc} + +\begin{memberdesc}[Node]{firstChild} +The first child of the node, if there are any, or \code{None}. +This is a read-only attribute. +\end{memberdesc} + +\begin{memberdesc}[Node]{lastChild} +The last child of the node, if there are any, or \code{None}. +This is a read-only attribute. +\end{memberdesc} + +\begin{memberdesc}[Node]{localName} +The part of the \member{tagName} following the colon if there is one, +else the entire \member{tagName}. The value is a string. +\end{memberdesc} + +\begin{memberdesc}[Node]{prefix} +The part of the \member{tagName} preceding the colon if there is one, +else the empty string. 
The value is a string, or \code{None} +\end{memberdesc} + +\begin{memberdesc}[Node]{namespaceURI} +The namespace associated with the element name. This will be a +string or \code{None}. This is a read-only attribute. +\end{memberdesc} + +\begin{memberdesc}[Node]{nodeName} +This has a different meaning for each node type; see the DOM +specification for details. You can always get the information you +would get here from another property such as the \member{tagName} +property for elements or the \member{name} property for attributes. +For all node types, the value of this attribute will be either a +string or \code{None}. This is a read-only attribute. +\end{memberdesc} + +\begin{memberdesc}[Node]{nodeValue} +This has a different meaning for each node type; see the DOM +specification for details. The situation is similar to that with +\member{nodeName}. The value is a string or \code{None}. +\end{memberdesc} + +\begin{methoddesc}[Node]{hasAttributes}{} +Returns true if the node has any attributes. +\end{methoddesc} + +\begin{methoddesc}[Node]{hasChildNodes}{} +Returns true if the node has any child nodes. +\end{methoddesc} + +\begin{methoddesc}[Node]{isSameNode}{other} +Returns true if \var{other} refers to the same node as this node. +This is especially useful for DOM implementations which use any sort +of proxy architecture (because more than one object can refer to the +same node). + +\begin{notice} + This is based on a proposed DOM Level~3 API which is still in the + ``working draft'' stage, but this particular interface appears + uncontroversial. Changes from the W3C will not necessarily affect + this method in the Python DOM interface (though any new W3C API for + this would also be supported). +\end{notice} +\end{methoddesc} + +\begin{methoddesc}[Node]{appendChild}{newChild} +Add a new child node to this node at the end of the list of children, +returning \var{newChild}. 
+\end{methoddesc} + +\begin{methoddesc}[Node]{insertBefore}{newChild, refChild} +Insert a new child node before an existing child. It must be the case +that \var{refChild} is a child of this node; if not, +\exception{ValueError} is raised. \var{newChild} is returned. If +\var{refChild} is \code{None}, it inserts \var{newChild} at the end of +the children's list. +\end{methoddesc} + +\begin{methoddesc}[Node]{removeChild}{oldChild} +Remove a child node. \var{oldChild} must be a child of this node; if +not, \exception{ValueError} is raised. \var{oldChild} is returned on +success. If \var{oldChild} will not be used further, its +\method{unlink()} method should be called. +\end{methoddesc} + +\begin{methoddesc}[Node]{replaceChild}{newChild, oldChild} +Replace an existing node with a new node. It must be the case that +\var{oldChild} is a child of this node; if not, +\exception{ValueError} is raised. +\end{methoddesc} + +\begin{methoddesc}[Node]{normalize}{} +Join adjacent text nodes so that all stretches of text are stored as +single \class{Text} instances. This simplifies processing text from a +DOM tree for many applications. +\versionadded{2.1} +\end{methoddesc} + +\begin{methoddesc}[Node]{cloneNode}{deep} +Clone this node. Setting \var{deep} means to clone all child nodes as +well. This returns the clone. +\end{methoddesc} + + +\subsubsection{NodeList Objects \label{dom-nodelist-objects}} + +A \class{NodeList} represents a sequence of nodes. These objects are +used in two ways in the DOM Core recommendation: an +\class{Element} object provides one as its list of child nodes, and +the \method{getElementsByTagName()} and +\method{getElementsByTagNameNS()} methods of \class{Node} return +objects with this interface to represent query results. + +The DOM Level~2 recommendation defines one method and one attribute +for these objects: + +\begin{methoddesc}[NodeList]{item}{i} + Return the \var{i}'th item from the sequence, if there is one, or + \code{None}. 
The index \var{i} is not allowed to be less than zero + or greater than or equal to the length of the sequence. +\end{methoddesc} + +\begin{memberdesc}[NodeList]{length} + The number of nodes in the sequence. +\end{memberdesc} + +In addition, the Python DOM interface requires that some additional +support is provided to allow \class{NodeList} objects to be used as +Python sequences. All \class{NodeList} implementations must include +support for \method{__len__()} and \method{__getitem__()}; this allows +iteration over the \class{NodeList} in \keyword{for} statements and +proper support for the \function{len()} built-in function. + +If a DOM implementation supports modification of the document, the +\class{NodeList} implementation must also support the +\method{__setitem__()} and \method{__delitem__()} methods. + + +\subsubsection{DocumentType Objects \label{dom-documenttype-objects}} + +Information about the notations and entities declared by a document +(including the external subset if the parser uses it and can provide +the information) is available from a \class{DocumentType} object. The +\class{DocumentType} for a document is available from the +\class{Document} object's \member{doctype} attribute; if there is no +\code{DOCTYPE} declaration for the document, the document's +\member{doctype} attribute will be set to \code{None} instead of an +instance of this interface. + +\class{DocumentType} is a specialization of \class{Node}, and adds the +following attributes: + +\begin{memberdesc}[DocumentType]{publicId} + The public identifier for the external subset of the document type + definition. This will be a string or \code{None}. +\end{memberdesc} + +\begin{memberdesc}[DocumentType]{systemId} + The system identifier for the external subset of the document type + definition. This will be a URI as a string, or \code{None}. +\end{memberdesc} + +\begin{memberdesc}[DocumentType]{internalSubset} + A string giving the complete internal subset from the document. 
+ This does not include the brackets which enclose the subset. If the + document has no internal subset, this should be \code{None}. +\end{memberdesc} + +\begin{memberdesc}[DocumentType]{name} + The name of the root element as given in the \code{DOCTYPE} + declaration, if present. +\end{memberdesc} + +\begin{memberdesc}[DocumentType]{entities} + This is a \class{NamedNodeMap} giving the definitions of external + entities. For entity names defined more than once, only the first + definition is provided (others are ignored as required by the XML + recommendation). This may be \code{None} if the information is not + provided by the parser, or if no entities are defined. +\end{memberdesc} + +\begin{memberdesc}[DocumentType]{notations} + This is a \class{NamedNodeMap} giving the definitions of notations. + For notation names defined more than once, only the first definition + is provided (others are ignored as required by the XML + recommendation). This may be \code{None} if the information is not + provided by the parser, or if no notations are defined. +\end{memberdesc} + + +\subsubsection{Document Objects \label{dom-document-objects}} + +A \class{Document} represents an entire XML document, including its +constituent elements, attributes, processing instructions, comments, +etc. Remember that it inherits properties from \class{Node}. + +\begin{memberdesc}[Document]{documentElement} +The one and only root element of the document. +\end{memberdesc} + +\begin{methoddesc}[Document]{createElement}{tagName} +Create and return a new element node. The element is not inserted +into the document when it is created. You need to explicitly insert +it with one of the other methods such as \method{insertBefore()} or +\method{appendChild()}. +\end{methoddesc} + +\begin{methoddesc}[Document]{createElementNS}{namespaceURI, tagName} +Create and return a new element with a namespace. The +\var{tagName} may have a prefix. The element is not inserted into the +document when it is created. 
You need to explicitly insert it with +one of the other methods such as \method{insertBefore()} or +\method{appendChild()}. +\end{methoddesc} + +\begin{methoddesc}[Document]{createTextNode}{data} +Create and return a text node containing the data passed as a +parameter. As with the other creation methods, this one does not +insert the node into the tree. +\end{methoddesc} + +\begin{methoddesc}[Document]{createComment}{data} +Create and return a comment node containing the data passed as a +parameter. As with the other creation methods, this one does not +insert the node into the tree. +\end{methoddesc} + +\begin{methoddesc}[Document]{createProcessingInstruction}{target, data} +Create and return a processing instruction node containing the +\var{target} and \var{data} passed as parameters. As with the other +creation methods, this one does not insert the node into the tree. +\end{methoddesc} + +\begin{methoddesc}[Document]{createAttribute}{name} +Create and return an attribute node. This method does not associate +the attribute node with any particular element. You must use +\method{setAttributeNode()} on the appropriate \class{Element} object +to use the newly created attribute instance. +\end{methoddesc} + +\begin{methoddesc}[Document]{createAttributeNS}{namespaceURI, qualifiedName} +Create and return an attribute node with a namespace. The +\var{qualifiedName} may have a prefix. This method does not associate the +attribute node with any particular element. You must use +\method{setAttributeNode()} on the appropriate \class{Element} object +to use the newly created attribute instance. +\end{methoddesc} + +\begin{methoddesc}[Document]{getElementsByTagName}{tagName} +Search for all descendants (direct children, children's children, +etc.) with a particular element type name. +\end{methoddesc} + +\begin{methoddesc}[Document]{getElementsByTagNameNS}{namespaceURI, localName} +Search for all descendants (direct children, children's children, +etc.) 
with a particular namespace URI and localname. The localname is +the part of the qualified name after the prefix. +\end{methoddesc} + + +\subsubsection{Element Objects \label{dom-element-objects}} + +\class{Element} is a subclass of \class{Node}, so it inherits all the +attributes of that class. + +\begin{memberdesc}[Element]{tagName} +The element type name. In a namespace-using document it may have +colons in it. The value is a string. +\end{memberdesc} + +\begin{methoddesc}[Element]{getElementsByTagName}{tagName} +Same as the equivalent method in the \class{Document} class. +\end{methoddesc} + +\begin{methoddesc}[Element]{getElementsByTagNameNS}{namespaceURI, localName} +Same as the equivalent method in the \class{Document} class. +\end{methoddesc} + +\begin{methoddesc}[Element]{hasAttribute}{name} +Returns true if the element has an attribute named by \var{name}. +\end{methoddesc} + +\begin{methoddesc}[Element]{hasAttributeNS}{namespaceURI, localName} +Returns true if the element has an attribute named by +\var{namespaceURI} and \var{localName}. +\end{methoddesc} + +\begin{methoddesc}[Element]{getAttribute}{name} +Return the value of the attribute named by \var{name} as a +string. If no such attribute exists, an empty string is returned, +as if the attribute had no value. +\end{methoddesc} + +\begin{methoddesc}[Element]{getAttributeNode}{attrname} +Return the \class{Attr} node for the attribute named by +\var{attrname}. +\end{methoddesc} + +\begin{methoddesc}[Element]{getAttributeNS}{namespaceURI, localName} +Return the value of the attribute named by \var{namespaceURI} and +\var{localName} as a string. If no such attribute exists, an empty +string is returned, as if the attribute had no value. +\end{methoddesc} + +\begin{methoddesc}[Element]{getAttributeNodeNS}{namespaceURI, localName} +Return an attribute value as a node, given a \var{namespaceURI} and +\var{localName}. +\end{methoddesc} + +\begin{methoddesc}[Element]{removeAttribute}{name} +Remove an attribute by name. 
No exception is raised if there is no +matching attribute. +\end{methoddesc} + +\begin{methoddesc}[Element]{removeAttributeNode}{oldAttr} +Remove and return \var{oldAttr} from the attribute list, if present. +If \var{oldAttr} is not present, \exception{NotFoundErr} is raised. +\end{methoddesc} + +\begin{methoddesc}[Element]{removeAttributeNS}{namespaceURI, localName} +Remove an attribute by name. Note that it uses a localName, not a +qname. No exception is raised if there is no matching attribute. +\end{methoddesc} + +\begin{methoddesc}[Element]{setAttribute}{name, value} +Set an attribute value from a string. +\end{methoddesc} + +\begin{methoddesc}[Element]{setAttributeNode}{newAttr} +Add a new attribute node to the element, replacing an existing +attribute if necessary if the \member{name} attribute matches. If a +replacement occurs, the old attribute node will be returned. If +\var{newAttr} is already in use, \exception{InuseAttributeErr} will be +raised. +\end{methoddesc} + +\begin{methoddesc}[Element]{setAttributeNodeNS}{newAttr} +Add a new attribute node to the element, replacing an existing +attribute if necessary if the \member{namespaceURI} and +\member{localName} attributes match. If a replacement occurs, the old +attribute node will be returned. If \var{newAttr} is already in use, +\exception{InuseAttributeErr} will be raised. +\end{methoddesc} + +\begin{methoddesc}[Element]{setAttributeNS}{namespaceURI, qname, value} +Set an attribute value from a string, given a \var{namespaceURI} and a +\var{qname}. Note that a qname is the whole attribute name. This is +different than above. +\end{methoddesc} + + +\subsubsection{Attr Objects \label{dom-attr-objects}} + +\class{Attr} inherits from \class{Node}, so inherits all its +attributes. + +\begin{memberdesc}[Attr]{name} +The attribute name. In a namespace-using document it may have colons +in it. 
+\end{memberdesc} + +\begin{memberdesc}[Attr]{localName} +The part of the name following the colon if there is one, else the +entire name. This is a read-only attribute. +\end{memberdesc} + +\begin{memberdesc}[Attr]{prefix} +The part of the name preceding the colon if there is one, else the +empty string. +\end{memberdesc} + + +\subsubsection{NamedNodeMap Objects \label{dom-attributelist-objects}} + +\class{NamedNodeMap} does \emph{not} inherit from \class{Node}. + +\begin{memberdesc}[NamedNodeMap]{length} +The length of the attribute list. +\end{memberdesc} + +\begin{methoddesc}[NamedNodeMap]{item}{index} +Return an attribute with a particular index. The order you get the +attributes in is arbitrary but will be consistent for the life of a +DOM. Each item is an attribute node. Get its value with the +\member{value} attribute. +\end{methoddesc} + +There are also experimental methods that give this class more mapping +behavior. You can use them or you can use the standardized +\method{getAttribute*()} family of methods on the \class{Element} +objects. + + +\subsubsection{Comment Objects \label{dom-comment-objects}} + +\class{Comment} represents a comment in the XML document. It is a +subclass of \class{Node}, but cannot have child nodes. + +\begin{memberdesc}[Comment]{data} +The content of the comment as a string. The attribute contains all +characters between the leading \code{<!-}\code{-} and trailing +\code{-}\code{->}, but does not include them. +\end{memberdesc} + + +\subsubsection{Text and CDATASection Objects \label{dom-text-objects}} + +The \class{Text} interface represents text in the XML document. If +the parser and DOM implementation support the DOM's XML extension, +portions of the text enclosed in CDATA marked sections are stored in +\class{CDATASection} objects. These two interfaces are identical, but +provide different values for the \member{nodeType} attribute. + +These interfaces extend the \class{Node} interface. They cannot have +child nodes. 
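The difference between the two interfaces can be sketched as follows. This is only an illustration, and it assumes \module{xml.dom.minidom} as the concrete implementation. The node contents are made up for the example.

```python
from xml.dom import Node, minidom

# Assumption: minidom stands in for "a DOM implementation" here.
doc = minidom.Document()

text = doc.createTextNode("plain text")
cdata = doc.createCDATASection("a < b")

# Both nodes expose their content through the same attribute ...
contents = (text.data, cdata.data)

# ... and differ only in the value reported by nodeType.
is_text = text.nodeType == Node.TEXT_NODE
is_cdata = cdata.nodeType == Node.CDATA_SECTION_NODE

# Neither kind of node can have children.
childless = not text.hasChildNodes() and not cdata.hasChildNodes()
```

Code that only needs the character data can therefore treat both node types alike and read \member{data}, checking \member{nodeType} only when the distinction matters.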
+ +\begin{memberdesc}[Text]{data} +The content of the text node as a string. +\end{memberdesc} + +\begin{notice} + The use of a \class{CDATASection} node does not indicate that the + node represents a complete CDATA marked section, only that the + content of the node was part of a CDATA section. A single CDATA + section may be represented by more than one node in the document + tree. There is no way to determine whether two adjacent + \class{CDATASection} nodes represent different CDATA marked + sections. +\end{notice} + + +\subsubsection{ProcessingInstruction Objects \label{dom-pi-objects}} + +Represents a processing instruction in the XML document; this inherits +from the \class{Node} interface and cannot have child nodes. + +\begin{memberdesc}[ProcessingInstruction]{target} +The content of the processing instruction up to the first whitespace +character. This is a read-only attribute. +\end{memberdesc} + +\begin{memberdesc}[ProcessingInstruction]{data} +The content of the processing instruction following the first +whitespace character. +\end{memberdesc} + + +\subsubsection{Exceptions \label{dom-exceptions}} + +\versionadded{2.1} + +The DOM Level~2 recommendation defines a single exception, +\exception{DOMException}, and a number of constants that allow +applications to determine what sort of error occurred. +\exception{DOMException} instances carry a \member{code} attribute +that provides the appropriate value for the specific exception. + +The Python DOM interface provides the constants, but also expands the +set of exceptions so that a specific exception exists for each of the +exception codes defined by the DOM. The implementations must raise +the appropriate specific exception, each of which carries the +appropriate value for the \member{code} attribute. + +\begin{excdesc}{DOMException} + Base exception class used for all specific DOM exceptions. This + exception class cannot be directly instantiated. 
+\end{excdesc} + +\begin{excdesc}{DomstringSizeErr} + Raised when a specified range of text does not fit into a string. + This is not known to be used in the Python DOM implementations, but + may be received from DOM implementations not written in Python. +\end{excdesc} + +\begin{excdesc}{HierarchyRequestErr} + Raised when an attempt is made to insert a node where the node type + is not allowed. +\end{excdesc} + +\begin{excdesc}{IndexSizeErr} + Raised when an index or size parameter to a method is negative or + exceeds the allowed values. +\end{excdesc} + +\begin{excdesc}{InuseAttributeErr} + Raised when an attempt is made to insert an \class{Attr} node that + is already present elsewhere in the document. +\end{excdesc} + +\begin{excdesc}{InvalidAccessErr} + Raised if a parameter or an operation is not supported on the + underlying object. +\end{excdesc} + +\begin{excdesc}{InvalidCharacterErr} + This exception is raised when a string parameter contains a + character that is not permitted in the context it's being used in by + the XML 1.0 recommendation. For example, attempting to create an + \class{Element} node with a space in the element type name will + cause this error to be raised. +\end{excdesc} + +\begin{excdesc}{InvalidModificationErr} + Raised when an attempt is made to modify the type of a node. +\end{excdesc} + +\begin{excdesc}{InvalidStateErr} + Raised when an attempt is made to use an object that is not defined or is no + longer usable. +\end{excdesc} + +\begin{excdesc}{NamespaceErr} + If an attempt is made to change any object in a way that is not + permitted with regard to the + \citetitle[http://www.w3.org/TR/REC-xml-names/]{Namespaces in XML} + recommendation, this exception is raised. +\end{excdesc} + +\begin{excdesc}{NotFoundErr} + Exception when a node does not exist in the referenced context. For + example, \method{NamedNodeMap.removeNamedItem()} will raise this if + the node passed in does not exist in the map. 
+\end{excdesc}
+
+\begin{excdesc}{NotSupportedErr}
+  Raised when the implementation does not support the requested type
+  of object or operation.
+\end{excdesc}
+
+\begin{excdesc}{NoDataAllowedErr}
+  Raised if data is specified for a node that does not support data.
+  % XXX a better explanation is needed!
+\end{excdesc}
+
+\begin{excdesc}{NoModificationAllowedErr}
+  Raised on attempts to modify an object where modifications are not
+  allowed (such as for read-only nodes).
+\end{excdesc}
+
+\begin{excdesc}{SyntaxErr}
+  Raised when an invalid or illegal string is specified.
+  % XXX how is this different from InvalidCharacterErr ???
+\end{excdesc}
+
+\begin{excdesc}{WrongDocumentErr}
+  Raised when a node is inserted into a document other than the one it
+  currently belongs to, and the implementation does not support
+  migrating the node from one document to the other.
+\end{excdesc}
+
+The exception codes defined in the DOM recommendation map to the
+exceptions described above according to this table:
+
+\begin{tableii}{l|l}{constant}{Constant}{Exception}
+  \lineii{DOMSTRING_SIZE_ERR}{\exception{DomstringSizeErr}}
+  \lineii{HIERARCHY_REQUEST_ERR}{\exception{HierarchyRequestErr}}
+  \lineii{INDEX_SIZE_ERR}{\exception{IndexSizeErr}}
+  \lineii{INUSE_ATTRIBUTE_ERR}{\exception{InuseAttributeErr}}
+  \lineii{INVALID_ACCESS_ERR}{\exception{InvalidAccessErr}}
+  \lineii{INVALID_CHARACTER_ERR}{\exception{InvalidCharacterErr}}
+  \lineii{INVALID_MODIFICATION_ERR}{\exception{InvalidModificationErr}}
+  \lineii{INVALID_STATE_ERR}{\exception{InvalidStateErr}}
+  \lineii{NAMESPACE_ERR}{\exception{NamespaceErr}}
+  \lineii{NOT_FOUND_ERR}{\exception{NotFoundErr}}
+  \lineii{NOT_SUPPORTED_ERR}{\exception{NotSupportedErr}}
+  \lineii{NO_DATA_ALLOWED_ERR}{\exception{NoDataAllowedErr}}
+  \lineii{NO_MODIFICATION_ALLOWED_ERR}{\exception{NoModificationAllowedErr}}
+  \lineii{SYNTAX_ERR}{\exception{SyntaxErr}}
+  \lineii{WRONG_DOCUMENT_ERR}{\exception{WrongDocumentErr}}
+\end{tableii}
+
+
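As a minimal sketch of the mapping in the table above, assuming a
current Python~3 interpreter and the \module{xml.dom.minidom}
implementation, the following catches the base
\exception{DOMException} and inspects the specific subclass and its
\member{code}. The helper name \function{describe_dom_error} is
hypothetical and used only for illustration.

```python
import xml.dom
from xml.dom import minidom


def describe_dom_error(action):
    """Run action() and return (exception class name, code), or None.

    Hypothetical helper: it is not part of xml.dom.
    """
    try:
        action()
    except xml.dom.DOMException as exc:
        # Every specific exception derives from DOMException and
        # carries the matching constant in its 'code' attribute.
        return type(exc).__name__, exc.code
    return None


doc = minidom.parseString("<root a='1'/>")

# Removing an attribute that is not in the map raises NotFoundErr.
not_found = describe_dom_error(
    lambda: doc.documentElement.attributes.removeNamedItem("missing"))

# minidom allows only one document element, so appending a second
# element to the Document raises HierarchyRequestErr.
hierarchy = describe_dom_error(
    lambda: doc.appendChild(doc.createElement("second")))

print(not_found, not_found[1] == xml.dom.NOT_FOUND_ERR)
print(hierarchy, hierarchy[1] == xml.dom.HIERARCHY_REQUEST_ERR)
```

Catching \exception{DOMException} handles every specific exception in
one place, while the \member{code} attribute still identifies which
error occurred. Which operations raise which exception may differ
between DOM implementations.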
+\subsection{Conformance \label{dom-conformance}}
+
+This section describes the conformance requirements and relationships
+between the Python DOM API, the W3C DOM recommendations, and the OMG
+IDL mapping for Python.
+
+
+\subsubsection{Type Mapping \label{dom-type-mapping}}
+
+The primitive IDL types used in the DOM specification are mapped to
+Python types according to the following table.
+
+\begin{tableii}{l|l}{code}{IDL Type}{Python Type}
+  \lineii{boolean}{\code{IntegerType} (with a value of \code{0} or \code{1})}
+  \lineii{int}{\code{IntegerType}}
+  \lineii{long int}{\code{IntegerType}}
+  \lineii{unsigned int}{\code{IntegerType}}
+\end{tableii}
+
+Additionally, the \class{DOMString} defined in the recommendation is
+mapped to a Python string or Unicode string. Applications should
+be able to handle Unicode whenever a string is returned from the DOM.
+
+The IDL \keyword{null} value is mapped to \code{None}, which may be
+accepted or provided by the implementation whenever \keyword{null} is
+allowed by the API.
+
+
+\subsubsection{Accessor Methods \label{dom-accessor-methods}}
+
+The mapping from OMG IDL to Python defines accessor functions for IDL
+\keyword{attribute} declarations in much the way the Java mapping
+does. Mapping the IDL declarations
+
+\begin{verbatim}
+readonly attribute string someValue;
+         attribute string anotherValue;
+\end{verbatim}
+
+yields three accessor functions: a ``get'' method for
+\member{someValue} (\method{_get_someValue()}), and ``get'' and
+``set'' methods for
+\member{anotherValue} (\method{_get_anotherValue()} and
+\method{_set_anotherValue()}). The mapping, in particular, does not
+require that the IDL attributes are accessible as normal Python
+attributes: \code{\var{object}.someValue} is \emph{not} required to
+work, and may raise an \exception{AttributeError}.
+
+The Python DOM API, however, \emph{does} require that normal attribute
+access work.
This means that the typical surrogates generated by
+Python IDL compilers are not likely to work, and wrapper objects may
+be needed on the client if the DOM objects are accessed via CORBA.
+While this does require some additional consideration for CORBA DOM
+clients, implementers with experience using DOM over CORBA from
+Python do not consider this a problem. Attributes that are declared
+\keyword{readonly} may not restrict write access in all DOM
+implementations.
+
+In the Python DOM API, accessor functions are not required. If provided,
+they should take the form defined by the Python IDL mapping, but
+these methods are considered unnecessary since the attributes are
+accessible directly from Python. ``Set'' accessors should never be
+provided for \keyword{readonly} attributes.
+
+The IDL definitions do not fully embody the requirements of the W3C DOM
+API; for example, they do not capture the notion that certain objects,
+such as the return value of \method{getElementsByTagName()}, are
+``live''. The Python DOM API does not require implementations to
+enforce such requirements.
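The attribute-access requirement can be illustrated with a short
sketch. It assumes a current Python~3 interpreter;
\class{AccessorOnlyNode} and \class{AttributeWrapper} are hypothetical
names, not part of any DOM or CORBA library. They stand in for a
surrogate that offers only IDL-style accessors and for the kind of
client-side wrapper mentioned above.

```python
from xml.dom import minidom


class AccessorOnlyNode:
    """Hypothetical stand-in for a surrogate exposing only accessors."""

    def __init__(self, name):
        self._name = name

    def _get_nodeName(self):
        # "Get" accessor in the form defined by the Python IDL mapping.
        return self._name


class AttributeWrapper:
    """Hypothetical wrapper mapping attribute access onto accessors."""

    def __init__(self, target):
        # Bypass our own __setattr__ while storing the wrapped object.
        object.__setattr__(self, "_target", target)

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails.
        getter = getattr(self._target, "_get_" + name, None)
        if getter is None:
            raise AttributeError(name)
        return getter()

    def __setattr__(self, name, value):
        # Without a "set" accessor the attribute is treated as read-only.
        setter = getattr(self._target, "_set_" + name, None)
        if setter is None:
            raise AttributeError("cannot set %r" % name)
        setter(value)


# minidom objects already support direct attribute access.
doc = minidom.parseString("<para>text</para>")
direct_name = doc.documentElement.nodeName
direct_data = doc.documentElement.firstChild.data

# The wrapper gives an accessor-only object the same attribute syntax.
wrapped = AttributeWrapper(AccessorOnlyNode("para"))
wrapped_name = wrapped.nodeName

try:
    wrapped.nodeName = "other"
    write_rejected = False
except AttributeError:
    write_rejected = True

print(direct_name, direct_data, wrapped_name, write_rejected)
```

Because the wrapper rejects writes when no ``set'' accessor exists, a
\keyword{readonly} attribute stays read-only on the client side. This
is a design choice of the sketch, not a requirement of the Python DOM
API.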