The Document Object Model (DOM) is a platform- and language-independent interface description for accessing the elements of HTML or XML documents. The DOM was defined by the World Wide Web Consortium (W3C) and published in three levels, each of which consists of several modules. A distinction is also made between the Legacy DOM, the W3C DOM and the IE 4 DOM, which the various browsers support to differing degrees.
A similarly abstract interface is the Simple API for XML (SAX), which can also be used to access XML documents. The two technologies can be combined: with SAX, for example, a document can first be pre-processed so that it can then be processed further with the DOM.
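One way to picture this combination is Python's standard-library module xml.dom.pulldom, which reads a document as a stream of SAX-style events and builds a DOM subtree only for the parts that are of interest. The following is a minimal sketch under that assumption; the sample catalog, the element names and the function expensive_titles are hypothetical and serve only as illustration.

```python
from xml.dom import pulldom

# Hypothetical sample document used only for this illustration.
XML = """<catalog>
  <book id="b1"><title>DOM Basics</title><price>12</price></book>
  <book id="b2"><title>SAX in Depth</title><price>30</price></book>
  <book id="b3"><title>XML Primer</title><price>25</price></book>
</catalog>"""


def expensive_titles(xml_text, min_price):
    """Stream through the document and inspect each <book> as a small DOM subtree."""
    titles = []
    events = pulldom.parseString(xml_text)
    for event, node in events:
        # Event-based phase: react only to the start of a <book> element.
        if event == pulldom.START_ELEMENT and node.tagName == "book":
            # DOM phase: expand just this element into a full subtree.
            events.expandNode(node)
            price = int(node.getElementsByTagName("price")[0].firstChild.data)
            if price >= min_price:
                titles.append(node.getElementsByTagName("title")[0].firstChild.data)
    return titles


selected = expensive_titles(XML, 20)
```

Only one book element at a time is held in memory as a DOM subtree, which is the typical motivation for putting an event-based step in front of DOM processing.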
Uniform interfaces for web documents
The starting point for the W3C's development of the DOM was the goal of providing uniform interfaces, that is, an Application Programming Interface (API), for access to web documents written in the Extensible Markup Language (XML) or the Hypertext Markup Language (HTML). This was prompted by the uncoordinated proprietary efforts of individual browser manufacturers, which had led to incompatibilities. To date, there are three versions of the DOM, referred to by the W3C as levels, each containing several modules:
Level 1: Defines the core of the DOM and supports XML 1.0 and HTML 4.0.
Level 2: Adds support for XML namespaces, events, style sheets, and traversal and ranges.
Level 3: Revision of exception handling, loading and saving of XML documents, support for XPath.
The result of these levels is a set of interfaces defined using the Interface Definition Language (IDL) developed by the Object Management Group (OMG), which has the advantage of being completely language-independent. To ease integration into a specific environment, the W3C additionally provides language bindings for Java and ECMAScript, for example. Bindings for other languages likewise specify how the DOM interfaces are mapped and implemented in a particular development environment.
Browsers such as Netscape Navigator, Mozilla Firefox and Opera support the DOM to varying degrees; others, such as Internet Explorer, extend it, for example with elements for modifying the content of documents.
Interfaces as information units
The interfaces of the DOM represent the information units of an HTML or XML document as a hierarchical tree of objects, also called nodes. The tree is held in memory so that these nodes can be queried and manipulated quickly and directly. For the objects in this tree, the DOM provides an abstract base interface with a defined set of attributes and methods. Starting from this general node object, the DOM defines further, more specific objects, including:
- Document: represents the entire document,
- NodeList: represents a list of nodes,
- Element: represents an element node,
- Attr: represents an attribute node and
- Text: represents a text node.
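The object types listed above can be made concrete with a short sketch. It assumes Python's standard-library module xml.dom.minidom as the DOM implementation; the sample document and the variable names are hypothetical.

```python
from xml.dom import minidom

# Hypothetical one-line document; parsing it yields a Document object.
doc = minidom.parseString('<library><book isbn="123">DOM Guide</book></library>')

books = doc.getElementsByTagName("book")  # NodeList: a list of nodes
book = books[0]                           # Element: an element node
isbn = book.getAttributeNode("isbn")      # Attr: an attribute node
text = book.firstChild                    # Text: a text node

summary = {
    "root": doc.documentElement.tagName,
    "count": books.length,
    "isbn": isbn.value,
    "text": text.data,
}
```

Each variable holds a different kind of object, yet all of them were reached by starting from the single Document object.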
The Node interface represents a single node in the tree. All node objects implement this interface, which specifies a general set of attributes, constants and methods for handling a node. This allows operations to be applied uniformly to the DOM tree regardless of the particular node type.
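Because every node offers the same basic attributes, a single routine can traverse the whole tree without knowing the node types in advance. The sketch below assumes Python's xml.dom.minidom; the helper walk, the table TYPE_NAMES and the sample document are hypothetical names chosen for illustration.

```python
from xml.dom import Node, minidom

# Map a few of the node type constants defined on Node to readable names.
TYPE_NAMES = {
    Node.DOCUMENT_NODE: "Document",
    Node.ELEMENT_NODE: "Element",
    Node.TEXT_NODE: "Text",
}


def walk(node, depth=0):
    """Depth-first traversal using only attributes that every node provides."""
    yield depth, TYPE_NAMES.get(node.nodeType, "Other"), node.nodeName
    for child in node.childNodes:
        yield from walk(child, depth + 1)


doc = minidom.parseString('<note lang="en"><to>Ann</to><body>Hi</body></note>')
rows = list(walk(doc))
```

The function relies only on nodeType, nodeName and childNodes. Attribute nodes do not appear in the result because they are not children of their element; they are reached through the element's attributes instead.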
Classic applications of the DOM in handling XML and HTML documents are:
- Reading a document for read and write operations or searching for specific content,
- Structuring and sorting,
- Creation of DOM trees by an application and transformation into an XML document,
- Separation of references between individual XML elements,
- Creation of modified DOM trees,
- Event handling: software components can be connected to the event mechanisms of DOM trees.
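Two of the applications above, sorting and the creation of a DOM tree by an application with subsequent transformation into an XML document, can be sketched as follows. The example assumes Python's xml.dom.minidom; the function build_catalog and the element and attribute names are hypothetical.

```python
from xml.dom import minidom


def build_catalog(titles):
    """Create a DOM tree in memory with one <item> per title, sorted by title."""
    doc = minidom.Document()
    root = doc.createElement("catalog")
    doc.appendChild(root)
    for position, title in enumerate(sorted(titles), start=1):
        item = doc.createElement("item")
        item.setAttribute("pos", str(position))
        item.appendChild(doc.createTextNode(title))
        root.appendChild(item)
    return doc


catalog = build_catalog(["SAX", "DOM"])
# Serialize the root element (without the XML declaration) to a string.
xml_text = catalog.documentElement.toxml()
```

Parsing xml_text again with minidom.parseString yields an equivalent tree, so the same interfaces cover both directions: from document to tree and from tree to document.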