
document type definition (XML) (DTD)

The document type definition (DTD) of an XML document defines the elements allowed in it and their attributes. A DTD corresponds roughly to a vocabulary for a specific class of XML documents, and its grammar provides the rules for deciding whether the content of a document is valid or invalid. With a DTD it is therefore possible to check the validity of a document by ensuring that its structure conforms to the DTD. A valid document is always well-formed.

However, the syntax of DTDs is only expressive enough to specify which elements may appear as the content of an XML document. DTDs cannot differentiate the content of the elements themselves. Precise data typing only became possible with XML Schema, which was standardized by the World Wide Web Consortium (W3C).

If no document type definition or schema is defined for a document, it cannot be checked for validity but only for well-formedness. In addition to DTDs and the W3C XML Schema, RELAX NG (Regular Language for XML Next Generation) is another way to describe the structure of an XML document.

In the context of XML documents, a distinction must be made between well-formedness and validity:

  • Well-formedness refers to the syntactic correctness of an XML document.
  • An XML document is valid if its structure conforms to the associated document type definition or, alternatively, to the referenced schema. Valid XML documents are always well-formed as well.
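
The well-formedness half of this distinction can be sketched with Python's standard library. The helper name is_well_formed and the sample documents are illustrative assumptions. Note that xml.etree.ElementTree is a non-validating parser: it checks syntax only and never compares a document against a DTD.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if xml_text is syntactically correct XML.

    This checks well-formedness only. ElementTree does not validate
    against a DTD, so a True result says nothing about validity.
    """
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Hypothetical sample documents for illustration.
good = "<note><to>Ann</to><body>Hello</body></note>"
bad = "<note><to>Ann</body></note>"  # end tag does not match start tag

print(is_well_formed(good))
print(is_well_formed(bad))
```

Checking validity additionally requires a validating parser and a DTD or schema, which the standard-library parser used here does not provide.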

Once a specific DTD has been agreed for an XML document, it is the task of an XML parser to determine whether the document complies with the defined rules. For this purpose the XML document must either embed the DTD (internal DTD) or refer to it (external DTD). A DTD always consists of a set of markup declarations, which specify which elements may be used in a document and how. The DTD thus imposes additional restrictions on a well-formed XML document. An XML processor may also dispense with the validity check, in which case the document is still checked for well-formedness.
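
How a document embeds or refers to a DTD can be illustrated with a small sketch based on Python's xml.parsers.expat module. The function name describe_doctype, the element name note, and the file name note.dtd are assumptions made for this example. Expat is used here without validation, and the external file is never fetched.

```python
from xml.parsers import expat

# Document that embeds its DTD (internal subset).
INTERNAL = """<!DOCTYPE note [
  <!ELEMENT note (#PCDATA)>
]>
<note>Hello</note>"""

# Document that refers to an external DTD (hypothetical file name).
EXTERNAL = """<!DOCTYPE note SYSTEM "note.dtd">
<note>Hello</note>"""


def describe_doctype(xml_text: str) -> dict:
    """Report how the document type declaration points at its DTD."""
    info = {}

    def on_doctype(name, system_id, public_id, has_internal_subset):
        info["root"] = name
        info["system_id"] = system_id
        info["public_id"] = public_id
        info["internal_subset"] = bool(has_internal_subset)

    parser = expat.ParserCreate()
    parser.StartDoctypeDeclHandler = on_doctype
    parser.Parse(xml_text, True)  # True marks the final chunk of input
    return info


print(describe_doctype(INTERNAL))
print(describe_doctype(EXTERNAL))
```

A document may also combine both forms, with an external identifier and an internal subset in the same declaration.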

In DTDs, the element types permitted in the document can be defined together with their content models and attributes. A content model specifies the allowed content of an element. The XML specification distinguishes the following content models:

  • EMPTY: the element has no content but may have attributes.
  • ANY: the element may have any content, provided it is well-formed XML.
  • #PCDATA: the element contains only character data.
  • Mixed content: the element may contain character data together with subelements.
  • Element content: the element contains only subelements.
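
The content models listed above can be made concrete with a small sketch. The element names (article, title, para, em, br, extra) are invented for illustration. The code uses the ElementDeclHandler of Python's xml.parsers.expat to read the element declarations from an internal DTD subset and to label each with one of the five categories. It only inspects the declarations and does not validate the document.

```python
from xml.parsers import expat

# Hypothetical document with one declaration per content model.
DOC = """<?xml version="1.0"?>
<!DOCTYPE article [
  <!ELEMENT article (title, para+)>
  <!ELEMENT title   (#PCDATA)>
  <!ELEMENT para    (#PCDATA | em | br)*>
  <!ELEMENT em      (#PCDATA)>
  <!ELEMENT br      EMPTY>
  <!ELEMENT extra   ANY>
]>
<article><title>DTD</title><para>Some <em>mixed</em> text<br/></para></article>
"""


def classify(content_model) -> str:
    """Map an expat content-model tuple to the categories in the text."""
    ctype, _quantifier, _name, children = content_model
    if ctype == expat.model.XML_CTYPE_EMPTY:
        return "EMPTY"
    if ctype == expat.model.XML_CTYPE_ANY:
        return "ANY"
    if ctype == expat.model.XML_CTYPE_MIXED:
        # Expat reports (#PCDATA) as mixed content without child names.
        return "mixed content" if children else "#PCDATA only"
    return "element content"  # sequence or choice of subelements


def content_models(xml_text: str) -> dict:
    """Collect the content-model category of every declared element."""
    found = {}

    def on_element_decl(name, model):
        found[name] = classify(model)

    parser = expat.ParserCreate()
    parser.ElementDeclHandler = on_element_decl
    parser.Parse(xml_text, True)
    return found


for name, kind in content_models(DOC).items():
    print(f"{name}: {kind}")
```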

Before using DTDs, the first thing to check is what kind of data is involved. DTDs offer only limited support for specialized data formats and are much better suited to text documents. For the same reason they are a poor fit for database systems, since the DTD syntax with its few data types is not expressive enough for that purpose. For many applications, DTDs are unsuitable for validating documents because of the following limitations:

  • DTDs use their own syntax rather than XML syntax, so separate tools are required.
  • Data typing is not supported.
  • The only data type for element content is #PCDATA.
  • Cardinality can be specified only coarsely (with the operators ?, * and +, but without exact numeric bounds).
  • DTDs are not compatible with XML namespaces.
  • Extensibility is poor.
  • Definitions in DTDs are generally global, which conflicts with object-oriented modeling.
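
The missing data typing can be illustrated with a minimal sketch. The element names order and price and the helper price_is_decimal are assumptions made for this example. The DTD can only declare price as #PCDATA, so the text "twelve" conforms to it, and any numeric check has to be done in application code (or with a schema language that supports data types).

```python
import xml.etree.ElementTree as ET
from decimal import Decimal, InvalidOperation

# Hypothetical order document. According to its DTD, <price> may hold
# any character data, so "twelve" does not violate the declaration.
DOC = """<!DOCTYPE order [
  <!ELEMENT order (price)>
  <!ELEMENT price (#PCDATA)>
]>
<order><price>twelve</price></order>"""


def price_is_decimal(xml_text: str) -> bool:
    """Application-level type check that the DTD itself cannot express."""
    root = ET.fromstring(xml_text)  # non-validating parse
    text = (root.findtext("price") or "").strip()
    try:
        Decimal(text)
        return True
    except InvalidOperation:
        return False


print(price_is_decimal(DOC))
print(price_is_decimal(DOC.replace("twelve", "12.50")))
```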

Finally, here are two well-known applications from the XML world that are based on DTDs.

SVG: Scalable Vector Graphics (SVG) is a language for representing two-dimensional graphics that is defined by a DTD. SVG is used, among other things, to present graphical content on cell phones and PDAs.

SMIL: Synchronized Multimedia Integration Language (SMIL) is a language whose first version was standardized by the W3C as early as 1998. It is used for interactive presentations that combine text, video, and images.

English: document type definition (XML) - DTD
Updated at: 16.01.2013

All rights reserved DATACOM Buchverlag GmbH © 2023