Document Object Model

The Document Object Model (DOM) is a cross-platform and language-independent interface that treats an XML or HTML document as a tree structure wherein each node is an object representing a part of the document. The DOM represents a document with a logical tree. Each branch of the tree ends in a node, and each node contains objects. DOM methods allow programmatic access to the tree; with them one can change the structure, style or content of a document. Nodes can have event handlers attached to them. Once an event is triggered, the event handlers get executed.[2]
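
As a minimal sketch of this kind of programmatic access, the following uses Python's standard-library xml.dom.minidom module, which implements a subset of the W3C DOM and has no event handling. The sample markup and the helper name build_and_edit are assumptions made for illustration only.

```python
from xml.dom.minidom import parseString

# Hypothetical sample document, used only for this illustration.
SOURCE = "<html><body><p id='intro'>Hello</p></body></html>"


def build_and_edit(source: str) -> str:
    """Parse a document into a DOM tree, then change content, attributes and structure."""
    doc = parseString(source)
    body = doc.getElementsByTagName("body")[0]

    # Change content: the text inside <p> is itself a node in the tree.
    paragraph = body.getElementsByTagName("p")[0]
    paragraph.firstChild.data = "Hello, DOM"

    # Change an attribute on an existing element node.
    paragraph.setAttribute("class", "greeting")

    # Change structure: create a new element and attach it to the tree.
    extra = doc.createElement("p")
    extra.appendChild(doc.createTextNode("Added by script"))
    body.appendChild(extra)

    # Serialize the modified tree back to markup.
    return doc.documentElement.toxml()


if __name__ == "__main__":
    print(build_and_edit(SOURCE))
```

In a browser the same operations would normally be performed from JavaScript on the live document object; the sketch only shows the tree-manipulation idea.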

The principal standardization of the DOM was handled by the World Wide Web Consortium, which last developed a recommendation in 2004. WHATWG took over development of the standard, publishing it as a living document. The W3C now publishes stable snapshots of the WHATWG standard.

Figure: Example of DOM hierarchy in an HTML document
First published: October 1, 1998
Latest version: DOM4,[1] November 19, 2015
Organization: World Wide Web Consortium, WHATWG
Base standards: WHATWG DOM Living Standard


The history of the Document Object Model is intertwined with the history of the "browser wars" of the late 1990s between Netscape Navigator and Microsoft Internet Explorer, as well as with that of JavaScript and JScript, the first scripting languages to be widely implemented in the JavaScript engines of web browsers.

JavaScript was released by Netscape Communications in 1995 within Netscape Navigator 2.0. Netscape's competitor, Microsoft, released Internet Explorer 3.0 the following year with a reimplementation of JavaScript called JScript. JavaScript and JScript let web developers create web pages with client-side interactivity. The limited facilities for detecting user-generated events and modifying the HTML document in the first generation of these languages eventually became known as "DOM Level 0" or "Legacy DOM." No independent standard was developed for DOM Level 0, but it was partly described in the specifications for HTML 4.

Legacy DOM was limited in the kinds of elements that could be accessed. Form, link and image elements could be referenced with a hierarchical name that began with the root document object. A hierarchical name could make use of either the names or the sequential index of the traversed elements. For example, a form input element could be accessed as either document.formName.inputName or document.forms[0].elements[0].

The Legacy DOM enabled client-side form validation and the popular "rollover" effect.

In 1997, Netscape and Microsoft released version 4.0 of Netscape Navigator and Internet Explorer respectively, adding support for Dynamic HTML (DHTML) functionality enabling changes to a loaded HTML document. DHTML required extensions to the rudimentary document object that was available in the Legacy DOM implementations. Although the Legacy DOM implementations were largely compatible since JScript was based on JavaScript, the DHTML DOM extensions were developed in parallel by each browser maker and remained incompatible. These versions of the DOM became known as the "Intermediate DOM."

After the standardization of ECMAScript, the W3C DOM Working Group began drafting a standard DOM specification. The completed specification, known as "DOM Level 1", became a W3C Recommendation in late 1998. By 2005, large parts of W3C DOM were well-supported by common ECMAScript-enabled browsers, including Microsoft Internet Explorer version 6 (from 2001), Opera, Safari and Gecko-based browsers (like Mozilla, Firefox, SeaMonkey and Camino).


The W3C DOM Working Group published its final recommendation and subsequently disbanded in 2004. Development efforts migrated to the WHATWG, which continues to maintain a living standard.[3] In 2009, the Web Applications group reorganized DOM activities at the W3C.[4] In 2013, due to a lack of progress and the impending release of HTML5, the DOM Level 4 specification was reassigned to the HTML Working Group to expedite its completion.[5] Meanwhile, in 2015, the Web Applications group was disbanded and DOM stewardship passed to the Web Platform group.[6] Beginning with the publication of DOM Level 4 in 2015, the W3C creates new recommendations based on snapshots of the WHATWG standard.

  • DOM Level 1 provided a complete model for an entire HTML or XML document, including the means to change any portion of the document.
  • DOM Level 2 was published in late 2000. It introduced the getElementById function as well as an event model and support for XML namespaces and CSS.
  • DOM Level 3, published in April 2004, added support for XPath and keyboard event handling, as well as an interface for serializing documents as XML.
  • DOM Level 4 was published in 2015. It is a snapshot of the WHATWG living standard.[7]


Web browsers

To render a document such as an HTML page, most web browsers use an internal model similar to the DOM. The nodes of every document are organized in a tree structure, called the DOM tree, with the topmost node named the "Document object". When an HTML page is rendered, the browser downloads the HTML into local memory and automatically parses it to display the page on screen.[8]


When a web page is loaded, the browser creates a Document Object Model of the page. This object-oriented representation of the HTML document acts as an interface between JavaScript and the document itself, and allows the creation of dynamic web pages:[9]

  • JavaScript can add, change, and remove all of the HTML elements and attributes in the page.
  • JavaScript can change all of the CSS styles in the page.
  • JavaScript can react to all the existing events in the page.
  • JavaScript can create new events within the page.


Because the DOM supports navigation in any direction (e.g., parent and previous sibling) and allows for arbitrary modifications, an implementation must at least buffer the document that has been read so far (or some parsed form of it).
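
The navigation described above can be sketched with Python's standard-library xml.dom.minidom, which reads the whole document into memory before any node can be visited. The sample list markup and the function name neighbours_of_middle_item are hypothetical.

```python
from xml.dom.minidom import parseString

# Hypothetical sample markup with no whitespace between tags, so that
# sibling nodes are elements rather than whitespace text nodes.
SOURCE = "<ul><li>a</li><li>b</li><li>c</li></ul>"


def neighbours_of_middle_item(source: str) -> dict:
    """Start at the second <li> and walk up, backwards and forwards."""
    doc = parseString(source)  # the entire document is buffered as a tree
    middle = doc.getElementsByTagName("li")[1]
    return {
        "parent": middle.parentNode.nodeName,
        "previous": middle.previousSibling.firstChild.data,
        "next": middle.nextSibling.firstChild.data,
    }


if __name__ == "__main__":
    print(neighbours_of_middle_item(SOURCE))
```

Because any node can reach its parent and both siblings at any time, the parser cannot discard earlier parts of the document, which is the buffering requirement the paragraph describes.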

Layout engines

Web browsers rely on layout engines to parse HTML into a DOM. Some layout engines, such as Trident/MSHTML, are associated primarily or exclusively with a particular browser, such as Internet Explorer. Others, including Blink, WebKit, and Gecko, are shared by a number of browsers, such as Google Chrome, Opera, Safari, and Firefox. The different layout engines implement the DOM standards to varying degrees of compliance.


DOM implementations:

  • libxml2
  • Xerces is a collection of DOM implementations written in C++, Java and Perl
  • XML for <SCRIPT> is a JavaScript-based DOM implementation[10]
  • PHP.Gt DOM is based on libxml2 and brings DOM Level 4 compatibility[11] to the PHP programming language
  • Domino is a server-side (Node.js) DOM implementation based on Mozilla's dom.js. Domino is used in the MediaWiki stack with Visual Editor.

APIs that expose DOM implementations:

  • JAXP (Java API for XML Processing) is an API for accessing DOM providers
  • Lazarus (Free Pascal IDE) contains two variants of the DOM - with UTF-8 and ANSI format



  1. ^ All versioning refers to W3C DOM only.
  2. ^ "Document Object Model (DOM)". W3C. Retrieved 2012-01-12. The Document Object Model is a platform- and language-neutral interface that will allow programs and scripts to dynamically access and update the content, structure and style of documents.
  3. ^ "DOM Standard". Retrieved 23 September 2016.
  4. ^ "W3C Document Object Model". Retrieved 23 September 2016.
  5. ^ Philippe Le Hegaret. "New Charter for the HTML Working Group from Philippe Le Hegaret on 2013-09-30". Retrieved 23 September 2016.
  6. ^ "PubStatus - WEBAPPS". Retrieved 23 September 2016.
  7. ^ "W3C DOM4". Retrieved 23 September 2016.
  8. ^ "A Survey of Techniques for Improving Efficiency of Mobile Web Browsing", Concurrency and Computation, 2018
  9. ^ "JavaScript HTML DOM". Retrieved 23 September 2016.
  10. ^ "XML for <SCRIPT> Cross Platform XML Parser in JavaScript". Retrieved 23 September 2016.
  11. ^ "The modern DOM API for PHP 7 projects".



Browser Object Model

The Browser Object Model (BOM) is a browser-specific convention referring to all the objects exposed by the web browser. Unlike the Document Object Model, there is no standard for implementation and no strict definition, so browser vendors are free to implement the BOM in any way they wish.

That which we see as a window displaying a document, the browser program sees as a hierarchical collection of objects. When the browser parses a document, it creates a collection of objects that define the document and detail how it should be displayed. The object the browser creates is known as the Document object. It is part of a larger collection of objects that the browser makes use of. This collection of browser objects is collectively known as the Browser Object Model, or BOM.

The top level of the hierarchy is the window object, which contains the information about the window displaying the document. Some of the window object's properties are objects themselves that describe the document and related information.

Browser sniffing

Browser sniffing (also known as browser detection) is a set of techniques used in websites and web applications in order to determine the web browser a visitor is using, and to serve browser-appropriate content to the visitor. This practice is sometimes used to circumvent incompatibilities between browsers due to misinterpretation of HTML, Cascading Style Sheets (CSS), or the Document Object Model (DOM). While the World Wide Web Consortium maintains up-to-date central versions of some of the most important Web standards in the form of recommendations, in practice no software developer has designed a browser which adheres exactly to these standards; implementation of other standards and protocols, such as SVG and XMLHttpRequest, varies as well. As a result, different browsers display the same page differently, and so browser sniffing was developed to detect the web browser in order to help ensure consistent display of content.

It is also used to detect mobile browsers and send them mobile-optimized websites.

Comparison of JavaScript engines (DOM support)

The following tables compare Document Object Model (DOM) compatibility and support for a number of JavaScript engines used in web browsers.

For features that are fully supported (based on DOM Level 2 or DOM Level 3 modules that are under W3C Recommendation), an exact version number is given if it is certain that the feature was added in such version. DOM Level 0 and DOM Level 3 modules that are still under development are not included.

DOM Inspector

DOM Inspector (DOMi) is a web developer tool created by Joe Hewitt. It was originally included in the Mozilla Application Suite as well as in versions of Mozilla Firefox prior to Firefox 3. It is now included by default in SeaMonkey and is an installable extension for subsequent versions of Firefox and other Mozilla-based applications. Its main purpose is to inspect and edit the Document Object Model (DOM) tree of HTML and XML-based documents.

A DOM node can be selected from the tree structure, or by clicking on the browser chrome. As well as the DOM tree viewer, other viewers are also available, including Box Model, XBL Bindings, CSS Rules, Style Sheets, Computed Style, JavaScript Object, as well as a number of viewers for document and application accessibility. By default, the DOM Inspector highlights a newly selected non-attribute node with a red flashing border.

Similar tools exist in other browsers, e.g., Opera's Dragonfly, Safari's Web Inspector, the Internet Explorer Developer Toolbar, and Google Chrome's Developer Tools.

DOM events

DOM (Document Object Model) events allow event-driven programming languages like JavaScript, JScript, ECMAScript, VBScript, and Java to register various event handlers or listeners on the element nodes inside a DOM tree, such as in HTML, XHTML, XUL, and SVG documents.

Historically, like DOM, the event models used by various web browsers had some significant differences. This caused compatibility problems. To combat this, the event model was standardized by the World Wide Web Consortium (W3C) in DOM Level 2.


DVB-HTML

DVB-HTML, or Digital Video Broadcast HyperText Markup Language, is a standard for allowing digital televisions to access web content. It is an optional part of the larger MHP1.1 standard of DVB.

The specification is based on a modularized version of XHTML 1.1, and also includes Cascading Style Sheets (CSS) 2.0, Document Object Model (DOM) 2.0, and ECMAScript (also known as JavaScript).

Dynamic HTML

Dynamic HTML, or DHTML, is an umbrella term for a collection of technologies used together to create interactive and animated websites by using a combination of a static markup language (such as HTML), a client-side scripting language (such as JavaScript), a presentation definition language (such as CSS), and the Document Object Model (DOM). The application of DHTML was introduced by Microsoft with the release of Internet Explorer 4 in 1997.

DHTML allows scripting languages to change variables in a web page's definition language, which in turn affects the look and function of otherwise "static" HTML page content, after the page has been fully loaded and during the viewing process. Thus the dynamic characteristic of DHTML is the way it functions while a page is viewed, not in its ability to generate a unique page with each page load.

By contrast, a dynamic web page is a broader concept, covering any web page generated differently for each user, load occurrence, or specific variable values. This includes pages created by client-side scripting, and ones created by server-side scripting (such as PHP, Python, JSP or ASP.NET) where the web server generates content before sending it to the client.

DHTML is differentiated from Ajax by the fact that a DHTML page is still request/reload-based. With DHTML, there may not be any interaction between the client and server after the page is loaded; all processing happens in JavaScript on the client side. By contrast, an Ajax page uses features of DHTML to initiate a request (or 'subrequest') to the server to perform additional actions. For example, if there are multiple tabs on a page, a pure DHTML approach would load the contents of all tabs and then dynamically display only the one that is active, while Ajax could load each tab only when it is really needed.


JDOM

JDOM is an open-source Java-based document object model for XML that was designed specifically for the Java platform so that it can take advantage of its language features. JDOM integrates with Document Object Model (DOM) and Simple API for XML (SAX), and supports XPath and XSLT. It uses external parsers to build documents. JDOM was developed by Jason Hunter and Brett McLaughlin starting in March 2000. It has been part of the Java Community Process as JSR 102, though that effort has since been abandoned.

JavaScript engine

A JavaScript engine is a computer program that executes JavaScript (JS) code. The first JS engines were mere interpreters, but all relevant modern engines utilize just-in-time compilation for improved performance. JS engines are developed by web browser vendors, and every major browser has one. In a browser, the JS engine runs in concert with the rendering engine via the Document Object Model (DOM).

The use of JS engines is not limited to browsers. For example, the Chrome V8 engine is a core component of the popular Node.js runtime system.

Java API for XML Processing

In computing, the Java API for XML Processing, or JAXP (JAKS-pee), one of the Java XML application programming interfaces (APIs), provides the capability of validating and parsing XML documents. It has three basic parsing interfaces:

  • the Document Object Model parsing interface or DOM interface
  • the Simple API for XML parsing interface or SAX interface
  • the Streaming API for XML or StAX interface (part of JDK 6; separate jar available for JDK 5)

In addition to the parsing interfaces, the API provides an XSLT interface to provide data and structural transformations on an XML document.
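
JAXP itself is a Java API, so the following is not JAXP code. As a language-neutral illustration of how the DOM and SAX parsing styles listed above differ, here is a sketch using Python's standard-library xml.dom.minidom and xml.sax modules. The sample XML, the BookHandler class and both function names are assumptions made for this example.

```python
import xml.sax
from xml.dom.minidom import parseString

# Hypothetical sample document.
XML = "<library><book>DOM</book><book>SAX</book></library>"


def titles_with_dom(source: str) -> list:
    """DOM style: build the whole tree first, then query it."""
    doc = parseString(source)
    return [book.firstChild.data for book in doc.getElementsByTagName("book")]


class BookHandler(xml.sax.ContentHandler):
    """SAX style: react to parse events as they stream past; no tree is kept."""

    def __init__(self):
        super().__init__()
        self.titles = []
        self._in_book = False
        self._buffer = []

    def startElement(self, name, attrs):
        if name == "book":
            self._in_book = True
            self._buffer = []

    def characters(self, content):
        # Text may arrive in several chunks, so collect it until the end tag.
        if self._in_book:
            self._buffer.append(content)

    def endElement(self, name):
        if name == "book":
            self.titles.append("".join(self._buffer))
            self._in_book = False


def titles_with_sax(source: str) -> list:
    handler = BookHandler()
    xml.sax.parseString(source.encode("utf-8"), handler)
    return handler.titles


if __name__ == "__main__":
    print(titles_with_dom(XML))
    print(titles_with_sax(XML))
```

Both functions return the same titles. The DOM version allows random access to the parsed tree afterwards, while the SAX version keeps only the state the handler chooses to store.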

JAXP was developed under the Java Community Process as JSR 5 (JAXP 1.0), JSR 63 (JAXP 1.1 and 1.2), and JSR 206 (JAXP 1.3).

JAXP version 1.4.4 was released on September 3, 2010. JAXP 1.3 was declared end-of-life on February 12, 2008.

Komodo IDE

Komodo IDE is an integrated development environment (IDE) for dynamic programming languages. It was introduced in May 2000. Many of Komodo's features are derived from an embedded Python interpreter. Komodo IDE uses the Mozilla and Scintilla code base as they share many features and support the same languages (including Python, Perl, PHP, Ruby, Tcl, SQL, Smarty, CSS, HTML and XML) and operating systems (Linux, OS X, and Windows). The editor component is implemented using the Netscape Plugin Application Programming Interface (NPAPI), with the Scintilla view embedded in the XML User Interface Language (XUL) interface in the same manner as a web browser plugin.

Komodo IDE has an open-source counterpart called Komodo Edit. Both share much of the same code base, with Komodo IDE containing the more advanced IDE features such as debugging and unit testing.

Both Komodo Edit and IDE support user customizing via plug-ins and macros. Komodo plug-ins are based on Mozilla Add-ons and extensions can be searched for, downloaded, configured, installed and updated from within the application. Available extensions include a Document Object Model (DOM) inspector, pipe features, additional language support and user interface enhancements.

Komodo IDE has features such as integrated debugger support, DOM viewer, interactive shells, source code control integration, and the ability to select the engine used to run regular expressions, to ensure compatibility with the final deployment target. The commercial version also adds code browsing, a database explorer, collaboration, support for many popular source code control systems, and more. Independent implementations of some of these features, such as the database editor, git support, and remote FTP file access, are available in the free version via Komodo Edit's plugin system.

Mariner (browser engine)

Mariner was a canceled project to add performance and stability enhancements to the browser engine used in the Netscape Communicator web browser. Mariner became open source in March 1998 when Netscape released its client code and started the Mozilla project.

Mariner added support for page reflow, a feature lacking in previous Netscape releases, making the layout of text and tables much faster. In addition, development work was done on a Document Object Model (level 1) and stability was improved. Enhancements to HTML and CSS were also made but these were not technically part of the Mariner project.

The original intention was to ship Mariner in Netscape Communicator 5.0, with subsequent releases using the newer NGLayout engine (now called Gecko). However, in October 1998, Netscape decided to abandon the old layout engine in favour of NGLayout and work on Mariner ceased. Netscape Communicator 5.0 and Mariner never shipped. The next major Netscape revision (Netscape 6, released in November 2000) was built around Gecko.


MenuBox

MenuBox is a discontinued web browser developed by Cloanto Corporation. It is based on the Trident layout engine, to which it adds an extended document object model (DOM) and event intercepts to achieve special functionality for use in contexts such as AutoRun projects, wrapping of web-based services, chromeless applications and kiosk mode operation.

A MenuBox project consists of a single, redistributable binary file (MenuBox.exe, which may also be renamed), one configuration file (in INI format, which may be merged into Autorun.inf) and the actual content files (HTML, scripts, images, etc.).

The MenuBox software first launched in 1997. HTML support was introduced in version 2.0, which was released on September 22, 2002. As of October 10, 2009, MenuBox was still listed as the only third-party browser to have passed formal "Certified for Windows Vista" testing.

Object model

In computing, object model has two related but distinct meanings:

  • The properties of objects in general in a specific computer programming language, technology, notation or methodology that uses them. Examples are the object models of Java, the Component Object Model (COM), or Object-Modeling Technique (OMT). Such object models are usually defined using concepts such as class, generic function, message, inheritance, polymorphism, and encapsulation. There is an extensive literature on formalized object models as a subset of the formal semantics of programming languages.
  • A collection of objects or classes through which a program can examine and manipulate some specific parts of its world. In other words, the object-oriented interface to some service or system. Such an interface is said to be the object model of the represented service or system. For example, the Document Object Model (DOM) is a collection of objects that represent a page in a web browser, used by script programs to examine and dynamically change the page. There is a Microsoft Excel object model for controlling Microsoft Excel from another program, and the ASCOM Telescope Driver is an object model for controlling an astronomical telescope.

An object model consists of the following important features:

Object Reference

Objects can be accessed via object references. To invoke a method in an object, the object reference and method name are given, together with any arguments.

Interfaces

An interface provides a definition of the signature of a set of methods without specifying their implementation. An object will provide a particular interface if its class contains code that implements the methods of that interface. An interface also defines types that can be used to declare the type of variables or parameters and return values of methods.

Actions

An action in object-oriented programming (OOP) is initiated by an object invoking a method in another object. An invocation can include additional information needed to carry out the method. The receiver executes the appropriate method and then returns control to the invoking object, sometimes supplying a result.

Exceptions

Programs can encounter various errors and unexpected conditions of varying seriousness. During the execution of a method, many different problems may be discovered. Exceptions provide a clean way to deal with error conditions without complicating the code. A block of code may be defined to throw an exception whenever particular unexpected conditions or errors arise. This means that control passes to another block of code that catches the exception.
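
The four features above can be sketched together in a few lines of Python. The Shape interface, the Rectangle class and the safe_area helper are hypothetical names chosen for this illustration and do not come from any particular object model.

```python
from abc import ABC, abstractmethod
from typing import Optional


class Shape(ABC):
    """Interface: declares a method signature without implementing it."""

    @abstractmethod
    def area(self) -> float:
        ...


class Rectangle(Shape):
    """Provides the Shape interface by implementing its method."""

    def __init__(self, width: float, height: float):
        if width < 0 or height < 0:
            # Exception: thrown when an unexpected condition is discovered.
            raise ValueError("dimensions must be non-negative")
        self.width = width
        self.height = height

    def area(self) -> float:
        return self.width * self.height


def safe_area(width: float, height: float) -> Optional[float]:
    try:
        shape = Rectangle(width, height)  # 'shape' holds an object reference
        return shape.area()               # action: invoke a method via the reference
    except ValueError:
        # Control passes here, to the block that catches the exception.
        return None


if __name__ == "__main__":
    print(safe_area(2, 3))
    print(safe_area(-1, 3))
```

Keeping the validation inside the constructor means callers such as safe_area handle the error in one place instead of checking arguments before every call.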

Parser (CGI language)

Parser is a free server-side CGI web scripting language developed by Art. Lebedev Studio and released under the GPL.

Originally, Parser was merely a simple macro processing language. The latest 3rd revision (March 2006) introduced object-oriented programming features.

The compiler for the language was developed in C++ by studio employees Konstantin Morshnev and Alexander Petrosyan to automate often repeated tasks, especially maintenance of already existing websites. It was used in many web projects of the studio. Since revision 3 it was released as free software and it is now used in other websites, mostly in Russia (according to a partial list at the language's website).

The language supports technologies needed for common web design tasks: XML, Document Object Model (DOM), Perl Compatible Regular Expressions (PCRE) and others.

Processing Instruction

A Processing Instruction (PI) is an SGML and XML node type, which may occur anywhere in the document, intended to carry instructions to the application. Processing instructions are exposed in the Document Object Model as Node.PROCESSING_INSTRUCTION_NODE, and they can be used in XPath and XQuery with the 'processing-instruction()' node test.

script.aculo.us

script.aculo.us is a JavaScript library built on the Prototype JavaScript Framework, providing dynamic visual effects and user interface elements via the Document Object Model (DOM).

It is most notably included with Ruby on Rails and Seaside, but is also provided separately to work with other web frameworks and scripting languages. script.aculo.us was extracted by Thomas Fuchs from his work on fluxiom, a web-based digital asset management tool by the design company wollzelle. It was first released to the public in June 2005.


WebAssembly

WebAssembly (often shortened to Wasm) is an open standard that defines a portable binary code format for executable programs, and a corresponding textual assembly language, as well as interfaces for facilitating interactions between such programs and their host environment. The main goal of WebAssembly is to enable high performance applications on web pages, but the format is designed to be executed and integrated in other environments as well.

Wasm does not replace JavaScript. In order to use Wasm in browsers, users may use the Emscripten SDK to compile C++ (or any other LLVM-supported language such as D or Rust) source code into a binary file which runs in the same sandbox as regular JavaScript code. Emscripten provides bindings for several commonly used environment interfaces like WebGL. Wasm code has access only to an expandable memory and a small number of scalar values. There is no direct Document Object Model (DOM) access; however, it is possible to create proxy functions for this, for example through stdweb, web_sys, and js_sys.

The World Wide Web Consortium (W3C) maintains the standard with contributions from Mozilla, Microsoft, Google, and Apple.

Web storage

Web storage, sometimes known as DOM storage (Document Object Model storage), provides web application software methods and protocols used for storing data in a web browser. Web storage supports persistent data storage, similar to cookies but with a greatly enhanced capacity and no information stored in the HTTP request header. There are two main web storage types: local storage and session storage, behaving similarly to persistent cookies and session cookies respectively.

All major browsers support Web storage, which is standardized by the World Wide Web Consortium (W3C).


This page is based on a Wikipedia article written by its contributors.
Text is available under the CC BY-SA 3.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.