Open Document Architecture

The Open Document Architecture (ODA) and interchange format (informally referred to as just ODA) is a free and open international standard document file format maintained by the ITU-T and intended to replace all proprietary document file formats. ODA is detailed in the standards documents CCITT T.411-T.424, which are equivalent to ISO 8613.

Developed by: ITU-T, ISO
Initial release: 1989
Type of format: Document file format
Standard: CCITT T.411-T.424, ISO 8613
Website: ISO 8613

Format

ODA defines a compound document format that can contain raw text, raster images and vector graphics. In the original release, what set this standard apart from others like it was that the graphics structures were defined exclusively as CCITT raster images and Computer Graphics Metafile (CGM, ISO 8632). This spared word processor and desktop publishing software from having to interpret all known graphics formats.

The documents have both logical and layout structures. Logically, the text can be partitioned into chapters, footnotes and other subelements akin to HTML, and the layout fills a function similar to Cascading Style Sheets in the web world. The binary transport format for an ODA-conformant file is called Open Document Interchange Format (ODIF) and is based on the Standard Generalized Markup Language and Abstract Syntax Notation One (ASN.1).

One of the features of this standard is that documents could be stored or interchanged in one of three formats: Formatted, Formatted Processable, or Processable. The latter two are editable formats. The first is an uneditable format that is logically similar to Adobe Systems' PDF, which is in common use today.
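The split between logical structure, layout, and the three interchange formats can be pictured with a small model. The following is only a sketch in Python: the class names, the `LAYOUT` table and its property names are hypothetical illustrations and are not taken from ISO 8613 or from the ODIF encoding.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List


class InterchangeForm(Enum):
    """The three formats named in the text."""
    FORMATTED = "formatted"
    FORMATTED_PROCESSABLE = "formatted processable"
    PROCESSABLE = "processable"

    @property
    def editable(self) -> bool:
        # Per the text, only the Formatted form is uneditable.
        return self is not InterchangeForm.FORMATTED


@dataclass
class LogicalObject:
    """One node of the logical structure (hypothetical model, not ODIF)."""
    kind: str                      # e.g. "chapter", "paragraph", "footnote"
    text: str = ""
    children: List["LogicalObject"] = field(default_factory=list)


# Layout is kept apart from the logical tree, much as CSS is kept apart
# from HTML. These property names and values are invented for illustration.
LAYOUT: Dict[str, Dict[str, object]] = {
    "chapter": {"page_break_before": True},
    "footnote": {"placement": "page bottom"},
}


def outline(node: LogicalObject, depth: int = 0) -> List[str]:
    """Return the logical structure as indented lines, one per object."""
    lines = ["  " * depth + node.kind]
    for child in node.children:
        lines.extend(outline(child, depth + 1))
    return lines


def layout_for(node: LogicalObject) -> Dict[str, object]:
    """Look up layout directives by logical kind; empty if none apply."""
    return LAYOUT.get(node.kind, {})
```

In this model, the same logical tree can be paired with a different `LAYOUT` table without touching the content, which is the separation the comparison with HTML and Cascading Style Sheets points at.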

History

In 1985, ESPRIT financed a pilot implementation of the ODA concept, involving, among others, Bull corporation, Olivetti, ICL and Siemens AG.

The intent was to have a universal storable and interchangeable document structure that would not go out of date and could be used by any word processor or desktop publisher. The rapid adoption of personal computers in the late 1970s and early 1980s by consumers and small businesses, and the relative ease of writing applications for the primitive early PCs, had resulted in a huge number of new word processing applications that were then competing worldwide for market dominance. At the same time, large corporations that had purchased dedicated word processor devices in the 1970s were switching over to the new PCs, which could run word processing software and much more. The result was a profusion of constantly evolving proprietary file formats. It was already clear by 1985 that this confusing and often frustrating situation would get much worse before it got better, as desktop publishing and multimedia computing were already on the horizon.

Thus, ODA was intended to solve the problem of software applications whose developers were continually updating their native file formats to accommodate new features, which frequently broke backward compatibility. Older native formats were repeatedly becoming obsolete and therefore unusable after only a few years. This had a large financial impact on companies that were using ad hoc standard applications, such as Microsoft Word or WordPerfect, because their IT departments had to constantly assist frustrated users with transferring content between so many different formats, and also hire employees whose sole job was to import old stored documents into the latest version of applications before they became unreadable. The intended result of the ODA standard was that companies would not have to commit to an ad hoc standard for word processor or desktop publisher applications, because any application adhering to a common open standard could be used to read and edit long-stored documents.

The initial round of documents that made up ISO 8613 was completed after a multi-year effort at an ISO/IEC JTC1/SC18/WG3 meeting in Paris La Défense, France, around Armistice Day (November 11) 1987; the standard was called "Office Document Architecture" at the time. CCITT picked the documents up as the T.400 series of recommendations, using the term "Open Document Architecture". Work continued on additional parts for a while, for instance at an ISO working group meeting in Ottawa in February 1989. Improvements and additions were continually being made, and the revised standard was finally published in 1999.

However, no significant developer of document application software chose to support the format, probably because conversion from the dominant word processor formats of the time, such as WordPerfect and Microsoft Word, was difficult, offered little fidelity, and would only have weakened the vendors' advantage of lock-in over their existing user base. There were also cultural obstacles: ODA was a predominantly European project that took a top-down design approach, and it was unable to garner significant interest from the American software developer community or trade press. Finally, it took an extraordinarily long time to release the ODA format (the pilot was financed in 1985, but the final specification was not published until 1999). Given the lack of products that supported the format, in part because of the excessive time taken to create the specification, few users were interested in using it. Eventually interest in the format faded.

IBM's European Networking Center (ENC) in Heidelberg, Germany, developed prototype extensions to IBM OfficeVision/VM to support ODA, in particular a converter between ODA and Document Content Architecture (DCA) document formats.[1]

It would be improper to call ODA anything but a failure, but its spirit clearly influenced latter-day document formats that were successful in gaining support from many document software developers and users. These include the already-mentioned HTML and CSS as well as XML and XSL leading up to OpenDocument and Office Open XML.

References

  1. Fanderl, H.; Fischer, K.; Kämper, J. (1992). "The Open Document Architecture: From standardization to the market". IBM Systems Journal. 31 (4): 728–754. doi:10.1147/sj.314.0728. ISSN 0018-8670.

External links

The standard itself was made available for free download on September 7, 2007 (the "missing" documents T.420 and T.423 do not exist).

ANSI escape code

ANSI escape sequences are a standard for in-band signaling to control the cursor location, color, and other options on video text terminals and terminal emulators. Certain sequences of bytes, most starting with Esc and '[', are embedded into the text, which the terminal looks for and interprets as commands, not as character codes.
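As a rough illustration of how such sequences sit in-band with ordinary text, the Python sketch below builds two common sequences and then removes them again, much as a program might do before writing output to a log file. The helper names `sgr`, `cursor_to` and `strip_csi` are made up for this example; the byte values follow the usual Esc '[' convention described above.

```python
import re

ESC = "\x1b"        # the Esc control character (byte 0x1B)
CSI = ESC + "["     # Esc followed by '[' introduces most sequences


def sgr(*codes: int) -> str:
    """Build a Select Graphic Rendition sequence, e.g. sgr(1, 31) for bold red."""
    return CSI + ";".join(str(code) for code in codes) + "m"


def cursor_to(row: int, col: int) -> str:
    """Build a Cursor Position sequence; row and column are 1-based."""
    return f"{CSI}{row};{col}H"


# A sequence of this kind is Esc '[' followed by parameter bytes (0x30-0x3F),
# intermediate bytes (0x20-0x2F) and a single final byte (0x40-0x7E).
CSI_PATTERN = re.compile(r"\x1b\[[0-?]*[ -/]*[@-~]")


def strip_csi(text: str) -> str:
    """Remove the embedded sequences, leaving only the character codes."""
    return CSI_PATTERN.sub("", text)


if __name__ == "__main__":
    message = sgr(1, 31) + "error" + sgr(0) + ": disk full"
    print(message)                    # the terminal interprets the commands
    print(repr(strip_csi(message)))   # the same text with commands removed
```

The pattern only covers sequences that start with Esc and '['; the less common sequences that use other introducers would pass through `strip_csi` untouched.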

ANSI sequences were introduced in the 1970s to replace vendor-specific sequences and became widespread in the computer equipment market by the early 1980s. They were used in development, scientific and commercial applications and later by the nascent bulletin board systems to offer improved displays compared to earlier systems lacking cursor movement, a primary reason they became a standard adopted by all manufacturers.

Although hardware text terminals have become increasingly rare in the 21st century, the relevance of the ANSI standard persists because most terminal emulators interpret at least some of the ANSI escape sequences in output text. A notable exception was DOS and older versions of the Win32 console of Microsoft Windows.

Compound document

In computing, a compound document is a document type typically produced using word processing software, and is a regular text document intermingled with non-text elements such as spreadsheets, pictures, digital videos, digital audio, and other multimedia features. It can also be used to collect several documents into one.

Compound document technologies are commonly utilized on top of a software componentry framework, but the idea of software componentry includes several other concepts apart from compound documents, and software components alone do not enable compound documents. Well-known technologies for compound documents include:

ActiveX Documents

Bonobo by Ximian (primarily used by GNOME)

KParts in KDE

Mixed Object Document Content Architecture

Multipurpose Internet Mail Extensions (MIME)

Object linking and embedding (OLE) by Microsoft; see Compound File Binary Format

Open Document Architecture from ITU-T (not used)

OpenDoc by Apple Computer (now defunct)

Verdantium

XML and XSL are encapsulation formats used for compound documents of all kinds

The first public implementation was on the Xerox Star workstation, released in 1981.

Document file format

A document file format is a text or binary file format for storing documents on a storage medium, especially for use by computers.

There currently exists a multitude of incompatible document file formats.

A rough consensus has been established that XML is to be the technical basis for future document file formats, although PDF is likely to remain the format of choice for fixed-layout documents. Examples of XML-based open standards are DocBook, XHTML, and, more recently, the ISO/IEC standards OpenDocument (ISO 26300:2006) and Office Open XML (ISO 29500:2008).

In 1993, the ITU-T tried to establish a standard for document file formats, known as the Open Document Architecture (ODA), which was supposed to replace all competing document file formats. It is described in ITU-T documents T.411 through T.424, which are equivalent to ISO 8613. It did not succeed.

Page description languages such as PostScript and PDF have become the de facto standard for documents that a typical user should only be able to create and read, not edit. In 2001, a series of ISO/IEC standards for PDF began to be published, including the specification for PDF itself, ISO 32000.

HTML is the most widely used open international standard, and it is also used as a document file format. It has also become an ISO/IEC standard (ISO 15445:2000).

The default binary file format used by Microsoft Word (.doc) has become a widespread de facto standard for office documents, but it is a proprietary format and is not always fully supported by other word processors.

Document layout analysis

In computer vision, document layout analysis is the process of identifying and categorizing the regions of interest in the scanned image of a text document. A reading system requires the segmentation of text zones from non-textual ones and their arrangement in the correct reading order. Detection and labeling of the different zones (or blocks) embedded in a document as text body, illustrations, math symbols, and tables is called geometric layout analysis. But text zones play different logical roles inside the document (titles, captions, footnotes, etc.), and this kind of semantic labeling is the scope of logical layout analysis.

Document layout analysis is the union of geometric and logical labeling. It is typically performed before a document image is sent to an OCR engine, but it can also be used to detect duplicate copies of the same document in large archives, or to index documents by their structure or pictorial content.

Document layout is formally defined in the international standard ISO 8613-1:1989.

Enhanced publication

Enhanced publications or enhanced ebooks are a form of electronic publishing for the dissemination and sharing of research outcomes, whose first formal definition can be traced back to 2009. Like many forms of digital publication, they typically feature a unique identifier (possibly a persistent identifier) and descriptive metadata. Unlike traditional digital publications (e.g. a PDF article), enhanced publications are often tailored to serve specific scientific domains and generally consist of a set of interconnected parts corresponding to research assets of several kinds (e.g. datasets, videos, images, stylesheets, services, workflows, databases, presentations) and to textual descriptions of the research (e.g. papers, chapters, sections, tables). The nature and format of such parts, and of the relationships between them, depend on the application domain and may vary widely from case to case.

The main motivations behind enhanced publications lie in the limitations of traditional scientific literature in describing the whole context and outcome of a research activity. Their goal is to move "beyond the simple PDF" (FORCE11 initiative) and support scientists with advanced ICT tools for sharing their research more comprehensively, without losing the narrative spirit of "the publication" as a means of dissemination. This trend is confirmed by the several enhanced publication systems devised in the literature, offering research communities one or more of the following functionalities: packaging of related research assets; Web 2.0 reading capabilities; interlinking of research outputs; reproduction and assessment of scientific experiments.

European Strategic Program on Research in Information Technology

European Strategic Programme on Research in Information Technology (ESPRIT) was a series of integrated programmes of information technology research and development projects and industrial technology transfer measures. It was a European Union initiative managed by the Directorate General for Industry (DG III) of the European Commission.

IBM OfficeVision

OfficeVision is an IBM proprietary office support application that primarily runs on IBM's VM operating system and its user interface CMS. Other platform versions are available, notably OV/MVS and OV/400. OfficeVision provides e-mail, shared calendars, and shared document storage and management, and it provides the ability to integrate word processing applications such as Displaywrite/370 and/or the Document Composition Facility (DCF/SCRIPT). IBM introduced OfficeVision in their May 1989 announcement, followed by several other key releases later.

The advent of the personal computer and the client–server paradigm changed the way organizations looked at office automation. In particular, office users wanted graphical user interfaces. Thus e-mail applications with PC clients became more popular.

IBM's initial answer was OfficeVision/2, a server-requester system designed to be the strategic implementation of IBM's Systems Application Architecture. The server could run on OS/2, VM, MVS (XA or ESA), or OS/400, while the requester required OS/2 Extended Edition running on IBM PS/2 personal computers, or DOS. IBM also developed OfficeVision/2 LAN for workgroups, which failed to find market acceptance and was withdrawn in 1992. IBM began to resell Lotus Notes and Lotus cc:Mail as an OfficeVision/2 replacement. Ultimately, IBM solved its OfficeVision problems through the hostile takeover of Lotus Software for its Lotus Notes product, one of the two most popular products for business e-mail and calendaring.

IBM originally intended to deliver the Workplace Shell as part of the OfficeVision/2 LAN product, but in 1991 announced plans to release it as part of OS/2 2.0 instead.

Users of IBM OfficeVision included the New York State Legislature.

ITU-T

The ITU Telecommunication Standardization Sector (ITU-T) is one of the three sectors (divisions or units) of the International Telecommunication Union (ITU); it coordinates standards for telecommunications.

The standardization efforts of ITU commenced in 1865 with the formation of the International Telegraph Union (ITU). ITU became a specialized agency of the United Nations in 1947. The International Telegraph and Telephone Consultative Committee (CCITT, from French: Comité Consultatif International Téléphonique et Télégraphique) was created in 1956, and was renamed ITU-T in 1993.

ITU-T has a permanent secretariat, the Telecommunication Standardization Bureau (TSB), based at the ITU headquarters in Geneva, Switzerland. The current Director of the Bureau is Chaesub Lee, whose 4-year term commenced on 1 January 2015. He replaced Malcolm Johnson of the United Kingdom, who was director from 1 January 2007 to 2014.

List of International Organization for Standardization standards, 8000-8999

This is a list of published International Organization for Standardization (ISO) standards and other deliverables. For a complete and up-to-date list of all the ISO standards, see the ISO catalogue.

The standards are protected by copyright and most of them must be purchased. However, about 300 of the standards produced by ISO and IEC's Joint Technical Committee 1 (JTC1) have been made freely and publicly available.

Outline of the United Nations

The following outline is provided as an overview of and topical guide to the United Nations:

United Nations – international organization whose stated aims are facilitating cooperation in international law, international security, economic development, social progress, human rights, and achievement of world peace. The UN was founded in 1945 after World War II to replace the League of Nations, to stop wars between countries, and to provide a platform for dialogue. It contains multiple subsidiary organizations to carry out its missions.

Raster Document Object

The .RDO (Raster Document Object) file format is the native format used by Xerox's DocuTech range of hardware and software, which underpins the company's "Xerox Document On Demand" (XDOD) systems. It is therefore a significant file format for the "print on demand" market sector, along with PostScript and PDF.

RDO is a metafile format based on the Open Document Architecture (ODA) specifications. In Xerox's RDO implementation, description and control information is stored within the RDO file, while raster images are stored separately, usually in a separate folder, as TIFF files. The RDO file dictates which bitmap images will be used on each page of a document, and where they will be placed.


This page is based on a Wikipedia article.
Text is available under the CC BY-SA 3.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.