RELAX NG

In computing, RELAX NG (REgular LAnguage for XML Next Generation) is a schema language for XML: a RELAX NG schema specifies a pattern for the structure and content of an XML document. A RELAX NG schema is itself an XML document, but RELAX NG also offers a popular compact, non-XML syntax.[1] Compared to other XML schema languages, RELAX NG is considered relatively simple.

It was defined by a committee specification of the OASIS RELAX NG technical committee in 2001 and 2002, based on Murata Makoto's RELAX and James Clark's TREX,[2][3][4] and also by part two of the international standard ISO/IEC 19757: Document Schema Definition Languages (DSDL).[5][6] ISO/IEC 19757-2 was developed by ISO/IEC JTC1/SC34 and published in its first version in 2003.[7]

RELAX NG
Filename extension: .rng
Internet media type: application/xml, text/xml
Type of format: XML Schema language
Extended from: XML

Schema examples

Suppose we want to define an extremely simple XML markup scheme for a book: a book is defined as a sequence of one or more pages; each page contains text only. A sample XML document instance might be:

<book>
  <page>This is page one.</page>
  <page>This is page two.</page>
</book>

XML syntax

A RELAX NG schema can be written in a nested structure by defining a root element that contains further element definitions, which may themselves contain embedded definitions. A schema for our book in this style, using the full XML syntax, would be written:

<element name="book" xmlns="http://relaxng.org/ns/structure/1.0">
   <oneOrMore>
      <element name="page">
         <text/>
      </element>
   </oneOrMore>
</element>
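
To make the pattern concrete, the following Python sketch hand-codes the same rule the schema above expresses: a book element containing one or more page elements, each holding text only. It is not a RELAX NG validator, and the function name matches_book_pattern is a hypothetical one chosen for illustration; it only assumes Python's standard xml.etree.ElementTree module.

```python
import xml.etree.ElementTree as ET


def matches_book_pattern(xml_text: str) -> bool:
    """Hand-coded equivalent of: element book { element page { text }+ }."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        return False  # not well-formed, so it cannot match the pattern

    # The root must be <book> with no attributes.
    if root.tag != "book" or root.attrib:
        return False
    # <book> has no text pattern, so only whitespace may precede the pages.
    if (root.text or "").strip():
        return False

    pages = list(root)
    if not pages:  # oneOrMore: at least one <page> is required
        return False

    for page in pages:
        if page.tag != "page" or page.attrib:
            return False
        if len(page) > 0:  # <text/> permits character data, not child elements
            return False
        if (page.tail or "").strip():  # no stray text between pages
            return False
    return True
```

A real schema processor derives these checks from the schema itself; writing them by hand for every document type is exactly the work a schema language removes.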

Nested structure becomes unwieldy with many sublevels and cannot define recursive elements, so most complex RELAX NG schemas use references to named pattern definitions located separately in the schema. Here, a "flattened schema" defines precisely the same book markup as the previous example:

<grammar xmlns="http://relaxng.org/ns/structure/1.0">
   <start>
      <element name="book">
         <oneOrMore>
            <ref name="page"/>
         </oneOrMore>
      </element>
   </start>
   <define name="page">
      <element name="page">
         <text/>
      </element>
   </define>
</grammar>
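
The equivalence of the two styles can be sketched in Python by replacing each ref in the flattened grammar with the pattern it names. This is only an illustration under stated assumptions: the helper names inline_refs and shape are hypothetical, each define is assumed to hold a single pattern, and definitions are assumed to be non-recursive (a recursive grammar cannot be inlined, which is why nested schemas cannot express recursion).

```python
import xml.etree.ElementTree as ET

RNG_NS = "http://relaxng.org/ns/structure/1.0"

GRAMMAR = f"""
<grammar xmlns="{RNG_NS}">
   <start>
      <element name="book">
         <oneOrMore><ref name="page"/></oneOrMore>
      </element>
   </start>
   <define name="page">
      <element name="page"><text/></element>
   </define>
</grammar>
"""


def inline_refs(grammar_xml: str) -> ET.Element:
    """Return the start pattern with every <ref> replaced by its definition."""
    grammar = ET.fromstring(grammar_xml)
    defines = {d.get("name"): d for d in grammar.findall(f"{{{RNG_NS}}}define")}

    def expand(node: ET.Element) -> ET.Element:
        if node.tag == f"{{{RNG_NS}}}ref":
            # Assumption: one non-recursive pattern per <define>.
            return expand(defines[node.get("name")][0])
        copy = ET.Element(node.tag, node.attrib)
        for child in node:
            copy.append(expand(child))
        return copy

    return expand(grammar.find(f"{{{RNG_NS}}}start")[0])


def shape(node: ET.Element):
    """Nested tuple (local name, name attribute, children) for comparison."""
    return (node.tag.split("}", 1)[-1], node.get("name"), [shape(c) for c in node])
```

Calling shape(inline_refs(GRAMMAR)) yields the same element/oneOrMore/element/text nesting as the first schema.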

Compact syntax

RELAX NG compact syntax is a non-XML format inspired by extended Backus-Naur form and regular expressions. It is designed so that it can be unambiguously translated to its XML counterpart, and back again, with one-to-one correspondence in structure and meaning, in much the same way that Simple Outline XML (SOX) relates to XML. It shares many features with the syntax of DTDs. Here is the compact form of the above schema:

element book {
    element page { text }+
}

With named patterns, this can be flattened to:

start = element book { page+ }
page = element page { text }

A compact RELAX NG parser will treat these two as the same pattern.
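
The structural correspondence between the two syntaxes can be illustrated with a small Python sketch that turns the XML form into the compact form. It covers only the three patterns used in the book example (element, oneOrMore, text); the function name to_compact is hypothetical, and a real converter such as Trang handles the full language.

```python
import xml.etree.ElementTree as ET

SCHEMA = """
<element name="book" xmlns="http://relaxng.org/ns/structure/1.0">
   <oneOrMore>
      <element name="page">
         <text/>
      </element>
   </oneOrMore>
</element>
"""


def to_compact(node: ET.Element) -> str:
    """Render a tiny subset of the RELAX NG XML syntax in compact syntax."""
    kind = node.tag.split("}", 1)[-1]  # drop the namespace part of the tag
    # Several child patterns in a row form a group, written with commas.
    body = ", ".join(to_compact(child) for child in node)
    if kind == "text":
        return "text"
    if kind == "element":
        return f"element {node.get('name')} {{ {body} }}"
    if kind == "oneOrMore":
        return f"{body}+" if len(node) == 1 else f"({body})+"
    raise ValueError(f"pattern not covered by this sketch: {kind}")


print(to_compact(ET.fromstring(SCHEMA)))
# prints: element book { element page { text }+ }
```

Each XML element maps to exactly one compact-syntax construct, which is what makes the translation reversible.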

Comparison with W3C XML Schema

Although the RELAX NG specification was developed at roughly the same time as the W3C XML Schema specification, the latter was arguably better known and more widely implemented in both open-source and proprietary XML parsers and editors when it became a W3C Recommendation in 2001. Since then, however, RELAX NG support has increasingly found its way into XML software, and its acceptance has been aided by its adoption as a primary schema for popular document-centric markup languages such as DocBook, the TEI Guidelines, OpenDocument, and EPUB.

RELAX NG shares with W3C XML Schema many features that set both apart from traditional DTDs: data typing, regular expression support, namespace support, and the ability to reference complex definitions.

Filename extensions

By informal convention, RELAX NG schemas in the regular syntax are typically named with the filename extension ".rng". For schemas in the compact syntax, the extension ".rnc" is used.

Determinism

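RELAX NG schemas are not required to be "deterministic" or "unambiguous". By contrast, DTD content models must be deterministic, and W3C XML Schema imposes the similar Unique Particle Attribution constraint.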
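
As a rough illustration of why a validator does not need determinism, the Python sketch below matches a sequence of child element names against the pattern (a, b) | (a, c), which a DTD would reject as non-deterministic because the first a does not reveal which branch applies. It uses pattern derivatives, one known technique for validating this kind of pattern; the class and function names are hypothetical, and elements are reduced to bare names with no content or attributes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Empty:
    """Matches the empty sequence."""


@dataclass(frozen=True)
class NotAllowed:
    """Matches nothing."""


@dataclass(frozen=True)
class Elem:
    name: str  # one child element with this name


@dataclass(frozen=True)
class Choice:
    left: object
    right: object


@dataclass(frozen=True)
class Group:
    first: object
    second: object


def nullable(p) -> bool:
    """True if the pattern matches the empty sequence."""
    if isinstance(p, Empty):
        return True
    if isinstance(p, Choice):
        return nullable(p.left) or nullable(p.right)
    if isinstance(p, Group):
        return nullable(p.first) and nullable(p.second)
    return False


def deriv(p, name: str):
    """Pattern that remains after consuming one element called name."""
    if isinstance(p, Elem):
        return Empty() if p.name == name else NotAllowed()
    if isinstance(p, Choice):
        # Both branches are kept alive, so no lookahead is needed.
        return Choice(deriv(p.left, name), deriv(p.right, name))
    if isinstance(p, Group):
        head = Group(deriv(p.first, name), p.second)
        return Choice(head, deriv(p.second, name)) if nullable(p.first) else head
    return NotAllowed()


def matches(p, names) -> bool:
    for name in names:
        p = deriv(p, name)
    return nullable(p)


AMBIGUOUS = Choice(Group(Elem("a"), Elem("b")), Group(Elem("a"), Elem("c")))
```

Here matches(AMBIGUOUS, ["a", "c"]) and matches(AMBIGUOUS, ["a", "b"]) both succeed, while matches(AMBIGUOUS, ["a"]) fails, because every alternative is tracked until the input rules it out.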

Converting Relax NG to DTD

RELAX NG schemas can be converted to DTDs with Trang. Note that Trang is unable to convert the OASIS DITA 1.3 schema to DTDs, failing with messages like:

 sorry, combining definitions with combine="choice" is not supported

References

  1. RELAX NG Compact Syntax
  2. James Clark. "TREX - Tree Regular Expressions for XML". Quote: "TREX has been merged with RELAX to create RELAX NG." Retrieved 2009-12-28.
  3. Murata Makoto (2002-04-03). "RELAX (Regular Language description for XML)". Quote: "RELAX NG of OASIS. It is a schema language created by unifying RELAX Core and TREX." Retrieved 2009-12-28.
  4. "TREX and RELAX Unified as RELAX NG, a Lightweight XML Language Validation Specification". Cover Pages. 2001-06-05. Retrieved 2009-12-28.
  5. RELAX NG Specification
  6. RELAX NG Technical Committee
  7. ISO. "ISO/IEC 19757-2:2003 - Information technology -- Document Schema Definition Language (DSDL) -- Part 2: Regular-grammar-based validation -- RELAX NG". Retrieved 2009-12-28.

Comparison of layout engines (XML)

The following tables compare XML compatibility and support for a number of layout engines.

DocBook

DocBook is a semantic markup language for technical documentation. It was originally intended for writing technical documents related to computer hardware and software, but it can be used for any other sort of documentation. As a semantic language, DocBook enables its users to create document content in a presentation-neutral form that captures the logical structure of the content; that content can then be published in a variety of formats, including HTML, XHTML, EPUB, PDF, man pages, Web help and HTML Help, without requiring users to make any changes to the source. In other words, a document written in DocBook format is easily portable to other formats: the content is written once using XML tags and does not need to be reformatted for each output.

Document Definition Markup Language

Document Definition Markup Language (DDML) is an XML schema language proposed in 1999 by various contributors from the xml-dev electronic mailing list. It was published only as a W3C Note, not a Recommendation, and never found favor with developers.

DDML began as XSchema, a reformulation of XML DTDs as full XML documents, so that elements and attributes, rather than declarations, could be used to describe a schema. As development continued, the name was changed to DDML, reflecting a shift away from the goal of replicating all DTD functionality, in order to concentrate on providing a robust framework for describing basic element/attribute hierarchy. DDML offered no datatypes or functionality beyond what DTDs already provided, so there was not much advantage to using DDML instead of DTDs. DDML did, however, inform the development of the next generation of XML-based schema languages, including the more successful XML Schema and RELAX NG.

Document Schema Definition Languages

Document Schema Definition Languages (DSDL) is a framework within which multiple validation tasks of different types can be applied to an XML document in order to achieve more complete validation results than just the application of a single technology.

It is specified as a multi-part ISO/IEC Standard, ISO/IEC 19757. It was developed by ISO/IEC JTC1/SC34 (ISO/IEC Joint Technical Committee 1, Subcommittee 34 - Document description and processing languages). DSDL defines a modular set of specifications for describing the document structures, data types, and data relationships in structured information resources.

Part 2: Regular-grammar-based validation – RELAX NG

Part 3: Rule-based validation – Schematron

Part 4: Namespace-based Validation Dispatching Language (NVDL)

Part 5: Extensible Datatypes

Part 7: Character Repertoire Description Language (CREPDL)

Part 8: Document Semantics Renaming Language (DSRL)

Part 9: Namespace and datatype declaration in Document Type Definitions (DTDs) (Datatype- and namespace-aware DTDs)

Part 11: Schema Association

Document type definition

A document type definition (DTD) is a set of markup declarations that define a document type for an SGML-family markup language (SGML, XML, HTML).

A DTD defines the valid building blocks of an XML document. It defines the document structure with a list of validated elements and attributes. A DTD can be declared inline inside an XML document, or as an external reference. XML uses a subset of the SGML DTD syntax.

As of 2009, newer XML namespace-aware schema languages (such as W3C XML Schema and ISO RELAX NG) have largely superseded DTDs. A namespace-aware version of DTDs is being developed as Part 9 of ISO DSDL. DTDs persist in applications that need special publishing characters, such as the XML and HTML Character Entity References, which derive from larger sets defined as part of the ISO SGML standard effort.

Encoded Archival Context

Encoded Archival Context - Corporate bodies, Persons and Families (EAC-CPF) is an XML standard for encoding information about the creators of archival materials -- i.e., a corporate body, person or family -- including their relationships to (a) resources (books, collections, papers, etc.) and (b) other corporate bodies, persons and families. The goal is to provide contextual information regarding the circumstances of record creation and use. EAC-CPF can be used in conjunction with Encoded Archival Description (EAD) for enhancement of EAD's capabilities in encoding finding aids, but can also be used in conjunction with other standards or for standalone authority file encoding.

EAC-CPF is defined in a document type definition (DTD) as well as in an XML Schema and a Relax NG schema. EAC-CPF elements reflect the ISAAR(CPF) standard and the ISAD(G), two standards managed by the International Council on Archives.

EAC-CPF has been and is being tested in various institutions, such as in the European Union project LEAF ("Linking and Exploring Authority Files"), funded between 2001 and 2004. The Ad Hoc EAC-CPF Working Group's early drafts were published in 2004, and the working group released a draft for public comment in August 2009, prior to publication of the completed standard in 2010.

James Clark (programmer)

James Clark (born 23 February 1964) is the author of groff and expat, and has done much work with open-source software and XML.

Born in London and educated at Charterhouse and Merton College, Oxford, Clark has lived in Bangkok, Thailand since 1995, and is now a permanent resident. He owns a company called Thai Open Source Software Center, which provides him a legal framework for his open-source activities.

For the GNU project, he wrote groff, as well as an XML editing mode for GNU Emacs.

Makoto Murata

Makoto Murata (村田 真, Murata Makoto, born 1960) is a Japanese computer scientist, Ph.D. in Engineering, and Project Professor at Keio University.

He participated in the W3C (World Wide Web Consortium) XML Working Group.

The Working Group designed XML 1.0, a markup language specification.

Murata and James Clark designed RELAX NG, an XML schema language.

Murata is the convener of ISO/IEC JTC 1/SC 34 WG 4, responsible for Office Open XML maintenance.

Medieval Nordic Text Archive

Medieval Nordic Text Archive (Menota) is a network of leading Nordic archives, libraries and research departments working with medieval texts and manuscript facsimiles. The aim of Menota is to preserve and publish medieval texts in digital form and to adapt and develop encoding standards necessary for this work.

Menota was established in 2001 and at the time of writing (June 2015) it offers 20 texts with a total of approx. 1 million words. The texts are mostly rendered on the diplomatic level (i.e. following the manuscripts in most matters of orthography), while some are also rendered on a very close level, the facsimile level (rendering abbreviations as such and some allographic variation), and others also on a normalised level, in which the orthography corresponds to the one found in grammars and dictionaries and text series like Íslenzk fornrit.

In addition to the archive of texts, Menota also offers a handbook in XML text encoding, The Menota handbook. This is based on the Guidelines of the Text Encoding Initiative, and discusses a number of encoding questions relating to vernacular manuscripts. The handbook is published digitally on the Menota site, and it offers a full TEI-style Document Type Definition and a Relax NG schema for anyone who wants to encode Medieval Nordic manuscripts.

Menota welcomes transcriptions of all kinds of Medieval Nordic primary sources, i.e. directly from the manuscript itself or a good facsimile of it, as long as the transcription has been proofread to an acceptable level and it is delivered in a valid XML file according to the schema available on the Menota site.

Menota follows the recommendations of the Medieval Unicode Font Initiative with respect to the encoding and display of special characters. On the normalised level of text rendering, all necessary characters will be found in the official part of the Unicode Standard, but some characters on a diplomatic level and several on a facsimile level can only be displayed by using characters in the Private Use Area of Unicode. MUFI offers several free or low-cost fonts for this use.

Office Open XML

Office Open XML (also informally known as OOXML or Microsoft Open XML (MOX)) is a zipped, XML-based file format developed by Microsoft for representing spreadsheets, charts, presentations and word processing documents. The format was initially standardized by Ecma (as ECMA-376), and by the ISO and IEC (as ISO/IEC 29500) in later versions.

Microsoft Office 2010 provides read support for ECMA-376, read/write support for ISO/IEC 29500 Transitional, and read support for ISO/IEC 29500 Strict. Microsoft Office 2013 and Microsoft Office 2016 additionally support both reading and writing of ISO/IEC 29500 Strict. Although Office 2013 and later have full read/write support for ISO/IEC 29500 Strict, Microsoft has not yet made the strict non-transitional, or original, standard the default file format because of remaining interoperability concerns.

Oxygen XML Editor

The Oxygen XML Editor is a multi-platform XML editor, XSLT/XQuery debugger and profiler with Unicode support. It is a Java application, so it can run in Windows, Mac OS X, and Linux. It also has a version that can run as an Eclipse plugin.

RNV

RNV may refer to:

Armavia, airline company in 1996–2013, operating as Armenia's flag carrier (ICAO airline designator)

Radio Nacional de Venezuela, a government radio station in Venezuela that began broadcasting in 1936

Relax NG Validator (RNV), an implementation of a Relax NG Compact Syntax validator in ANSI C. The command-line utility uses Expat and is distributed under the BSD license.

Regular Language description for XML

REgular LAnguage description for XML (RELAX) is a specification for describing XML-based languages.

A description written in RELAX is called a RELAX grammar.

RELAX Core was approved as ISO/IEC Technical Report 22250-1 in 2002 (ISO/IEC TR 22250-1:2002). It was developed by ISO/IEC JTC1/SC34 (ISO/IEC Joint Technical Committee 1, Subcommittee 34 - Document description and processing languages). RELAX was designed by Murata Makoto.

In 2001, the XML schema language RELAX NG was created by unifying RELAX Core and James Clark's TREX. It was published as ISO/IEC 19757-2 in 2003.

Text Encoding Initiative

The Text Encoding Initiative (TEI) is a text-centric community of practice in the academic field of digital humanities, operating continuously since the 1980s. The community currently runs a mailing list, meetings and conference series, and maintains an eponymous technical standard, a journal, a wiki, a GitHub repository and a toolchain.

XHTML Modularization

XHTML modularization is a methodology for producing modularized markup languages in a number of different schema languages (currently DTDs, XML Schema and Relax NG) so that the modules can easily be plugged together to create markup languages. Although it was originally designed to help manage the development of various XHTML Profiles, such as XHTML 1.1, XHTML Basic for mobile devices, and XHTML Print for sending to printers, the methodology is independent of XHTML, and has been used for the definition of other markup languages as well, such as SVG and MathML.

XML

Extensible Markup Language (XML) is a markup language that defines a set of rules for encoding documents in a format that is both human-readable and machine-readable. The W3C's XML 1.0 Specification and several other related specifications, all of them free open standards, define XML. The design goals of XML emphasize simplicity, generality, and usability across the Internet. It is a textual data format with strong support via Unicode for different human languages. Although the design of XML focuses on documents, the language is widely used for the representation of arbitrary data structures such as those used in web services.

Several schema systems exist to aid in the definition of XML-based languages, while programmers have developed many application programming interfaces (APIs) to aid the processing of XML data.

XML Information Set

XML Information Set (XML Infoset) is a W3C specification describing an abstract data model of an XML document in terms of a set of information items. The definitions in the XML Information Set specification are meant to be used in other specifications that need to refer to the information in a well-formed XML document.

An XML document has an information set if it is well-formed and satisfies the namespace constraints. There is no requirement for an XML document to be valid in order to have an information set.

An information set can contain up to eleven different types of information items:

The Document Information Item (always present)

Element Information Items

Attribute Information Items

Processing Instruction Information Items

Unexpanded Entity Reference Information Items

Character Information Items

Comment Information Items

The Document Type Declaration Information Item

Unparsed Entity Information Items

Notation Information Items

Namespace Information Items

XML was initially developed without a formal definition of its infoset. This was only formalised by later work beginning in 1999, first published as a separate W3C Working Draft at the end of December that year.

The Second Edition of the Infoset recommendation was adopted on 4 February 2004. If a 2.0 version of the XML standard is ever published, it is likely that this would absorb the Infoset recommendation as an integral part of that standard.

XML schema

An XML schema is a description of a type of XML document, typically expressed in terms of constraints on the structure and content of documents of that type, above and beyond the basic syntactical constraints imposed by XML itself. These constraints are generally expressed using some combination of grammatical rules governing the order of elements, Boolean predicates that the content must satisfy, data types governing the content of elements and attributes, and more specialized rules such as uniqueness and referential integrity constraints.

There are languages developed specifically to express XML schemas. The document type definition (DTD) language, which is native to the XML specification, is a schema language that is of relatively limited capability, but that also has other uses in XML aside from the expression of schemas. Two more expressive XML schema languages in widespread use are XML Schema (with a capital S) and RELAX NG.

The mechanism for associating an XML document with a schema varies according to the schema language. The association may be achieved via markup within the XML document itself, or via some external means.

This page is based on a Wikipedia article written by authors (here).
Text is available under the CC BY-SA 3.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.