Topic map

A topic map is a standard for the representation and interchange of knowledge, with an emphasis on the findability of information. Topic maps were originally developed in the late 1990s as a way to represent back-of-the-book index structures so that multiple indexes from different sources could be merged. However, the developers quickly realized that with a little additional generalization, they could create a meta-model with potentially far wider application. The ISO standard is formally known as ISO/IEC 13250:2003.

A topic map represents information using

  • topics, representing any concept, from people, countries, and organizations to software modules, individual files, and events,
  • associations, representing hypergraph relationships between topics, and
  • occurrences, representing information resources relevant to a particular topic.
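The three constructs above can be pictured as plain data structures. The following is only a minimal sketch: the class names, fields, and example identifiers are illustrative assumptions and are not taken from ISO/IEC 13250.

```python
from dataclasses import dataclass, field


@dataclass
class Topic:
    """A subject of discourse, e.g. a person, a file, or an event."""
    id: str
    names: list[str] = field(default_factory=list)


@dataclass
class Occurrence:
    """Links a topic to an information resource relevant to it."""
    topic_id: str
    resource: str  # typically a URI (hypothetical example below)


@dataclass
class Association:
    """An n-ary relationship: maps role names to the topics playing them."""
    type: str
    roles: dict[str, str]  # role name -> topic id


puccini = Topic("puccini", ["Giacomo Puccini"])
tosca = Topic("tosca", ["Tosca"])
rome = Topic("rome", ["Rome"])

# A single association may connect more than two topics (a hyperedge).
premiere = Association(
    "premiere",
    {"work": tosca.id, "composer": puccini.id, "place": rome.id},
)
bio = Occurrence(puccini.id, "https://example.org/puccini/biography")
```

Keeping roles as a mapping makes the hypergraph nature of associations visible: the relationship has three participants, each in a named role, rather than a single source and target.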

Topic maps are similar to concept maps and mind maps in many respects, though only topic maps are ISO standards. Topic maps are a form of semantic web technology similar to RDF.


Ontology and merging

Topics, associations, and occurrences can all be typed, where the types must be defined by the creator or creators of the topic map(s). The set of allowed type definitions is known as the ontology of the topic map.

Topic maps explicitly support merging the identity of multiple topics or topic maps. Furthermore, because ontologies are topic maps themselves, they can also be merged, thus allowing for the automated integration of information from diverse sources into a coherent new topic map. Features such as subject identifiers (URIs given to topics) and PSIs (published subject indicators) are used to control merging between differing taxonomies. Scoping on names provides a way to organise the various names given to a particular topic by different sources.
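Identity-based merging can be sketched as follows. This is a simplified illustration, assuming that two topics are the same subject whenever they share at least one subject identifier; the class, function name, and URIs are hypothetical, and the merging rules of the actual standard cover more cases than this.

```python
from dataclasses import dataclass, field


@dataclass
class Topic:
    subject_identifiers: set[str] = field(default_factory=set)
    names: set[str] = field(default_factory=set)


def merge_topics(topics):
    """Merge topics that share at least one subject identifier."""
    merged = []
    for topic in topics:
        combined = Topic(set(topic.subject_identifiers), set(topic.names))
        remaining = []
        for other in merged:
            if other.subject_identifiers & combined.subject_identifiers:
                # Same subject: pool identifiers and names.
                combined.subject_identifiers |= other.subject_identifiers
                combined.names |= other.names
            else:
                remaining.append(other)
        merged = remaining + [combined]
    return merged


# Two sources describe the same composer under different names.
a = Topic({"http://psi.example.org/puccini"}, {"Puccini"})
b = Topic(
    {"http://psi.example.org/puccini", "http://example.com/id/gp"},
    {"Giacomo Puccini"},
)
c = Topic({"http://psi.example.org/verdi"}, {"Verdi"})

result = merge_topics([a, b, c])
```

Here `a` and `b` collapse into one topic carrying both names, while `c` stays separate because it shares no identifier with the others.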

Current standard

The work standardizing topic maps (ISO/IEC 13250) took place under the umbrella of the ISO/IEC JTC1/SC34/WG3 committee (ISO/IEC Joint Technical Committee 1, Subcommittee 34, Working Group 3 – Document description and processing languages – Information Association).[1][2][3] However, WG3 was disbanded and maintenance of ISO/IEC 13250 was assigned to WG8.

The topic maps (ISO/IEC 13250) reference model and data model standards are defined independently of any specific serialization or syntax.

  • TMRM Topic Maps – Reference Model
  • TMDM Topic Maps – Data Model

Graphical notation


In addition to GTM, listed below, there are other ways of notating topic maps graphically. One recently developed example is Topic Maps Martian Notation (TMMN), a simple graphical notation used to explain the Topic Maps data model and to map out both ontologies and representative instance data.[4] It is designed for use on whiteboard or paper, as well as within any diagram-based software, including everyday presentation tools such as PowerPoint. TMMN uses only a very small number of symbols ("blob", "label", "line", "dotted line", and "arrow") to represent the relationships and basic elements of the topic maps model: topics, names, associations (and roles), scope, and occurrences (including subject identifiers and subject locators). The "Martian" refers to the archetypal "Martian Scientist", namely, the ability to communicate knowledge across linguistic and cultural barriers, known and unknown. It was developed as part of the musicDNA project.[5] Advanced Topic Maps Martian Notation is currently under development by the musicDNA community and includes shorthand notation for various types of whole-part relationships.

Data format

The specification is summarized in the abstract as follows: "This specification provides a model and grammar for representing the structure of information resources used to define topics, and the associations (relationships) between topics. Names, resources, and relationships are said to be characteristics of abstract subjects, which are called topics. Topics have their characteristics within scopes: i.e. the limited contexts within which the names and resources are regarded as their name, resource, and relationship characteristics. One or more interrelated documents employing this grammar is called a topic map."

XML serialization formats

  • In 2000, Topic Maps was defined in an XML syntax, XTM. This is now commonly known as "XTM 1.0" and is still in fairly common use.
  • The ISO standards committee published an updated XML syntax, XTM 2.0, in 2006; it is increasingly in use today.

Note that XTM 1.0 predates and therefore is not compatible with the more recent versions of the (ISO/IEC 13250) standard.
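To give a feel for the XML syntax, the sketch below builds a one-topic document with Python's standard `xml.etree.ElementTree`. The namespace and element names (`topicMap`, `topic`, `subjectIdentifier`, `name`, `value`) follow a common reading of XTM 2.0 and should be checked against the specification before use; the topic id and identifier URI are hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumed XTM 2.0 namespace; verify against the published syntax.
XTM_NS = "http://www.topicmaps.org/xtm/"
ET.register_namespace("", XTM_NS)


def q(tag):
    """Qualify a tag name with the XTM namespace."""
    return f"{{{XTM_NS}}}{tag}"


topic_map = ET.Element(q("topicMap"), {"version": "2.0"})
topic = ET.SubElement(topic_map, q("topic"), {"id": "puccini"})
ET.SubElement(
    topic, q("subjectIdentifier"), {"href": "http://psi.example.org/puccini"}
)
name = ET.SubElement(topic, q("name"))
ET.SubElement(name, q("value")).text = "Giacomo Puccini"

xtm_text = ET.tostring(topic_map, encoding="unicode")
print(xtm_text)
```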

Other formats

Other proposed or standardized serialization formats include:

  • CXTM Canonical XML Topic Maps format (canonicalization of topic maps)
  • CTM – a Compact Topic Maps Notation (not based on XML)
  • GTM – a Graphical Topic Maps Notation

The above standards are all recently proposed or defined as part of ISO/IEC 13250. As described below, there are also other serialization formats, such as LTM and AsTMa=, that have not been put forward as standards.

Linear topic map notation (LTM) serves as a kind of shorthand for writing topic maps in plain text editors. This is useful for writing short personal topic maps or exchanging partial topic maps by email. The format can be converted to XTM.
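The flavour of such a plain-text shorthand can be sketched with a few helper functions that emit LTM-like lines. The bracket, brace, and parenthesis forms below reflect a recollection of LTM and are an assumption, not a quotation of its specification; the function names and example identifiers are hypothetical.

```python
def ltm_topic(topic_id, type_id, name):
    """Topic line, assumed form: [id : type = "name"]"""
    return f'[{topic_id} : {type_id} = "{name}"]'


def ltm_occurrence(topic_id, type_id, locator):
    """Occurrence line, assumed form: {topic, type, "locator"}"""
    return f'{{{topic_id}, {type_id}, "{locator}"}}'


def ltm_association(type_id, players):
    """Association line, assumed form: type(player : role, ...)"""
    roles = ", ".join(f"{player} : {role}" for player, role in players)
    return f"{type_id}({roles})"


lines = [
    ltm_topic("puccini", "composer", "Giacomo Puccini"),
    ltm_occurrence("puccini", "biography", "https://example.org/puccini"),
    ltm_association("composed-by", [("tosca", "opera"), ("puccini", "composer")]),
]
print("\n".join(lines))
```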

Another format, AsTMa=, serves a similar purpose. When writing topic maps manually it is much more compact, and it can also be converted to XTM. Alternatively, it can be used directly with the Perl module TM (which also supports LTM).

The data formats of XTM and LTM are similar to the W3C standards for RDF/XML or the older N3 notation.[6]

Related standards

Topic Maps API

A de facto API standard called Common Topic Maps Application Programming Interface (TMAPI) was published in April 2004 and is supported by many Topic Maps implementations or vendors:

  • TMAPI – Common Topic Maps Application Programming Interface
  • TMAPI 2.0 Topic Maps Application Programming Interface (v2.0)

Query standard

In normal use it is often desirable to have a way to arbitrarily query the data within a particular Topic Maps store. Many implementations provide a syntax by which this can be achieved (somewhat like 'SQL for Topic Maps'), but the syntax tends to vary a lot between different implementations. With this in mind, work has gone into defining a standardized query language for topic maps, the Topic Maps Query Language (TMQL).

Constraint standards

It can also be desirable to define a set of constraints that can be used to guarantee or check the semantic validity of topic maps data for a particular domain (somewhat like database constraints for topic maps). Constraints can be used to define things like 'every document needs an author' or 'all managers must be human'. There are often implementation-specific ways of achieving these goals, but work has gone into defining a standardized constraint language, the Topic Maps Constraint Language (TMCL).

TMCL is functionally similar to RDF Schema with Web Ontology Language (OWL).[6]
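A constraint such as 'every document needs an author' amounts to a check over the associations in a topic map. The sketch below expresses that one rule in plain Python rather than in TMCL syntax; the data layout, role names, and function name are illustrative assumptions.

```python
def missing_authors(documents, authorship):
    """Return documents that play no 'work' role in any authorship association."""
    authored = {assoc["work"] for assoc in authorship}
    return [doc for doc in documents if doc not in authored]


documents = ["report-1", "report-2"]
authorship = [{"work": "report-1", "author": "alice"}]

violations = missing_authors(documents, authorship)
print(violations)  # ['report-2']
```

A constraint language lets such rules be declared once for a domain instead of being hand-coded per application.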

Earlier standards

The "Topic Maps" concept has existed for a long time. The HyTime standard dates back to at least 1992. Earlier versions of ISO 13250 (than the current revision) also exist. More information about such standards can be found at the ISO Topic Maps site.

RDF relationship

Some work has been undertaken to provide interoperability between the W3C's RDF/OWL/SPARQL family of semantic web standards and the ISO's family of Topic Maps standards, though the two have slightly different goals.

The semantic expressive power of Topic Maps is, in many ways, equivalent to that of RDF, but the major differences are that Topic Maps (i) provide a higher level of semantic abstraction (providing a template of topics, associations and occurrences, while RDF only provides a template of two arguments linked by one relationship) and (hence) (ii) allow n-ary relationships (hypergraphs) between any number of nodes, while RDF is limited to triples.
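The difference between an n-ary association and triples can be made concrete. One common workaround in a triple-based model is to introduce an intermediate node for the relationship and attach each participant to it; the sketch below shows that idea with plain tuples. The function name and identifiers are hypothetical, and this is not a standardized Topic Maps to RDF mapping.

```python
def association_to_triples(assoc_id, assoc_type, roles):
    """Represent one n-ary association as triples about an intermediate node."""
    triples = [(assoc_id, "type", assoc_type)]
    for role, player in roles.items():
        triples.append((assoc_id, role, player))
    return triples


triples = association_to_triples(
    "a1",
    "premiere",
    {"work": "tosca", "composer": "puccini", "place": "rome"},
)
for triple in triples:
    print(triple)
```

One three-role association becomes four triples, which illustrates why the association construct is described as a higher level of abstraction.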

References


  1. ^ ISO JTC1/SC34. "JTC 1/SC 34 – Document Description and Processing Languages". Archived from the original on 6 May 2014. Retrieved 25 December 2009.
  2. ^ "Home of SC34/WG3 – Information Association". 3 June 2008. Retrieved 2009-12-26.
  3. ^ ISO. "JTC 1/SC 34 – Document description and processing languages". ISO. Retrieved 2009-12-25.
  4. ^ Topic Maps Martian Notation tutorial
  5. ^ musicDNA
  6. ^ a b Lars Marius Garshol (2003). "Living With Topic Maps and RDF". Retrieved 2014-02-21.

Further reading

  • Lutz Maicher and Jack Park: Charting the Topic Maps Research and Applications Landscape, Springer, ISBN 3-540-32527-1
  • Jack Park and Sam Hunting: XML Topic Maps: Creating and Using Topic Maps for the Web, Addison-Wesley, ISBN 0-201-74960-2 (in bibMap)
  • Passin, Thomas B. (2004). Explorer's Guide to the Semantic Web. Manning Publications. ISBN 1-932394-20-6.

External links

Acct (protocol)

The acct URI scheme is a proposed internet standard published by the Internet Engineering Task Force, defined by RFC 7565. The purpose of the scheme is to identify, rather than interact with, user accounts hosted by a service provider. This scheme differs from the DNS name which specifies the service provider. The acct URI was intended to be the single URI scheme that would return information about a person (or possibly a thing) that holds an account at a given domain.
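As a rough illustration, an acct URI can be split into its account part and its service-provider host. This is a minimal sketch assuming the general shape `acct:user@host` with percent-encoding allowed in the user part; it does not validate against the full RFC 7565 grammar, and the function name is hypothetical.

```python
from urllib.parse import unquote


def parse_acct(uri):
    """Split an acct URI into (user part, host); raises ValueError if malformed."""
    scheme, sep, rest = uri.partition(":")
    if not sep or scheme.lower() != "acct":
        raise ValueError(f"not an acct URI: {uri!r}")
    # The host follows the last '@'; an '@' inside the user part is percent-encoded.
    userpart, at, host = rest.rpartition("@")
    if not at or not userpart or not host:
        raise ValueError(f"missing user or host: {uri!r}")
    return unquote(userpart), host.lower()


account = parse_acct("acct:juliet%40capulet.example@shoppingsite.example")
print(account)
```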

Concept map

A concept map or conceptual diagram is a diagram that depicts suggested relationships between concepts. It is a graphical tool that instructional designers, engineers, technical writers, and others use to organize and structure knowledge.

A concept map typically represents ideas and information as boxes or circles, which it connects with labeled arrows in a downward-branching hierarchical structure. The relationship between concepts can be articulated in linking phrases such as causes, requires, or contributes to. The technique for visualizing these relationships among different concepts is called concept mapping. Concept maps have been used to define the ontology of computer systems, for example with the object-role modeling or Unified Modeling Language formalism.

Conceptual graph

A conceptual graph (CG) is a formalism for knowledge representation. In the first published paper on CGs, John F. Sowa (Sowa 1976) used them to represent the conceptual schemas used in database systems. The first book on CGs (Sowa 1984) applied them to a wide range of topics in artificial intelligence, computer science, and cognitive science.

Decision tree

A decision tree is a decision support tool that uses a tree-like model of decisions and their possible consequences, including chance event outcomes, resource costs, and utility. It is one way to display an algorithm that only contains conditional control statements.

Decision trees are commonly used in operations research, specifically in decision analysis, to help identify a strategy most likely to reach a goal, but are also a popular tool in machine learning.

Diagrammatic reasoning

Diagrammatic reasoning is reasoning by means of visual representations. The study of diagrammatic reasoning is about the understanding of concepts and ideas, visualized with the use of diagrams and imagery instead of by linguistic or algebraic means.

Graphic communication

Graphic communication, as the name suggests, is communication using graphic elements. These elements include symbols such as glyphs and icons, images such as drawings and photographs, and can include the passive contributions of substrate, color and surroundings. It is the process of creating, producing, and distributing material incorporating words and images to convey data, concepts, and emotions. The field of graphic communications encompasses all phases of the graphic communications processes from origination of the idea (design, layout, and typography) through reproduction, finishing and distribution of two- or three-dimensional products or electronic transmission.

Internationalized Resource Identifier

The Internationalized Resource Identifier (IRI) is an internet protocol standard which extends the ASCII character subset of the Uniform Resource Identifier (URI) protocol. It was defined by the Internet Engineering Task Force (IETF) in 2005 as a new internet standard to extend the existing URI scheme. The primary standard is defined by RFC 3987. While URIs are limited to a subset of the ASCII character set, IRIs may contain characters from the Universal Character Set (Unicode/ISO 10646), including Chinese or Japanese kanji, Korean, Cyrillic characters, and so forth.
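The relationship between the two character repertoires can be illustrated by turning a non-ASCII path into an ASCII-only one: each character is encoded as UTF-8 and the resulting bytes are percent-encoded. The sketch below handles only a path component that contains no existing percent-escapes, and leaves host names (which use a different mechanism) aside; the function name is hypothetical.

```python
from urllib.parse import quote, unquote


def iri_path_to_uri_path(path):
    """Percent-encode the UTF-8 bytes of non-ASCII characters in a path."""
    # Assumption: the input has no existing %XX escapes to preserve.
    return quote(path, safe="/")


iri_path = "/wiki/東京"
uri_path = iri_path_to_uri_path(iri_path)
print(uri_path)
```

Decoding the result with `unquote` restores the original path, so the conversion loses no information.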

Issue tree

An issue tree, also called logic tree, is a graphical breakdown of a question that dissects it into its different components vertically and that progresses into details as it reads to the right.

Issue trees are useful in problem solving to identify the root causes of a problem as well as to identify its potential solutions. They also provide a reference point to see how each piece fits into the whole picture of a problem.

According to professor of strategy Arnaud Chevallier, there are two types of issue trees: diagnostic ones and solution ones.

  • Diagnostic trees break down a "why" key question, identifying all the possible root causes for the problem.
  • Solution trees break down a "how" key question, identifying all the possible alternatives to fix the problem.

Four basic rules can help ensure that issue trees are optimal, according to Chevallier:

  • Consistently answer a "why" or a "how" question
  • Progress from the key question to the analysis as it moves to the right
  • Have branches that are mutually exclusive and collectively exhaustive (MECE)
  • Use an insightful breakdown

The requirement for issue trees to be collectively exhaustive implies that divergent thinking is a critical skill.

Mental model

A mental model is an explanation of someone's thought process about how something works in the real world. It is a representation of the surrounding world, the relationships between its various parts and a person's intuitive perception about his or her own acts and their consequences. Mental models can help shape behaviour and set an approach to solving problems (similar to a personal algorithm) and doing tasks.

A mental model is a kind of internal symbol or representation of external reality, hypothesized to play a major role in cognition, reasoning and decision-making. Kenneth Craik suggested in 1943 that the mind constructs "small-scale models" of reality that it uses to anticipate events.

Jay Wright Forrester defined general mental models as:

The image of the world around us, which we carry in our head, is just a model. Nobody in his head imagines all the world, government or country. He has only selected concepts, and relationships between them, and uses those to represent the real system (Forrester, 1971).

In psychology, the term mental models is sometimes used to refer to mental representations or mental simulation generally. At other times it is used to refer to mental models in reasoning, and to the mental model theory of reasoning developed by Philip Johnson-Laird and Ruth M.J. Byrne.

Mind map

A mind map is a diagram used to visually organize information. A mind map is hierarchical and shows relationships among pieces of the whole. It is often created around a single concept, drawn as an image in the center of a blank page, to which associated representations of ideas such as images, words and parts of words are added. Major ideas are connected directly to the central concept, and other ideas branch out from those major ideas.

Mind maps can also be drawn by hand, either as "rough notes" during a lecture, meeting or planning session, for example, or as higher quality pictures when more time is available. Mind maps are considered to be a type of spider diagram. A similar concept in the 1970s was "idea sun bursting".

Morphological analysis (problem-solving)

Morphological analysis or general morphological analysis is a method developed by Fritz Zwicky (1967, 1969) for exploring all the possible solutions to a multi-dimensional, non-quantified complex problem.

Outline (list)

An outline, also called a hierarchical outline, is a list arranged to show hierarchical relationships and is a type of tree structure. An outline is used to present the main points (in sentences) or topics (terms) of a given subject. Each item in an outline may be divided into additional sub-items. If an organizational level in an outline is to be sub-divided, it shall have at least two subcategories, as advised by major style manuals in current use. An outline may be used as a drafting tool of a document, or as a summary of the content of a document or of the knowledge in an entire field. It is not to be confused with the general sense of the term "outline", which is a summary or overview of a subject, presented verbally or written in prose (for example, The Outline of History is not an outline of the type described here). The outlines described in this article are lists, and come in several varieties.

A sentence outline is a tool for composing a document, such as an essay, a paper, a book, or even an encyclopedia. It is a list used to organize the facts or points to be covered, and their order of presentation, by section. Topic outlines list the subtopics of a subject, arranged in levels, and while they can be used to plan a composition, they are most often used as a summary, such as in the form of a table of contents or the topic list in a college course's syllabus.

Outlines are further differentiated by the index prefixing used, or lack thereof. Many outlines include a numerical or alphanumerical prefix preceding each entry in the outline, to provide a specific path for each item, to aid in referring to and discussing the entries listed. An alphanumerical outline uses alternating letters and numbers to identify entries. A decimal outline uses only numbers as prefixes. An outline without prefixes is called a "bare outline".
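A decimal outline's prefixes can be derived mechanically from the nesting of the entries. The following is a minimal sketch, assuming an outline is given as nested `(title, children)` pairs; the function name and sample entries are illustrative.

```python
def decimal_outline(items, prefix=""):
    """Return 'number title' lines for a nested list of (title, children) pairs."""
    lines = []
    for number, (title, children) in enumerate(items, start=1):
        label = f"{prefix}{number}"
        lines.append(f"{label} {title}")
        # Children extend the parent's path: 1 -> 1.1, 1.2, ...
        lines.extend(decimal_outline(children, prefix=f"{label}."))
    return lines


outline = [
    ("Formats", [("XML", []), ("Other", [])]),
    ("Standards", []),
]
numbered = decimal_outline(outline)
print("\n".join(numbered))
```

Each prefix spells out the path from the top level to the entry, which is what makes the entries easy to refer to in discussion.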

Specialized applications of outlines also exist. A reverse outline is a list of sentences or topics that is created from an existing work, as a revision tool; it may show the gaps in the document's coverage so that they may be filled, and may help in rearranging sentences or topics to improve the structure and flow of the work. An integrated outline is a composition tool for writing scholastic works, in which the sources, and the writer's notes from the sources, are integrated into the outline for ease of reference during the writing process.

A software program designed for processing outlines is called an outliner.

Personal knowledge base

A personal knowledge base (PKB) is an electronic tool used to express, capture, and later retrieve the personal knowledge of an individual. It differs from a traditional database in that it contains subjective material particular to the owner, that others may not agree with nor care about. Importantly, a PKB consists primarily of knowledge, rather than information; in other words, it is not a collection of documents or other sources an individual has encountered, but rather an expression of the distilled knowledge the owner has extracted from those sources.

The term personal knowledge base was mentioned as early as the 1980s, but the term came to prominence when it was described at length in publications by computer scientist Stephen Davies and colleagues, who compared PKBs on a number of different dimensions, the most important of which is the data model that each PKB uses to organize knowledge.

Davies and colleagues examined three aspects of the data models of PKBs: their structural framework, which prescribes rules about how knowledge elements can be structured and interrelated (as a tree, graph, tree plus graph, spatially, categorically, or as n-ary links); their knowledge elements, or basic building blocks of information that a user creates and works with, and the level of granularity of those knowledge elements (such as word/concept, phrase/proposition, free text notes, links to information sources, or composite); and their schema, which involves the level of formal semantics introduced into the data model (such as a type system and related schemas, keywords, attribute–value pairs, etc.). Davies and colleagues also differentiated PKBs according to their architecture: file-based, database-based, or client–server systems (including Internet-based systems accessed through desktop computers and/or handheld mobile devices).

Non-electronic personal knowledge bases have probably existed in some form for centuries: Da Vinci's notebooks are a famous example. More commonly, files of index cards (in German, Zettelkasten) and edge-notched cards, and annotated private libraries, have served this function in the pre-electronic age. Undoubtedly the most famous early formulation of an electronic PKB was Vannevar Bush's description of the "memex" in 1945. In a 1962 technical report, human–computer interaction pioneer Douglas Engelbart (who would later become famous for his 1968 "Mother of All Demos" that demonstrated almost all the fundamental elements of modern personal computing) described his use of edge-notched cards to partially model Bush's memex.

Query language

Query languages or data query languages (DQLs) are computer languages used to make queries in databases and information systems.

RDF query language

An RDF query language is a computer language, specifically a query language for databases, able to retrieve and manipulate data stored in Resource Description Framework (RDF) format.

SPARQL has emerged as the standard RDF query language, and in 2008 became a W3C recommendation.

Social Semantic Web

The concept of the Social Semantic Web subsumes developments in which social interactions on the Web lead to the creation of explicit and semantically rich knowledge representations. The Social Semantic Web can be seen as a Web of collective knowledge systems, which are able to provide useful information based on human contributions and which get better as more people participate. The Social Semantic Web combines technologies, strategies and methodologies from the Semantic Web, social software and the Web 2.0.

Uniform Resource Identifier

A Uniform Resource Identifier (URI) is a string of characters that unambiguously identifies a particular resource. To guarantee uniformity, all URIs follow a predefined set of syntax rules, but also maintain extensibility through a separately defined hierarchical naming scheme (e.g. "http://").

Such identification enables interaction with representations of the resource over a network, typically the World Wide Web, using specific protocols. Schemes specifying a concrete syntax and associated protocols define each URI. The most common form of URI is the Uniform Resource Locator (URL), frequently referred to informally as a web address. More rarely seen in usage is the Uniform Resource Name (URN), which was designed to complement URLs by providing a mechanism for the identification of resources in particular namespaces.
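The generic structure of a URI can be inspected with `urllib.parse.urlsplit` from Python's standard library. The URIs below are made-up examples; the second shows that a URN has a scheme and a scheme-specific part but no network location.

```python
from urllib.parse import urlsplit

# A URL: scheme, authority (host and port), path, query, and fragment.
url = urlsplit("https://example.org:8042/over/there?name=ferret#nose")
print(url.scheme, url.hostname, url.port, url.path, url.query, url.fragment)

# A URN: identifies a resource within a namespace, without locating it.
urn = urlsplit("urn:isbn:0201749602")
print(urn.scheme, repr(urn.netloc), urn.path)
```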

Visual analytics

Visual analytics is an outgrowth of the fields of information visualization and scientific visualization that focuses on analytical reasoning facilitated by interactive visual interfaces.

Visual language

The visual language is a system of communication using visual elements. Speech as a means of communication cannot strictly be separated from the whole of human communicative activity, which includes the visual, and the term 'language' in relation to vision is an extension of its use to describe the perception, comprehension and production of visible signs.


This page is based on a Wikipedia article written by authors (here).
Text is available under the CC BY-SA 3.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.