ISO 10303-21

STEP-File is the most widely used data exchange form of STEP. ISO 10303 can represent 3D objects in Computer-aided design (CAD) and related information. Due to its ASCII structure, a STEP-file is easy to read, with typically one instance per line. The format of a STEP-File is defined in ISO 10303-21 Clear Text Encoding of the Exchange Structure.[1]

ISO 10303-21 defines the encoding mechanism for representing data conforming to a particular schema in the EXPRESS data modeling language specified in ISO 10303-11. A STEP-File is also called p21-File and STEP Physical File. The file extensions .stp and .step indicate that the file contains data conforming to STEP Application Protocols while the extension .p21 should be used for all other purposes.[2]

Filename extension: .step, .stp, .p21
Magic number: ISO-10303-21
Developed by: ISO
Initial release: 1994


Some details to take note of:

  • The first edition, ISO 10303-21:1994, had some bugs, which were corrected by a Technical Corrigendum. Therefore, it is recommended that users study the second edition instead (see below).
  • The second edition, ISO 10303-21:2002, included the corrigendum and extensions for several data sections.
  • The third edition, ISO 10303-21:2016, added anchor, reference and signature sections to support external references, support for compressed exchange structures in a ZIP-based archive, digital signatures, and UTF-8 character encoding.[3]
  • Part 21 defined two conformance classes. They differ only in how to encode complex entity instances.
    • Conformance class 1, the one used in practice, always enforces the so-called internal mapping, which is more compact.
    • Conformance class 2, which is not used in practice, always enforces the external mapping. In theory this would allow better AP interoperability, since a postprocessor may know how to handle some supertypes, but may not know the specified subtypes.
  • The 1st edition of part 21 enforces the use of so-called SHORT NAMES, which are optional in the 2nd edition. In practice, however, SHORT NAMES are rarely used.
  • The 2nd edition allows multiple data sections to be used. In practice, however, most implementations only use a single data section (1st edition encoding).

ISO 10303-21 Building blocks


A typical example looks like this:

ISO-10303-21;
HEADER;
FILE_DESCRIPTION(
/* description */ ('A minimal AP214 example with a single part'),
/* implementation_level */ '2;1');
FILE_NAME(
/* name */ 'demo',
/* time_stamp */ '2003-12-27T11:57:53',
/* author */ ('Lothar Klein'),
/* organization */ ('LKSoft'),
/* preprocessor_version */ ' ',
/* originating_system */ 'IDA-STEP',
/* authorization */ ' ');
FILE_SCHEMA (('AUTOMOTIVE_DESIGN { 1 0 10303 214 2 1 1}'));
ENDSEC;
DATA;
#11=PRODUCT_DEFINITION_CONTEXT('part definition',#12,'manufacturing');
#12=APPLICATION_CONTEXT('mechanical design');
#16=PRODUCT('A0001','Test Part 1','',(#18));
#20=ORGANIZATION_ROLE('id owner');
/* ... further entity instances omitted ... */
ENDSEC;
END-ISO-10303-21;

HEADER section

As seen in the above example, the file is split into two sections, HEADER and DATA, following the initial keyword ISO-10303-21;.

The HEADER section has a fixed structure consisting of 3 to 6 groups in the given order. Except for the data fields time_stamp and FILE_SCHEMA all fields may contain empty strings.

  • FILE_DESCRIPTION
    • description
    • implementation_level. The version and conformance option of this file. Possible versions are "1" for the original standard back in 1994, "2" for the technical corrigendum in 1995 and "3" for the second edition. The conformance option is either "1" for internal or "2" for external mapping of complex entity instances. Often, one will find here the value '2;1'. The value '2;2' enforcing external mapping is also possible but only very rarely used. The values '3;1' and '3;2' indicate extended STEP-Files as defined in the second edition with several DATA sections, multiple schemas and FILE_POPULATION support.
  • FILE_NAME
    • name of this exchange structure. It may correspond to the name of the file in a file system or reflect data in this file. There is no strict rule on how to use this field.
    • time_stamp indicates the time when this file was created. The time is given in the international date and time format ISO 8601, e.g. 2003-12-27T11:57:53 for 27 December 2003, about 2 minutes before noon.
    • author, the name and mailing address of the person creating this exchange structure.
    • organization, the organization to which the author belongs.
    • preprocessor_version, the name and version of the system that produced this STEP-file.
    • originating_system, the name and version of the system that originally created the information contained in this STEP-file.
    • authorization, the name and mailing address of the person who authorized this file.
  • FILE_SCHEMA. Specifies one or several EXPRESS schemas governing the information in the data section(s). For first edition files, only one EXPRESS schema together with an optional ASN.1 object identifier of the schema version can be listed here. Second edition files may specify several EXPRESS schemas.
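
The two structured header fields described above, implementation_level and time_stamp, can be interpreted with a few lines of code. The following is a minimal Python sketch, not a conforming Part 21 reader: the function names and the label strings are illustrative assumptions, and only the version values "1" to "3" mentioned in the text are covered.

```python
from datetime import datetime

# Labels taken from the description of implementation_level above.
VERSIONS = {
    "1": "first edition (1994)",
    "2": "technical corrigendum (1995)",
    "3": "second edition",
}
MAPPINGS = {"1": "internal mapping", "2": "external mapping"}


def parse_implementation_level(value):
    """Split a value such as '2;1' into (version label, mapping label)."""
    version, conformance = value.split(";")
    return VERSIONS[version.strip()], MAPPINGS[conformance.strip()]


def parse_time_stamp(value):
    """Parse an ISO 8601 time stamp such as '2003-12-27T11:57:53'."""
    return datetime.fromisoformat(value)


if __name__ == "__main__":
    print(parse_implementation_level("2;1"))
    print(parse_time_stamp("2003-12-27T11:57:53"))
```

Unknown version or conformance values raise a KeyError in this sketch; a real reader would probably report them more gracefully.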

The remaining three header groups, listed below, are only valid in second edition files.

  • FILE_POPULATION, indicating a valid population (set of entity instances) which conforms to an EXPRESS schema. This is done by collecting data from several data sections and referenced instances from other data sections.
    • governing_schema, the EXPRESS schema to which the indicated population belongs and by which it can be validated.
    • determination_method to figure out which instances belong to the population. Three methods are predefined: SECTION_BOUNDARY, INCLUDE_ALL_COMPATIBLE, and INCLUDE_REFERENCED.
    • governed_sections, the data sections whose entity instances fully belong to the population.
    • The concept of FILE_POPULATION is very close to schema_instance of SDAI. Unfortunately, during the standardization process, it was not possible to come to an agreement to merge these concepts. Therefore, JSDAI adds further attributes to FILE_POPULATION as intelligent comments to cover all missing information from schema_instance. This is supported for both import and export.
  • SECTION_LANGUAGE allows assignment of a default language for either all or a specific data section. This is needed for those EXPRESS schemas that do not provide the capability to specify in which language string attributes of entities such as name and description are given.
  • SECTION_CONTEXT provides the capability to specify additional context information for all or single data sections. This can be used e.g. for STEP-APs to indicate which conformance class is covered by a particular data section.

DATA section

The DATA section contains application data according to one specific EXPRESS schema. The encoding of this data follows some simple principles.

  • Instance name: Every entity instance in the exchange structure is given a unique name in the form "#1234". The instance name must consist of a positive number (>0) and is typically smaller than 2^63. The instance name is only valid locally within the STEP-file. If the same content is exported again from a system, the instance names may be different for the same instances. The instance name is also used to reference other entity instances through attribute values or aggregate members. The referenced instance may be defined before or after the current instance.
  • Instances of single entity data types are represented by writing the name of the entity in capital letters and then followed by the attribute values in the defined order within parentheses. See e.g. "#16=PRODUCT(...)" above.
  • Instances of complex entity data types are represented in the STEP file by using either the internal mapping or the external mapping.
    • External mapping always has to be used if the complex entity instance consists of more than one leaf entity. In this case all the single entity instance values are given independently from each other in alphabetical order as defined above, with all entity values grouped together in parentheses.
    • Internal mapping is used by default for conformance option 1 when the complex entity instance consists of only one leaf entity. The encoding is similar to the one of a single entity instance with the additional order given by the subtype definition.
  • Mapping of attribute values:
    • Only explicit attributes get mapped. Inverse, Derived and re-declared attributes are not listed since their values can be deduced from the other ones.
    • Unset attribute values are given as "$".
    • Explicit attributes which got re-declared as derived in a subtype are encoded as "*" in the position of the supertype attribute.
  • Mapping of other data types:
    • Enumeration, boolean and logical values are given in capital letters with a leading and trailing dot such as ".TRUE.".
    • String values are enclosed in apostrophes, as in 'demo'. For characters with a code greater than 126 a special encoding is used. The character sets defined in ISO 8859 and ISO 10646 are supported. Note that text in typical 8-bit (e.g. Western European) or 16-bit (Unicode) character sets cannot be copied directly into STEP-file strings; it has to be encoded in a special way.
    • Integers and real values are written as in typical programming languages.
    • Binary values (bit sequences) are encoded as hexadecimal and surrounded by double quotes, with a leading character indicating the number of unused bits (0, 1, 2, or 3) followed by uppercase hexadecimal encoding of data. It is important to note that the entire binary value is encoded as a single hexadecimal number, with the highest order bits in the first hex character and the lowest order bits in the last one.
    • The elements of aggregates (SET, BAG, LIST, ARRAY) are given in parentheses, separated by ",".
    • Care has to be taken for select data types based on defined data types. Here the name of the defined data type gets mapped too.
  • See also "Mapping of Express to Java" for more details of this.
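
The mapping rules above can be illustrated with a small reader for simple entity instances. This is a hedged Python sketch rather than a complete Part 21 parser: the names Ref, tokenize, parse_values and parse_instance are hypothetical, the doubling of apostrophes inside strings is an assumption not stated in the text above, and typed select values, binary values, comments, the special encoding for characters above code 126, and externally mapped complex instances are deliberately left out.

```python
import re
from collections import namedtuple

Ref = namedtuple("Ref", "id")  # reference to another instance, e.g. #12

INSTANCE_RE = re.compile(r"#(\d+)\s*=\s*([A-Z][A-Z0-9_]*)\s*\((.*)\)\s*;\s*$", re.S)
TOKEN_RE = re.compile(r"""
    \s*(?:
        (?P<string>'(?:[^']|'')*')                       # string in apostrophes
      | (?P<ref>\#\d+)                                   # instance name
      | (?P<enum>\.[A-Z0-9_]+\.)                         # enumeration, boolean, logical
      | (?P<number>[+-]?\d+(?:\.\d*(?:E[+-]?\d+)?)?)     # integer or real
      | (?P<punct>[(),$*])                               # structure, unset, derived
    )""", re.X)


def tokenize(text):
    """Turn the text between the outer parentheses into (kind, text) tokens."""
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if not match:
            if not text[pos:].strip():
                break  # only trailing whitespace left
            raise ValueError(f"unsupported input at offset {pos}")
        tokens.append((match.lastgroup, match.group(match.lastgroup)))
        pos = match.end()
    return tokens


def parse_values(tokens, i=0):
    """Parse values up to the matching ')' or the end; return (values, next index)."""
    values = []
    while i < len(tokens):
        kind, text = tokens[i]
        i += 1
        if kind == "punct" and text == ")":
            return values, i
        if kind == "punct" and text == ",":
            continue
        if kind == "punct" and text == "(":
            nested, i = parse_values(tokens, i)  # aggregate (SET, BAG, LIST, ARRAY)
            values.append(nested)
        elif kind == "string":
            values.append(text[1:-1].replace("''", "'"))
        elif kind == "ref":
            values.append(Ref(int(text[1:])))
        elif kind == "number":
            values.append(float(text) if "." in text or "E" in text else int(text))
        elif text == "$":
            values.append(None)   # unset attribute
        else:
            values.append(text)   # "*" or an enumeration such as ".TRUE."
    return values, i


def parse_instance(line):
    """Return (instance number, entity name, attribute values) for one instance."""
    match = INSTANCE_RE.match(line.strip())
    if not match:
        raise ValueError("not a simple entity instance")
    values, _ = parse_values(tokenize(match.group(3)))
    return int(match.group(1)), match.group(2), values


if __name__ == "__main__":
    print(parse_instance("#16=PRODUCT('A0001','Test Part 1','',(#18));"))
```

Because references may point forward as well as backward, a reader built on this sketch would first collect all instances by number and resolve the Ref values in a second pass.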

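The binary encoding described above (a leading digit for the number of unused bits, followed by one uppercase hexadecimal number) can be sketched as follows. The function names are hypothetical, and the sketch assumes that the unused bits are the leading, highest-order bits of the first hexadecimal digit.

```python
HEX_DIGITS = set("0123456789ABCDEF")


def decode_binary(literal):
    """Decode a binary literal such as '"23B"' into a string of '0'/'1' bits."""
    if len(literal) < 3 or literal[0] != '"' or literal[-1] != '"':
        raise ValueError("binary values are enclosed in double quotes")
    unused = int(literal[1])
    if unused > 3:
        raise ValueError("the leading digit must be 0, 1, 2 or 3")
    hex_digits = literal[2:-1]
    if not set(hex_digits) <= HEX_DIGITS:
        raise ValueError("expected uppercase hexadecimal digits")
    if not hex_digits:
        return ""
    # The whole value is one hexadecimal number, highest-order bits first.
    bits = bin(int(hex_digits, 16))[2:].zfill(4 * len(hex_digits))
    return bits[unused:]  # assumption: the unused bits are the leading ones


def encode_binary(bits):
    """Encode a string of '0'/'1' bits as a binary literal."""
    unused = -len(bits) % 4
    padded = "0" * unused + bits
    hex_digits = "".join(
        f"{int(padded[i:i + 4], 2):X}" for i in range(0, len(padded), 4)
    )
    return f'"{unused}{hex_digits}"'


if __name__ == "__main__":
    print(decode_binary('"23B"'))
    print(encode_binary("111011"))
```

In this sketch the six bits 111011 are padded with two zero bits to 00111011, which is 3B in hexadecimal, so the literal carries the leading digit 2.
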

  1. ISO 10303-21:2002 Industrial automation systems and integration -- Product data representation and exchange -- Part 21: Implementation methods: Clear text encoding of the exchange structure
  2. ISO TC184/SC4 Secretary "Cumulative list of resolutions" Resolution 583 (Stuttgart, Germany, June 2003) "Registration of SC4 MIME-Types"
  3. ISO 10303-21:2016 Industrial automation systems and integration -- Product data representation and exchange -- Part 21: Implementation methods: Clear text encoding of the exchange structure
EXPRESS (data modeling language)

EXPRESS is a standard data modeling language for product data. EXPRESS is formalized in the ISO Standard for the Exchange of Product model STEP (ISO 10303), and standardized as ISO 10303-11.


Gellish

Gellish is a formal language that is natural language independent, although its concepts have 'names' and definitions in various natural languages. Any natural language variant, such as Gellish Formal English, is a controlled natural language. Information and knowledge can be expressed in such a way that it is computer-interpretable, as well as system-independent and natural language independent. Each natural language variant is a structured subset of that natural language and is suitable for information modeling and knowledge representation in that particular language. All expressions, concepts and individual things are represented in Gellish by (numeric) unique identifiers (Gellish UIDs). This enables software to translate expressions from one formal natural language to any other formal natural language.

Gellish is a universal and extendable conceptual data modeling language. Because it includes domain-specific terminology and definitions, it is also a semantic data modelling language and the Gellish modeling methodology is a member of the family of semantic modeling methodologies.

Gellish started out as an engineering modeling language ("Generic Engineering Language", hence the name, "Gellish") and was subsequently developed into a language with general applications.

ISO 10303-22

ISO 10303-22 is a part of the implementation methods of STEP with the official title Standard data access interface or simply SDAI.

SDAI defines an abstract Application Programming Interface (API) to work on application data according to a given data model defined in EXPRESS. SDAI itself is defined independently of any particular programming language. Language bindings exist for:

  • Part 23 - C++ language binding of the standard data access interface
  • Part 24 - C binding of the standard data access interface
  • Part 27 - Java binding to the standard data access interface with Internet/Intranet extensions

The development of language bindings for FORTRAN and the interface definition language (IDL) of CORBA was canceled. The original intent of SDAI and its bindings to programming languages was to achieve portability of software applications from one implementation to another. This was soon abandoned because there were only a few commercial implementations and they differed significantly in their detailed APIs. Today the term SDAI is sometimes used for all kinds of APIs supporting STEP, even if they only partially follow the strict functionality as defined in ISO 10303-22 and its implementation methods, or not at all. Part 35 of STEP (Abstract test methods for SDAI implementations) provides a formal way to prove the conformance of an implementation with SDAI.

The main components of SDAI are:

  • SDAI dictionary schema, a meta-level EXPRESS schema to describe EXPRESS schemas
  • Managing objects
    • SDAI session, to control the whole SDAI environment for a single user/thread, including optional transaction control
    • SDAI repository, the (typically physical) container to store SDAI models and schema instances, e.g. a database
    • SDAI model, a subdivision of an SDAI repository, containing entity instances according to a particular EXPRESS schema
    • Schema instance, a logical grouping of one or several SDAI models, making up a valid population according to a particular EXPRESS schema
  • Operations
    • to deal with the managing objects
    • to create, delete and modify application data (entity instances, attribute values, aggregates and their members)
    • to validate application data according to all the constraints and rules specified in EXPRESS

Industry Foundation Classes

The Industry Foundation Classes (IFC) data model is intended to describe building and construction industry data.

It is a platform neutral, open file format specification that is not controlled by a single vendor or group of vendors. It is an object-based file format with a data model developed by buildingSMART (formerly the International Alliance for Interoperability, IAI) to facilitate interoperability in the architecture, engineering and construction (AEC) industry, and is a commonly used collaboration format in Building information modeling (BIM) based projects. The IFC model specification is open and available. It is registered by ISO and is an official International Standard ISO 16739-1:2018.

Because of its focus on ease of interoperability between software platforms, the Danish government has made the use of IFC format(s) compulsory for publicly aided building projects. The Finnish state-owned facility management company Senate Properties also demands the use of IFC-compatible software and BIM in all its projects. The Norwegian government, health and defense client organisations require the use of IFC BIM in all projects, and many municipalities, private clients, contractors and designers have integrated IFC BIM into their business.

Intermediate Data Format

Intermediate Data Format (IDF) files are used to interoperate between electronic design automation (EDA) software and solid modeling mechanical computer-aided design (CAD) software.

The format was devised by David Kehmeier at the Mentor Graphics Corporation. The EMN file contains the PCB outline, the positions of the parts, the positions of holes and milling, and the keep-out and keep-in regions.

The EMP file contains the outline and height of the parts.

Some CAD software allows the use of a map file to load more detailed part models.

Section header

Section header may refer to:

Section (typography), the beginning of a new section in a document

Radical (Chinese characters)

Executable and Linkable Format#Section Header

See also:

ISO 10303-21#HEADER section

Value change dump#Header section

List of HTTP header fields

Wirth syntax notation

Wirth syntax notation (WSN) is a metasyntax, that is, a formal way to describe formal languages. It was originally proposed by Niklaus Wirth in 1977 as an alternative to Backus–Naur form (BNF). It has several advantages over BNF in that it contains an explicit iteration construct, and it avoids the use of an explicit symbol for the empty string (such as <empty> or ε). WSN has been used in several international standards, starting with ISO 10303-21. It was also used to define the syntax of EXPRESS, the data modelling language of STEP.

