The C0 and C1 control code or control character sets define control codes for use in text by computer systems that use the ISO/IEC 2022 system of specifying control and graphic characters. Most character encodings, in addition to representing printable characters, also have characters such as these that represent additional information about the text, such as the position of a cursor, an instruction to start a new line, or a message that the text has been received.
The C0 set defines codes in the range 00–1F hex and the C1 set defines codes in the range 80–9F hex. The default C0 set was originally defined in ISO 646 (ASCII), while the default C1 set was originally defined in ECMA-48 (harmonized later with ISO 6429). While other C0 and C1 sets are available for specialized applications, they are rarely used.
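The two ranges can be expressed as a small classifier. This is only an illustrative sketch: the function name `control_set` is hypothetical, and DEL (7F hex) is deliberately reported as "other" because it lies outside both ranges.

```python
def control_set(byte: int) -> str:
    """Classify an 8-bit code by the default C0 and C1 ranges."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("expected a value in the range 0x00-0xFF")
    if byte <= 0x1F:
        return "C0"
    if 0x80 <= byte <= 0x9F:
        return "C1"
    # Graphic characters, SP (0x20) and DEL (0x7F) all end up here.
    return "other"


print(control_set(0x0A))  # LF  -> C0
print(control_set(0x9B))  # CSI -> C1
print(control_set(0x7F))  # DEL -> other
```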
ASCII defined 32 control characters, plus a necessary extra one for the all-1 DEL character (needed to punch out all the holes on a paper tape and erase it).
This large number of codes was desirable at the time, as multi-byte controls would require implementation of a state machine in the terminal, which was very difficult with contemporary electronics and mechanical terminals. Since then, only a few of the original controls have maintained their use (the "whitespace" range of BS, TAB, LF, VT, FF, and CR). Others are unused or have acquired different meanings such as NUL being the C string terminator.
ESC is often used but as part of an ESC,'[' CSI pair (see C1 controls). Some transmission protocols such as ANPA-1312 do make extensive use of control characters SOH, STX, ETX and EOT. Other well known but now nearly obsolete ones are BEL, ACK, NAK and SYN.
Modern terminals have a vast number of "controls" accessible using multi-byte ANSI escape sequences starting with ESC and '['.
At the time the 8-bit ISO/IEC 8859 ASCII extensions were being designed it was considered important that stripping the top bit would not turn a printing character into a control (apparently DEL was considered harmless). Therefore these standards reserved the same 32 codes as the C0 set but with the high bit set for additional "C1" controls. Many of these were assigned meanings, mostly as new pairs of controls to replace C0 controls whose meaning had become ambiguous. In reality this "hole" in the printing characters probably caused more problems than it solved.
The standard also specified that all the C1 controls had a 7 bit equivalent consisting of ESC followed by a letter so that they could be achieved with 7 bit communication.
Except for NEL these are almost never used (CSI is often used, but almost always by using the ESC,'[' 7-bit replacement). The C1 characters require 2 bytes to be encoded in UTF-8 (for instance CSI at U+009B is encoded as the bytes 0xC2, 0x9B in UTF-8). Thus the corresponding control functions are more commonly accessed using the equivalent two byte escape sequence intended for use with systems that have only 7 bit bytes.
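The relationship between an 8-bit C1 code and its 7-bit replacement is a fixed offset: the second byte of the escape sequence is the C1 code minus 40 hex. The following sketch illustrates that mapping and the two-byte UTF-8 form; the helper names `c1_to_7bit` and `seven_bit_to_c1` are made up for this example.

```python
ESC = 0x1B


def c1_to_7bit(code: int) -> bytes:
    """Return the two-byte ESC sequence equivalent to a C1 code (0x80-0x9F)."""
    if not 0x80 <= code <= 0x9F:
        raise ValueError("not a C1 control code")
    return bytes([ESC, code - 0x40])


def seven_bit_to_c1(seq: bytes) -> int:
    """Return the C1 code represented by ESC followed by 0x40-0x5F."""
    if len(seq) != 2 or seq[0] != ESC or not 0x40 <= seq[1] <= 0x5F:
        raise ValueError("not a 7-bit C1 escape sequence")
    return seq[1] + 0x40


CSI = 0x9B
NEL = 0x85
print(c1_to_7bit(CSI))           # b'\x1b['
print(c1_to_7bit(NEL))           # b'\x1bE'
print(chr(CSI).encode("utf-8"))  # b'\xc2\x9b'
```

Both the escape sequence and the UTF-8 encoding take two bytes, so the 8-bit form offers no size advantage in UTF-8 text.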
When these codes turn up in modern documents, web pages, e-mail messages, etc., which are ostensibly in an ISO-8859-n encoding, their code positions generally refer instead to the characters at that position in a proprietary, system-specific encoding such as Windows-1252 or the Apple Macintosh (Mac OS Roman) character set that use the C1 codes to instead provide additional graphic characters.
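The difference is easy to observe by decoding the same bytes both ways. The sketch below uses Python's bundled `iso-8859-1` and `cp1252` codecs; the sample byte string is invented for illustration.

```python
import unicodedata

# Bytes 0x93, 0x94 and 0x85 fall in the C1 range of ISO-8859-1.
raw = b"\x93quoted\x94 \x85"

as_latin1 = raw.decode("iso-8859-1")  # C1 control characters
as_cp1252 = raw.decode("cp1252")      # curly quotes and an ellipsis

print(ascii(as_latin1))  # '\x93quoted\x94 \x85'
print(ascii(as_cp1252))  # '\u201cquoted\u201d \u2026'

# Under ISO-8859-1 the first character is a control (category Cc);
# under Windows-1252 it is punctuation (category Pi).
print(unicodedata.category(as_latin1[0]), unicodedata.category(as_cp1252[0]))
```

This is why software that receives text labelled ISO-8859-1 often decodes it as Windows-1252 instead: the C1 reading is rarely what the author intended.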
The official English language names of some C1 codes were revised in the most recent edition of the standard for control codes in general (ISO 6429:1992 or ECMA-48:1991) to be neutral with respect to the graphic characters used with them, and to not assume that, as in the Latin script, lines are written on a page from top to bottom and that characters are written on a line from left to right. The abbreviations used were not changed, as the standard had already specified that those would remain unchanged when the standard is translated to other languages. Where the name has been changed, the original name from which the abbreviation was derived is also given in small type in the tables below.
Unicode sets aside 65 code points for compatibility with ISO/IEC 2022. The Unicode control characters cover U+0000–U+001F (C0 controls), U+007F (delete), and U+0080–U+009F (C1 controls). Unicode only specifies semantics for U+0009–U+000D, U+001C–U+001F, and U+0085. The rest of the control characters are transparent to Unicode and their meanings are left to higher-level protocols.
Unicode has no code points allocated for any controls other than the C0 and C1 ones.
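The figure of 65 can be checked against the Unicode Character Database as shipped with Python. This sketch assumes that the general category "Cc" (Control) is the right property to count, and simply enumerates every code point.

```python
import sys
import unicodedata

# Collect every code point whose general category is Cc (Control).
controls = [
    cp for cp in range(sys.maxunicode + 1)
    if unicodedata.category(chr(cp)) == "Cc"
]

# C0 (U+0000-U+001F), DEL (U+007F) and C1 (U+0080-U+009F).
expected = list(range(0x00, 0x20)) + [0x7F] + list(range(0x80, 0xA0))

print(len(controls))         # 65
print(controls == expected)  # True
```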
These are the standard ASCII control codes, originally defined in ANSI X3.4. If using the ISO/IEC 2022 extension mechanism, they are designated as the active C0 control character set with the octet sequence 0x1B 0x21 0x40 (ESC ! @).
||00||00||NUL||␀||Null||Originally used to allow gaps to be left on paper tape for edits. Later used for padding after a code that might take a terminal some time to process (e.g. a carriage return or line feed on a printing terminal). Now often used as a string terminator, especially in the programming language C.|
||01||01||SOH||␁||Start of Heading||First character of a message header. In Hadoop, it is often used as a field separator.|
||02||02||STX||␂||Start of Text||First character of message text, and may be used to terminate the message heading.|
||03||03||ETX||␃||End of Text||Often used as a "break" character (Ctrl-C) to interrupt or terminate a program or process.|
||04||04||EOT||␄||End of Transmission||Often used on Unix to indicate end-of-file on a terminal.|
||05||05||ENQ||␅||Enquiry||Signal intended to trigger a response at the receiving end, to see if it is still present.|
||06||06||ACK||␆||Acknowledge||Response to an ENQ, or an indication of successful receipt of a message.|
||07||07||BEL||␇||Bell||Originally used to sound a bell on the terminal. Later used for a beep on systems that didn't have a physical bell. May also quickly turn on and off inverse video (a visual bell).|
||08||08||BS||␈||Backspace||Move the cursor one position leftwards. On input, this may delete the character to the left of the cursor. On output, where in early computer technology a character once printed could not be erased, the backspace was sometimes used to generate accented characters in ASCII. For example, à could be produced using the three character sequence |
||09||09||HT||␉||Character Tabulation, Horizontal Tabulation||Position to the next character tab stop.|
||10||0A||LF||␊||Line Feed||On typewriters, printers, and some terminal emulators, moves the cursor down one row without affecting its column position. On Unix, used to mark end-of-line. In DOS, Windows, and various network standards, LF is used following CR as part of the end-of-line mark.|
||11||0B||VT||␋||Line Tabulation, Vertical Tabulation||Position the form at the next line tab stop.|
||12||0C||FF||␌||Form Feed||On printers, load the next page. Treated as whitespace in many programming languages, and may be used to separate logical divisions in code. In some terminal emulators, it clears the screen. It still appears in some common plain text files as a page break character, such as the RFCs published by IETF.|
||13||0D||CR||␍||Carriage Return||Originally used to move the cursor to column zero while staying on the same line. On classic Mac OS (pre-Mac OS X), as well as in earlier systems such as the Apple II and Commodore 64, used to mark end-of-line. In DOS, Windows, and various network standards, it is used preceding LF as part of the end-of-line mark. The Enter or Return key on a keyboard will send this character, but it may be converted to a different end-of-line sequence by a terminal program.|
||14||0E||SO||␎||Shift Out||Switch to an alternative character set.|
||15||0F||SI||␏||Shift In||Return to regular character set after Shift Out.|
||16||10||DLE||␐||Data Link Escape||Cause the following octets to be interpreted as raw data, not as control codes or graphic characters. Returning to normal usage would be implementation dependent.|
||17||11||DC1||␑||Device Control One (XON)||These four control codes are reserved for device control, with the interpretation dependent upon the device to which they were connected. DC1 and DC2 were intended primarily to indicate activating a device while DC3 and DC4 were intended primarily to indicate pausing or turning off a device. DC1 and DC3 (known also as XON and XOFF respectively in this usage) originated as the "start and stop remote paper-tape-reader" functions in ASCII Telex networks. This teleprinter usage became the de facto standard for software flow control.|
||18||12||DC2||␒||Device Control Two|
||19||13||DC3||␓||Device Control Three (XOFF)|
||20||14||DC4||␔||Device Control Four|
||21||15||NAK||␕||Negative Acknowledge||Sent by a station as a negative response to the station with which the connection has been set up. In binary synchronous communication protocol, the NAK is used to indicate that an error was detected in the previously received block and that the receiver is ready to accept retransmission of that block. In multipoint systems, the NAK is used as the not-ready reply to a poll.|
||22||16||SYN||␖||Synchronous Idle||Used in synchronous transmission systems to provide a signal from which synchronous correction may be achieved between data terminal equipment, particularly when no other character is being transmitted.|
||23||17||ETB||␗||End of Transmission Block||Indicates the end of a transmission block of data when data are divided into such blocks for transmission purposes.|
||24||18||CAN||␘||Cancel||Indicates that the data preceding it are in error or are to be disregarded.|
||25||19||EM||␙||End of Medium||Intended as a means of indicating on paper or magnetic tapes that the end of the usable portion of the tape had been reached.|
||26||1A||SUB||␚||Substitute||Originally intended for use as a transmission control character to indicate that garbled or invalid characters had been received. It has often been put to use for other purposes when the in-band signaling of errors it provides is unneeded, especially where robust methods of error detection and correction are used, or where errors are expected to be rare enough to make using the character for other purposes advisable. In DOS, Windows and other CP/M derivatives, it is used to indicate the end of file, both when typing on the terminal, and sometimes in text files stored on disk.|
||27||1B||ESC||␛||Escape||\e||The Esc key on the keyboard will cause this character to be sent on most systems. It can be used in software user interfaces to exit from a screen, menu, or mode, or in device-control protocols (e.g., printers and terminals) to signal that what follows is a special command sequence rather than normal text. In systems based on ISO/IEC 2022, even if another set of C0 control codes is used, this octet is required to always represent the escape character.|
||28||1C||FS||␜||File Separator||Can be used as delimiters to mark fields of data structures. If used for hierarchical levels, US is the lowest level (dividing plain-text data items), while RS, GS, and FS are of increasing level to divide groups made up of items of the level beneath it.|
|While not technically part of the C0 control character range, the following two characters are defined in ISO/IEC 2022 as always being available regardless of which sets of control characters and graphics characters have been registered. They can be thought of as having some characteristics of control characters.|
|32||20||SP||␠||Space||Space is a graphic character. It has a visual representation consisting of the absence of a graphic symbol. It causes the active position to be advanced by one character position. In some applications, Space can be considered a lowest-level "word separator" to be used with the adjacent separator characters.|
||127||7F||DEL||␡||Delete||Not technically part of the C0 control character range, this was originally used to mark deleted characters on paper tape, since any character could be changed to all ones by punching holes everywhere. On VT100 compatible terminals, this is the character generated by the key labelled ⌫, usually called backspace on modern machines, and does not correspond to the PC delete key.|
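The separator hierarchy described in the table can be used as a simple delimiter scheme. The sketch below is one possible arrangement, not a standardized format: it uses only the two lowest levels, US between fields and RS between records, and the function names are hypothetical.

```python
RS = "\x1e"  # Record Separator: divides groups of fields
US = "\x1f"  # Unit Separator: lowest level, divides individual fields


def encode_records(records):
    """Join rows of string fields using US within a row and RS between rows."""
    return RS.join(US.join(fields) for fields in records)


def decode_records(text):
    """Split text produced by encode_records back into rows of fields."""
    return [record.split(US) for record in text.split(RS)]


rows = [["Smith, John", "line one\nline two"], ["Doe, Jane", ""]]
encoded = encode_records(rows)
print(decode_records(encoded) == rows)  # True
```

Because the separators are control characters, fields may contain commas, tabs, and newlines without any quoting, provided the data itself never contains RS or US.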
These are the most common extended control codes. If using the ISO/IEC 2022 extension mechanism, they are designated as the active C1 control character set with the sequence 0x1B 0x22 0x43 (ESC " C). Individual control functions can be accessed with the 7-bit equivalents 0x1B 0x40 through 0x1B 0x5F (ESC @ through ESC _).
|@||128||80||PAD||Padding Character||Not part of ISO/IEC 6429 (ECMA-48). In early drafts of ISO 10646, was used as part of a proposed mechanism to encode non-ASCII characters. This use was removed in later drafts. Is nonetheless used by the internal-use two-byte fixed-length form of the ISO-2022-based Extended Unix Code (EUC) for left-padding single byte characters in code sets 1 and 3, whereas NUL serves the same function for code sets 0 and 2. This is not done in the usual "packed" EUC format.|
|A||129||81||HOP||High Octet Preset||Not part of ISO/IEC 6429 (ECMA-48). In early drafts of ISO 10646, was intended as a means of introducing a sequence of ISO 2022 compliant multiple byte characters with the same first byte without repeating said first byte, thus reducing length; this behaviour was never part of a standard or published implementation. Its name was nonetheless retained as a RFC 1345 standard code-point name.|
|B||130||82||BPH||Break Permitted Here||Follows a graphic character where a line break is permitted. Roughly equivalent to a soft hyphen except that the means for indicating a line break is not necessarily a hyphen. Not part of the first edition of ISO/IEC 6429. See also zero-width space.|
|C||131||83||NBH||No Break Here||Follows the graphic character that is not to be broken. Not part of the first edition of ISO/IEC 6429. See also word joiner.|
|D||132||84||IND||Index||Move the active position one line down, to eliminate ambiguity about the meaning of LF. Deprecated in 1988 and withdrawn in 1992 from ISO/IEC 6429 (1986 and 1991 respectively for ECMA-48).|
|E||133||85||NEL||Next Line||Equivalent to CR+LF. Used to mark end-of-line on some IBM mainframes.|
|F||134||86||SSA||Start of Selected Area||Used by block-oriented terminals.|
|G||135||87||ESA||End of Selected Area|
|H||136||88||HTS||Character Tabulation Set, Horizontal Tabulation Set||Causes a character tabulation stop to be set at the active position.|
|I||137||89||HTJ||Character Tabulation With Justification, Horizontal Tabulation With Justification||Similar to Character Tabulation, except that instead of spaces or lines being placed after the preceding characters until the next tab stop is reached, the spaces or lines are placed preceding the active field so that the preceding graphic character is placed just before the next tab stop.|
|J||138||8A||VTS||Line Tabulation Set, Vertical Tabulation Set||Causes a line tabulation stop to be set at the active position.|
|K||139||8B||PLD||Partial Line Forward, Partial Line Down||Used to produce subscripts and superscripts in ISO/IEC 6429, e.g., in a printer.|
|L||140||8C||PLU||Partial Line Backward, Partial Line Up|
|M||141||8D||RI||Reverse Line Feed|
|N||142||8E||SS2||Single-Shift 2||Next character invokes a graphic character from the G2 or G3 graphic sets respectively. In systems that conform to ISO/IEC 4873 (ECMA-43), even if a C1 set other than the default is used, these two octets may only be used for this purpose.|
|P||144||90||DCS||Device Control String||Followed by a string of printable characters (0x20 through 0x7E) and format effectors (0x08 through 0x0D), terminated by ST (0x9C).|
|Q||145||91||PU1||Private Use 1||Reserved for a function without standardized meaning for private use as required, subject to the prior agreement of the sender and the recipient of the data.|
|R||146||92||PU2||Private Use 2|
|S||147||93||STS||Set Transmit State|
|T||148||94||CCH||Cancel character||Destructive backspace, intended to eliminate ambiguity about meaning of BS.|
|V||150||96||SPA||Start of Protected Area||Used by block-oriented terminals.|
|W||151||97||EPA||End of Protected Area|
|X||152||98||SOS||Start of String||Followed by a control string terminated by ST (0x9C) that may contain any character except SOS or ST. Not part of the first edition of ISO/IEC 6429.|
|Y||153||99||SGCI||Single Graphic Character Introducer||Not part of ISO/IEC 6429. In early drafts of ISO 10646, was used to encode a single multiple-byte character without switching out of a HOP mode. In later drafts, this facility was removed; the name was nonetheless retained as an RFC 1345 standard code-point name.|
|Z||154||9A||SCI||Single Character Introducer||To be followed by a single printable character (0x20 through 0x7E) or format effector (0x08 through 0x0D). The intent was to provide a means by which a control function or a graphic character that would be available regardless of which graphic or control sets were in use could be defined. Definitions of what the following byte would invoke were never implemented in an international standard. Not part of the first edition of ISO/IEC 6429.|
|[||155||9B||CSI||Control Sequence Introducer||Used to introduce control sequences that take parameters.|
|]||157||9D||OSC||Operating System Command||Followed by a string of printable characters (0x20 through 0x7E) and format effectors (0x08 through 0x0D), terminated by ST (0x9C). These three control codes were intended for use to allow in-band signaling of protocol information, but are rarely used for that purpose.|
|_||159||9F||APC||Application Program Command|
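The string-introducing controls in the table (DCS, OSC, APC) share one framing rule: an introducer byte, a payload of printable characters and format effectors, and the terminator ST (0x9C). The following sketch extracts such strings from a raw 8-bit byte stream. It is illustrative only: the function name is hypothetical, it handles just the three introducers listed here, and it does not apply to UTF-8 text or to the 7-bit ESC forms.

```python
import re

INTRODUCERS = {0x90: "DCS", 0x9D: "OSC", 0x9F: "APC"}

# Introducer, then printable characters (0x20-0x7E) and format
# effectors (0x08-0x0D), then the String Terminator ST (0x9C).
CONTROL_STRING = re.compile(rb"([\x90\x9d\x9f])([\x08-\x0d\x20-\x7e]*)\x9c")


def find_control_strings(data: bytes):
    """Return (introducer name, payload) pairs found in an 8-bit stream."""
    return [
        (INTRODUCERS[match.group(1)[0]], match.group(2))
        for match in CONTROL_STRING.finditer(data)
    ]


stream = b"abc\x9d0;title\x9cdef\x90q\x9c"
print(find_control_strings(stream))  # [('OSC', b'0;title'), ('DCS', b'q')]
```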
C0 or C00 has several uses including:
C0, the IATA code for Centralwings airline
C0 and C1 control codes
a CPU power state in the Advanced Configuration and Power Interface
an alternate name for crt0, a library used in the startup of a C program
the differentiability class C0
a C0-semigroup, a strongly continuous one-parameter semigroup
c0, the Banach space of real sequences that converge to zero
a C0 field is an algebraically closed field
in physics, c0, the speed of light in a vacuum
%C0, the URL-encoded version of the character "À"
C0, a note-octave in music
an ISO 216 paper format size
C00, the ICD-10 code for oral cancer
Cancel character
In telecommunication, the term cancel character has the following meanings:
A control character ("CAN", U+0018, or ^X) used to indicate that the data with which it is associated are in error or are to be disregarded.
A control character ("CCH", U+0094) used to erase the previous character. This character was created as an unambiguous alternative to the much more common backspace character ("BS", U+0008), which has a now mostly obsolete alternative function of causing the following character to be superimposed on the preceding one.
DLE
DLE may refer to:
An ISO 269 envelope size
Discoid lupus erythematosus, a chronic skin condition
Data Link Escape, one of the C0 and C1 control codes
DLE (company), a Japanese animation studio
The IATA code for Dole–Jura Airport, France
Dry Low Emission, an emission reduction technology used in gas turbines
ISO-8859-8-I
ISO-8859-8-I is the IANA charset name for the character encoding ISO/IEC 8859-8 used together with the control codes from ISO/IEC 6429 for the C0 (00–1F hex) and C1 (80–9F) parts. The characters are in logical order.
Escape sequences (from ISO/IEC 6429 or ISO/IEC 2022) are not to be interpreted. Most applications only interpret the control codes for LF, CR, and HT. A few applications also interpret VT, FF, and NEL (in C1). Very few applications interpret the other C0 and C1 control codes.
ISO-8859-8 is sometimes in logical order (HTML, XML), and sometimes in visual (left-to-right) order (plain text without any markup).
Logical order for this charset requires bidi processing for display.
ISO/IEC 8859-10
ISO/IEC 8859-10:1998, Information technology — 8-bit single-byte coded graphic character sets — Part 10: Latin alphabet No. 6, is part of the ISO/IEC 8859 series of ASCII-based standard character encodings, first edition published in 1992. It is informally referred to as Latin-6. It was designed to cover the Nordic languages, deemed of more use for them than ISO 8859-4.
ISO-8859-10 is the IANA preferred charset name for this standard when supplemented with the C0 and C1 control codes from ISO/IEC 6429. Microsoft has assigned code page 28600 a.k.a. Windows-28600 to ISO-8859-10 in Windows. IBM has assigned Code page 919 to ISO-8859-10.
ISO/IEC 8859-13
ISO/IEC 8859-13:1998, Information technology — 8-bit single-byte coded graphic character sets — Part 13: Latin alphabet No. 7, is part of the ISO/IEC 8859 series of ASCII-based standard character encodings, first edition published in 1998. It is informally referred to as Latin-7 or Baltic Rim. It was designed to cover the Baltic languages, and added characters used in the Polish language missing from the earlier encodings ISO 8859-4 and ISO 8859-10. Unlike these two, it does not cover the Nordic languages.
ISO-8859-13 is the IANA preferred charset name for this standard when supplemented with the C0 and C1 control codes from ISO/IEC 6429.
Microsoft has assigned code page 28603 a.k.a. Windows-28603 to ISO-8859-13. IBM has assigned Code page 921 to ISO-8859-13. ISO-IR 206 replaces the currency sign at position A4 with the Euro Sign (€).
ISO/IEC 8859-14
ISO/IEC 8859-14:1998, Information technology — 8-bit single-byte coded graphic character sets — Part 14: Latin alphabet No. 8 (Celtic), is part of the ISO/IEC 8859 series of ASCII-based standard character encodings, first edition published in 1998. It is informally referred to as Latin-8 or Celtic. It was designed to cover the Celtic languages, such as Irish, Manx, Scottish Gaelic, Welsh, Cornish, and Breton.
ISO-8859-14 is the IANA preferred charset name for this standard when supplemented with the C0 and C1 control codes from ISO/IEC 6429. CeltScript made an extension for Windows called Extended Latin-8. Microsoft has assigned code page 28604 a.k.a. Windows-28604 to ISO-8859-14.
ISO/IEC 8859-15
ISO/IEC 8859-15:1999, Information technology — 8-bit single-byte coded graphic character sets — Part 15: Latin alphabet No. 9, is part of the ISO/IEC 8859 series of ASCII-based standard character encodings, first edition published in 1999. It is informally referred to as Latin-9 (and for a while Latin-0). It is similar to ISO 8859-1, and thus also intended for “Western European” languages, but replaces some less common symbols with the euro sign and some letters that were deemed necessary:
ISO-8859-15 is the IANA preferred charset name for this standard when supplemented with the C0 and C1 control codes from ISO/IEC 6429.
Microsoft has assigned code page 28605 a.k.a. Windows-28605 to ISO-8859-15. IBM has assigned code page 923 to ISO 8859-15.
All the printable characters from both ISO/IEC 8859-1 and ISO/IEC 8859-15 are also found in Windows-1252. As of October 2016, 0.1% of all websites use ISO-8859-15.
ISO/IEC 8859-16
ISO/IEC 8859-16:2001, Information technology — 8-bit single-byte coded graphic character sets — Part 16: Latin alphabet No. 10, is part of the ISO/IEC 8859 series of ASCII-based standard character encodings, first edition published in 2001. It is informally referred to as Latin-10 or South-Eastern European. It was designed to cover Albanian, Croatian, Hungarian, Polish, Romanian, Serbian and Slovenian, but also French, German, Italian and Irish Gaelic (new orthography).
ISO-8859-16 is the IANA preferred charset name for this standard when supplemented with the C0 and C1 control codes from ISO/IEC 6429.
Microsoft has assigned code page 28606 a.k.a. Windows-28606 to ISO-8859-16.
ISO/IEC 8859-3
ISO/IEC 8859-3:1999, Information technology — 8-bit single-byte coded graphic character sets — Part 3: Latin alphabet No. 3, is part of the ISO/IEC 8859 series of ASCII-based standard character encodings, first edition published in 1988. It is informally referred to as Latin-3 or South European. It was designed to cover Turkish, Maltese and Esperanto, though the introduction of ISO/IEC 8859-9 superseded it for Turkish. The encoding remains popular with users of Esperanto, though use is waning as application support for Unicode becomes more common.
ISO-8859-3 is the IANA preferred charset name for this standard when supplemented with the C0 and C1 control codes from ISO/IEC 6429. Microsoft has assigned code page 28593 a.k.a. Windows-28593 to ISO-8859-3 in Windows. IBM has assigned code page 913 to ISO 8859-3.
ISO/IEC 8859-4
ISO/IEC 8859-4:1998, Information technology — 8-bit single-byte coded graphic character sets — Part 4: Latin alphabet No. 4, is part of the ISO/IEC 8859 series of ASCII-based standard character encodings, first edition published in 1988. It is informally referred to as Latin-4 or North European. It was designed to cover Estonian, Latvian, Lithuanian, Greenlandic, and Sami. It has been largely superseded by ISO/IEC 8859-10 and Unicode. Microsoft has assigned code page 28594 a.k.a. Windows-28594 to ISO-8859-4 in Windows. IBM has assigned code page 914 to ISO 8859-4.
ISO-8859-4 is the IANA preferred charset name for this standard when supplemented with the C0 and C1 control codes from ISO/IEC 6429. ISO-IR 205 replaces the Currency Sign at 0xA4 with the Euro Sign.
ISO/IEC 8859-5
ISO/IEC 8859-5:1999, Information technology — 8-bit single-byte coded graphic character sets — Part 5: Latin/Cyrillic alphabet, is part of the ISO/IEC 8859 series of ASCII-based standard character encodings, first edition published in 1988. It is informally referred to as Latin/Cyrillic. It was designed to cover languages using a Cyrillic alphabet such as Bulgarian, Belarusian, Russian, Serbian and Macedonian but was never widely used. It would also have been usable for Ukrainian in the Soviet Union from 1933–1990, but it is missing the Ukrainian letter ge, ґ, which is required in Ukrainian orthography before and since, and during that period outside Soviet Ukraine. As a result, IBM created Code page 1124.
ISO-8859-5 is the IANA preferred charset name for this standard when supplemented with the C0 and C1 control codes from ISO/IEC 6429.
The 8-bit encodings KOI8-R and KOI8-U, CP866, and also Windows-1251 are far more commonly used. Another possible way to represent Cyrillic is Unicode. In contrast to Windows-1252 and ISO 8859-1, Windows-1251 is not closely related to ISO 8859-5. The Windows code page for ISO-8859-5 is code page 28595 a.k.a. Windows-28595.
ISO/IEC 8859-6
ISO/IEC 8859-6:1999, Information technology — 8-bit single-byte coded graphic character sets — Part 6: Latin/Arabic alphabet, is part of the ISO/IEC 8859 series of ASCII-based standard character encodings, first edition published in 1987. It is informally referred to as Latin/Arabic. It was designed to cover Arabic. Only nominal letters are encoded, no preshaped forms of the letters, so shaping processing is required for display. It does not include the extra letters needed to write most Arabic-script languages other than Arabic itself (such as Persian, Urdu, etc.).
ISO-8859-6 is the IANA preferred charset name for this standard when supplemented with the C0 and C1 control codes from ISO/IEC 6429. The text is in logical order, so BiDi processing is required for display. Nominally ISO-8859-6 (code page 28596) is for "visual order", and ISO-8859-6-I (code page 38596) is for logical order. But in practice, and required for HTML and XML documents, ISO-8859-6 also stands for logical order text. There is also ISO-8859-6-E which supposedly requires directionality to be explicitly specified with special control characters; this latter variant is in practice unused. IBM has assigned code page 1089 to ISO 8859-6. It is an emulation for their AIX operating system.
Unicode is preferred over ISO-8859-6 in modern applications, especially on the Internet, where UTF-8 is the dominant encoding for web pages. Unlike ISO-8859-6 or Windows-1256, which do not cover the extra letters, Unicode provides complete coverage (see also Arabic script in Unicode). 0.1% of all web pages use ISO-8859-6.
ISO/IEC 8859-7
ISO/IEC 8859-7:2003, Information technology — 8-bit single-byte coded graphic character sets — Part 7: Latin/Greek alphabet, is part of the ISO/IEC 8859 series of ASCII-based standard character encodings, first edition published in 1987. It is informally referred to as Latin/Greek. It was designed to cover the modern Greek language. The original 1987 version of the standard had the same character assignments as the Greek national standard ELOT 928, published in 1986. The table in this article shows the updated 2003 version which adds three characters (0xA4: euro sign U+20AC, 0xA5: drachma sign U+20AF, 0xAA: Greek Ypogegrammeni U+037A). Microsoft has assigned code page 28597 a.k.a. Windows-28597 to ISO-8859-7 in Windows. IBM has assigned code page 813 to ISO 8859-7.
ISO-8859-7 is the IANA preferred charset name for this standard (formally the 1987 version, but in practice there is no problem using it for the current version, as the changes are pure additions to previously unassigned codes) when supplemented with the C0 and C1 control codes from ISO/IEC 6429.
Unicode is preferred for Greek in modern applications, especially as UTF-8 encoding on the Internet. Unicode provides many more glyphs for complete coverage; see Greek alphabet in Unicode and Ancient Greek Musical Notation for tables.
ISO/IEC 8859-9
ISO/IEC 8859-9:1999, Information technology — 8-bit single-byte coded graphic character sets — Part 9: Latin alphabet No. 5, is part of the ISO/IEC 8859 series of ASCII-based standard character encodings, first edition published in 1989. It is informally referred to as Latin-5 or Turkish. It was designed to cover the Turkish language, designed as being of more use than the ISO/IEC 8859-3 encoding. It is identical to ISO/IEC 8859-1 except for these six replacements of Icelandic characters with characters unique to the Turkish alphabet:
ISO-8859-9 is the IANA preferred charset name for this standard when supplemented with the C0 and C1 control codes from ISO/IEC 6429. In modern applications Unicode and UTF-8 are preferred. As of February 2016, 0.1% of all web pages used ISO-8859-9. Microsoft has assigned code page 28599 a.k.a. Windows-28599 to ISO-8859-9 in Windows. IBM has assigned Code page 920 to ISO-8859-9.
Separator
Separator can refer to:
A mechanical device to separate fluids and solids, like
Cream separator, separates cream from milk
Demister (vapor), removal of liquid droplets entrained in a vapor stream
Separator (oil production), of an oil production plant
Vapor-liquid separator, separates a vapor-liquid mixture
a machine used to produce mechanically separated meat
The historic Swedish company name AB Separator, common ancestor of Alfa Laval and DeLaval
Air classifier, a mechanical device to separate components of air
Community separator, a term of urban planning
Separator (electricity), a porous or ion-conducting barrier used to separate anode and cathode in electrochemical systems, also known as diaphragm
Planar separator theorem, a theorem in graph theory
Vertex separator, a notion in graph theory
Geometric separator, a line that separates a set of geometric shapes to two subsets
A synonym for "generator" in category theory
A mathematical sign used to separate the integer part from the fractional part of a number. For example, the decimal point and the binary point
A synonym for "delimiter" in computer parlance
Orthodontic spacer, also known as orthodontic separators
Four of the C0 and C1 control codes used in digital character encoding
A song by Radiohead, off the 2011 album The King of Limbs
Start symbol
Start symbol may refer to:
Start symbol (formal languages), the symbol in formal grammar from which rewriting of a string begins
_start symbol specifying an entry point in some formats of computer executables
▶️, a symbol used in media controls to start playing the media
Start of Heading or Start of Text symbols in C0 and C1 control codes
Thai Industrial Standard 620-2533
Thai Industrial Standard 620-2533, commonly referred to as TIS-620, is the most common character set and character encoding for the Thai language. The standard is published by the Thai Industrial Standards Institute (TISI), an organ of the Ministry of Industry under the Royal Thai Government, and is the sole official standard for encoding Thai in Thailand. The descriptive name of the standard is "Standard for Thai Character Codes for Computers" (Thai: รหัสสำหรับอักขระไทยที่ใช้กับคอมพิวเตอร์). "2533" refers to year 2533 of the Buddhist Era (1990), the year the present version of the standard was published; a previous revision, TIS 620-2529 (1986), is now obsolete.
TIS-620 is the IANA preferred charset name for TIS-620, and that charset name is used also for ISO/IEC 8859-11 (which adds a no-break space character at 0xA0, which is unassigned in TIS-620). When the IANA name is used the codes are supplemented with the C0 and C1 control codes from ISO/IEC 6429.