ISO/IEC 8859-2

ISO/IEC 8859-2:1999, Information technology — 8-bit single-byte coded graphic character sets — Part 2: Latin alphabet No. 2, is part of the ISO/IEC 8859 series of ASCII-based standard character encodings, first edition published in 1987. It is informally referred to as "Latin-2". It is generally intended for Central[1] or "Eastern European" languages that are written in the Latin script. Note that ISO/IEC 8859-2 is very different from code page 852 (MS-DOS Latin 2, PC Latin 2) which is also referred to as "Latin-2" in Czech and Slovak regions.[2] Code page 912 is an extension.

ISO-8859-2 is the IANA preferred charset name for this standard when supplemented with the C0 and C1 control codes from ISO/IEC 6429. As of December 2018, 0.1% of all web pages used ISO 8859-2.[3] Microsoft has assigned code page 28592, also known as Windows-28592, to ISO-8859-2 in Windows. IBM assigned code page 1111 to ISO 8859-2.
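As a rough illustration of the charset name and the added control codes, the sketch below uses Python's standard codecs module. Python is not mentioned in the article and is only an assumption here, and the helper name describe_charset is hypothetical.

```python
import codecs


def describe_charset(label: str) -> str:
    """Return the canonical name Python's codec registry uses for a label."""
    return codecs.lookup(label).name


# The IANA name and the informal "latin2" alias resolve to the same codec.
same_codec = describe_charset("ISO-8859-2") == describe_charset("latin2")

# Bytes in the C1 range (0x80-0x9F) carry no graphic characters; Python's
# codec passes them through as control codes with the same code point value.
c1_sample = bytes([0x85]).decode("iso-8859-2")

print(same_codec, hex(ord(c1_sample)))
```

Other platforms use their own labels (for example the Windows code page number given above), so the names accepted here should not be assumed to work elsewhere.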

Windows-1250 is similar to ISO-8859-2: it contains all of its printable characters, plus additional ones. However, a few of them are placed at different code values (unlike Windows-1252, which keeps all printable characters from ISO-8859-1 in the same place).
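The relationship between the two code pages can be checked with a short sketch. It assumes Python's bundled iso8859_2 and cp1250 codecs are faithful to the respective code pages; the helper high_half is a hypothetical name introduced for illustration.

```python
def high_half(codec: str) -> dict[int, str]:
    """Map each byte 0x80-0xFF to its character, skipping unassigned bytes."""
    table = {}
    for byte in range(0x80, 0x100):
        try:
            table[byte] = bytes([byte]).decode(codec)
        except UnicodeDecodeError:
            continue  # byte is unassigned in this code page
    return table


# Only 0xA0-0xFF hold printable characters in ISO-8859-2.
latin2 = {b: ch for b, ch in high_half("iso8859_2").items() if b >= 0xA0}
cp1250 = high_half("cp1250")

# Printable ISO-8859-2 characters that Windows-1250 lacks entirely.
missing = set(latin2.values()) - set(cp1250.values())

# ISO-8859-2 characters that Windows-1250 does not keep at the same byte.
moved = {b: ch for b, ch in latin2.items() if cp1250.get(b) != ch}

print(sorted(missing))
for byte, ch in sorted(moved.items()):
    print(f"0x{byte:02X} {ch!r}")
```

An empty missing set together with a non-empty moved mapping matches the description above: the repertoire is preserved, but some positions are not.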

These code values can be used for the following languages:

It can also be used for Romanian, but it is not well suited to that language because it lacks the letters s and t with comma below (ș, ț); it provides only s and t with the similar-looking cedilla (ş, ţ). These letters were unified in the first versions of the Unicode standard, meaning that the appearance with cedilla or with comma was treated as a glyph choice rather than as a distinction between separate characters; fonts intended for use with Romanian should therefore, in theory, have glyphs with a comma below at those code points.

In practice, Microsoft did not provide such fonts for computers sold in Romania. Still, ISO/IEC 8859-2 and Windows-1250 (which has the same problem) have been heavily used for Romanian. Unicode subsequently disunified the comma variants from the cedilla variants and has since taken the lead for web pages, although these often still use s and t with cedilla. Unicode notes as of 2014 that disunifying the letters with comma below was a mistake that caused corruption of Romanian data: pre-existing data and input methods would still contain the older cedilla code points, complicating text searching.
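The Romanian limitation can be seen in a minimal sketch, again assuming Python's iso8859_2 and cp1250 codecs as stand-ins for the two code pages. The function name encodable is hypothetical.

```python
def encodable(text: str, codec: str = "iso8859_2") -> bool:
    """Report whether every character of text exists in the given codec."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


cedilla = "\u015f\u0163"  # s and t with cedilla, present in ISO/IEC 8859-2
comma = "\u0219\u021b"    # s and t with comma below, added to Unicode later

print(encodable(cedilla))          # cedilla forms fit in a single byte each
print(encodable(comma))            # comma-below forms have no code value
print(encodable(comma, "cp1250"))  # Windows-1250 shares the limitation
```

Software that must store Romanian text in either encoding therefore has to fall back to the cedilla forms, which is one way the mixed data described above comes about.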

Code page layout

In the following table characters are shown together with their corresponding Unicode code points.

ISO/IEC 8859-2 (Latin-2)

          _0    _1    _2    _3    _4    _5    _6    _7    _8    _9    _A    _B    _C    _D    _E    _F

0_ (0)    (undefined)

1_ (16)   (undefined)

2_ (32)   SP    !     "     #     $     %     &     '     (     )     *     +     ,     -     .     /
          0020  0021  0022  0023  0024  0025  0026  0027  0028  0029  002A  002B  002C  002D  002E  002F

3_ (48)   0     1     2     3     4     5     6     7     8     9     :     ;     <     =     >     ?
          0030  0031  0032  0033  0034  0035  0036  0037  0038  0039  003A  003B  003C  003D  003E  003F

4_ (64)   @     A     B     C     D     E     F     G     H     I     J     K     L     M     N     O
          0040  0041  0042  0043  0044  0045  0046  0047  0048  0049  004A  004B  004C  004D  004E  004F

5_ (80)   P     Q     R     S     T     U     V     W     X     Y     Z     [     \     ]     ^     _
          0050  0051  0052  0053  0054  0055  0056  0057  0058  0059  005A  005B  005C  005D  005E  005F

6_ (96)   `     a     b     c     d     e     f     g     h     i     j     k     l     m     n     o
          0060  0061  0062  0063  0064  0065  0066  0067  0068  0069  006A  006B  006C  006D  006E  006F

7_ (112)  p     q     r     s     t     u     v     w     x     y     z     {     |     }     ~
          0070  0071  0072  0073  0074  0075  0076  0077  0078  0079  007A  007B  007C  007D  007E

8_ (128)  (undefined)

9_ (144)  (undefined)

A_ (160)  NBSP  Ą     ˘     Ł     ¤     Ľ     Ś     §     ¨     Š     Ş     Ť     Ź     SHY   Ž     Ż
          00A0  0104  02D8  0141  00A4  013D  015A  00A7  00A8  0160  015E  0164  0179  00AD  017D  017B

B_ (176)  °     ą     ˛     ł     ´     ľ     ś     ˇ     ¸     š     ş     ť     ź     ˝     ž     ż
          00B0  0105  02DB  0142  00B4  013E  015B  02C7  00B8  0161  015F  0165  017A  02DD  017E  017C

C_ (192)  Ŕ     Á     Â     Ă     Ä     Ĺ     Ć     Ç     Č     É     Ę     Ë     Ě     Í     Î     Ď
          0154  00C1  00C2  0102  00C4  0139  0106  00C7  010C  00C9  0118  00CB  011A  00CD  00CE  010E

D_ (208)  Đ     Ń     Ň     Ó     Ô     Ő     Ö     ×     Ř     Ů     Ú     Ű     Ü     Ý     Ţ     ß
          0110  0143  0147  00D3  00D4  0150  00D6  00D7  0158  016E  00DA  0170  00DC  00DD  0162  00DF

E_ (224)  ŕ     á     â     ă     ä     ĺ     ć     ç     č     é     ę     ë     ě     í     î     ď
          0155  00E1  00E2  0103  00E4  013A  0107  00E7  010D  00E9  0119  00EB  011B  00ED  00EE  010F

F_ (240)  đ     ń     ň     ó     ô     ő     ö     ÷     ř     ů     ú     ű     ü     ý     ţ     ˙
          0111  0144  0148  00F3  00F4  0151  00F6  00F7  0159  016F  00FA  0171  00FC  00FD  0163  02D9
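The mapping between code values and Unicode code points in the table can be reproduced programmatically. The following is a sketch that assumes Python's iso8859_2 codec matches the layout above; latin2_row is a hypothetical helper name.

```python
def latin2_row(high_nibble: int) -> list[tuple[str, str]]:
    """Return (character, Unicode code point) pairs for one table row."""
    row = []
    for low in range(16):
        byte = (high_nibble << 4) | low
        ch = bytes([byte]).decode("iso8859_2")
        row.append((ch, f"{ord(ch):04X}"))
    return row


# Row A_ starts at code value 0xA0 (decimal 160).
for ch, code_point in latin2_row(0xA):
    print(repr(ch), code_point)
```

Note that Python's codec also assigns control codes to the rows the table leaves empty, in line with the IANA charset rather than the bare standard.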


References

  1. "Microsoft Outlook Message Encodings".
  2. "The Czech and Slovak Character Encoding Mess Explained".
  3. W3Techs: https://w3techs.com/technologies/history_overview/character_encoding

2N

2N or 2-N may refer to:

2N or 2°N, the 2nd parallel north latitude

MI 2N, a type of electric multiple unit running on the French RER rail network

2N, a prefix labelling certain JEDEC transistors, notably the 2N2222

2N, an indicator of a redundancy level in (for example) an uninterruptible power supply configuration

Powers of 2 (2^n)

In genetics, 2n = x refers to a diploid chromosome number of x

NJ 2-N; see New Jersey Route 17

MI 2N series double-decker train; see RER A

HP 2N, ISO/IEC 8859-2 character set on printers by Hewlett-Packard

Caron

A caron, háček or haček (plural háčeks or háčky), also known as a hachek, wedge, check, inverted circumflex, or inverted hat, is a diacritic ( ˇ ) commonly placed over certain letters in the orthography of some Baltic, Slavic, Finnic, Samic, Berber, and other languages to indicate a change in the related letter's pronunciation (c > č; [ts] > [tʃ]).

The use of the haček differs according to the orthographic rules of a language. In most Slavic and European languages it indicates present or historical palatalization, iotation, or postalveolar articulation. In Salishan languages, it often represents a uvular consonant (x vs. x̌; [x] vs. [χ]).

When placed over vowel symbols, the caron can indicate a contour tone, for instance the falling and then rising tone in the Pinyin romanization of Mandarin Chinese.

It is also used to decorate symbols in mathematics, where it is often pronounced "check".

It looks similar to a breve (˘), but has a sharp tip, like an inverted circumflex (ˆ), while a breve is rounded.

The left (downward) stroke is usually thicker than the right (upward) stroke in serif typefaces.

Code page 852

Code page 852 (also known as CP 852, IBM 00852, OEM 852 (Latin II), MS-DOS Latin 2) is a code page used under DOS to write Central European languages that use Latin script (such as Bosnian, Croatian, Czech, Hungarian, Polish, Romanian, Serbian, Slovak or Slovene).

Note that code page 852 (DOS Latin 2) is very different from ISO/IEC 8859-2 (ISO Latin-2), although both are informally referred to as "Latin-2" in different language regions.

Some of the box drawing characters of the original DOS code page 437 were sacrificed in order to fit in more accented letters (all printable characters from ISO 8859-2 are included). These changes caused display glitches in DOS applications that used the box drawing characters to display a GUI-like surface in text mode (e.g. Norton Commander). Several local encodings were invented to avoid the problem, for example the Kamenický encoding for Czech and Slovak.
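To make the difference between the two "Latin-2" encodings concrete, the sketch below encodes one word both ways. It assumes Python's cp852 and iso8859_2 codecs; the sample word is chosen only for illustration.

```python
word = "\u0141\u00f3d\u017a"  # the Polish city name Łódź

latin2_bytes = word.encode("iso8859_2")
cp852_bytes = word.encode("cp852")

# The same text yields different byte sequences in the two encodings.
print(latin2_bytes.hex(" "))
print(cp852_bytes.hex(" "))

# Reading ISO/IEC 8859-2 bytes as code page 852 produces garbled text.
misread = latin2_bytes.decode("cp852")
print(misread)
```

Because both encodings are single-byte and nearly fully populated, a wrong guess rarely raises an error; it silently yields different characters, which is why the shared informal name causes confusion.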

Code page 912

Code page 912 (also known as CP 912, IBM 00912) is a code page used under IBM AIX and DOS to write the Albanian, Bosnian, Croatian, Czech, Hungarian, Polish, Romanian, Slovak, Slovene, and Sorbian languages. It is an extension of ISO/IEC 8859-2.

ISO/IEC 8859

ISO/IEC 8859 is a joint ISO and IEC series of standards for 8-bit character encodings. The series of standards consists of numbered parts, such as ISO/IEC 8859-1, ISO/IEC 8859-2, etc. There are 15 parts, excluding the abandoned ISO/IEC 8859-12. The ISO working group maintaining this series of standards has been disbanded.

ISO/IEC 8859 parts 1, 2, 3, and 4 were originally Ecma International standard ECMA-94.

ISO/IEC 8859-1

ISO/IEC 8859-1:1998, Information technology — 8-bit single-byte coded graphic character sets — Part 1: Latin alphabet No. 1, is part of the ISO/IEC 8859 series of ASCII-based standard character encodings, first edition published in 1987. ISO 8859-1 encodes what it refers to as "Latin alphabet no. 1," consisting of 191 characters from the Latin script. This character-encoding scheme is used throughout the Americas, Western Europe, Oceania, and much of Africa. It is also commonly used in most standard romanizations of East-Asian languages. It is the basis for most popular 8-bit character sets and the first block of characters in Unicode.

ISO-8859-1 is (according to the standards at least) the default encoding of documents delivered via HTTP with a MIME type beginning with "text/" (HTML5 changed this to Windows-1252). As of March 2019, 3.4% of all web sites claim to use ISO 8859-1. However, this includes an unknown number of pages actually using Windows-1252 and/or UTF-8, both of which are commonly recognized by browsers despite the character set tag.
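The practical consequence of treating ISO-8859-1 as Windows-1252 shows up in the byte range 0x80-0x9F. The sketch below assumes Python's latin-1 and cp1252 codecs; the sample bytes are made up for illustration.

```python
raw = b"\x80 \x93quoted\x94"

# As ISO-8859-1 (with C1 controls), 0x80-0x9F decode to invisible controls.
as_latin1 = raw.decode("latin-1")

# As Windows-1252, the same bytes are printable: euro sign and curly quotes.
as_cp1252 = raw.decode("cp1252")

print([hex(ord(c)) for c in as_latin1[:3]])
print(as_cp1252)

# ISO-8859-1 byte values coincide with the first 256 Unicode code points.
identity = bytes(range(256)).decode("latin-1") == "".join(map(chr, range(256)))
print(identity)
```

Outside that range the two encodings agree, which is why mislabelled pages usually still display as intended.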

It is the default encoding of the values of certain descriptive HTTP headers, defines the repertoire of characters allowed in HTML 3.2 documents (HTML 4.0 uses Unicode), and is specified by many other standards. This and similar sets are often assumed to be the encoding of 8-bit text on Unix and Microsoft Windows if there is no byte order mark (BOM); this is only gradually being changed to UTF-8.

ISO-8859-1 is the IANA preferred name for this standard when supplemented with the C0 and C1 control codes from ISO/IEC 6429. The following other aliases are registered: iso-ir-100, csISOLatin1, latin1, l1, IBM819. Code page 28591 a.k.a. Windows-28591 is used for it in Windows. IBM calls it code page 819 or CP819. Oracle calls it WE8ISO8859P1.

List of International Organization for Standardization standards, 8000-8999

This is a list of published International Organization for Standardization (ISO) standards and other deliverables. For a complete and up-to-date list of all the ISO standards, see the ISO catalogue. The standards are protected by copyright and most of them must be purchased. However, about 300 of the standards produced by ISO and IEC's Joint Technical Committee 1 (JTC1) have been made freely and publicly available.

List of Latin-script alphabets

The tables below summarize and compare the letter inventory of some of the Latin-script alphabets. In this article, the scope of the word "alphabet" is broadened to include letters with tone marks, and other diacritics used to represent a wide range of orthographic traditions, without regard to whether or how they are sequenced in their alphabet or the table.

Montenegrin language

Montenegrin (црногорски / crnogorski) is the normative variety of the Serbo-Croatian language mainly used by Montenegrins and the official language of Montenegro. Montenegrin is based on the most widespread dialect of Serbo-Croatian, Shtokavian, more specifically on Eastern Herzegovinian, which is also the basis of Standard Croatian, Serbian, and Bosnian. Montenegro's language has historically and traditionally been called either Montenegrin, "Our language", or Serbian. The idea of a standardized Montenegrin language separate from Serbian appeared in the 1990s during the breakup of Yugoslavia, through proponents of Montenegrin independence from Serbia. Montenegrin became the official language of Montenegro with the ratification of a new constitution on 22 October 2007.

The Montenegrin standard is still emerging. Its orthography was established on 10 July 2009 with the addition of two letters to the alphabet. Their usage remained controversial and they achieved only limited public acceptance, along with some proposed alternative spellings. They had been used for official documents since 2009, but in February 2017, the Assembly of Montenegro removed them from any type of governmental documentation.

Romanization of Armenian

There are various systems of romanization of the Armenian alphabet.

S-comma

S-comma (majuscule: Ș, minuscule: ș) is a letter which is part of the Romanian alphabet, used to represent the sound /ʃ/, the voiceless postalveolar fricative (like sh in shoe).

Slovak orthography

The first Slovak orthography was proposed by Anton Bernolák (1762–1813) in his Dissertatio philologico-critica de litteris Slavorum. It was used in the six-volume Slovak-Czech-Latin-German-Hungarian Dictionary (1825–1827) and primarily by Slovak Catholics.

The standard orthography of the Slovak language is immediately based on the standard developed by Ľudovít Štúr in 1844 and reformed by Martin Hattala in 1851 with the agreement of Štúr. The then-current (1840s) form of the central Slovak dialect was chosen as the standard. It uses the Latin script with small modifications that include the four diacritics (ˇ, ´, ¨, ˆ) placed above certain letters. After Hattala's reform, the Slovak language remained mostly unchanged.

Slovene alphabet

The Slovene alphabet (Slovene: slovenska abeceda, pronounced [slɔˈʋèːnska abɛˈtséːda] or slovenska gajica [- ˈɡáːjitsa]) is an extension of the Latin script and is used in the Slovene language. The standard language uses a Latin alphabet which is a slight modification of the Serbo-Croatian Gaj's Latin alphabet, consisting of 25 lower- and upper-case letters:

Source: Omniglot

The following Latin letters are also found in names of non-Slovene origin: Ć (mehki č), Đ (mehki dž), Q (ku), W (dvojni ve), X (iks), and Y (ipsilon), Ä, Ë, Ö, Ü.

Vertical bar

The vertical bar ( | ) is a computer character and glyph with various uses in mathematics, computing, and typography. It has many names, often related to particular meanings: Sheffer stroke (in logic), verti-bar, vbar, stick, vertical line, vertical slash, bar, pike, or pipe, and several variants on these names. It is occasionally considered an allograph of broken bar (see below).

This page is based on a Wikipedia article.
Text is available under the CC BY-SA 3.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.