ASMO 449

ASMO 449 is a 7-bit coded character set to encode the Arabic language.

History

This character set was devised in 1982[1] by the now-defunct[1] Arab Standardization and Metrology Organization as the 7-bit standard for Arabic-speaking countries. Its design is derived[2] from the 1973 version of the 7-bit ISO 646, with modifications suited to the Arabic language. In code points 0x41 to 0x72 (hexadecimal), Latin letters were replaced with Arabic letters. Punctuation marks that are identical in the Latin and Arabic scripts remained the same, but where they differ (comma, semicolon, question mark), the Latin ones were replaced by Arabic ones. Only nominal letters are encoded, with no preshaped forms of the letters, so shaping processing is required for display. This character set is not bidirectional and was intended for right-to-left writing. Therefore, the symmetrical punctuation marks ("(", ")", "<", ">", "[", "]", "{" and "}") appear reversed (")", "(", ">", "<", "]", "[", "}" and "{").

ASMO 449 was registered in the International Register of Coded Character Sets as IR 089[2] in 1985 and approved as an ISO standard as ISO 9036[3] in 1987.

Character set

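ASMO 449 (1982)

Each cell gives the Unicode code point followed by the character. Rows are labelled with the high hexadecimal digit and, in parentheses, the decimal value of the row's first code point. "--" marks an undefined position.

        | _0 | _1 | _2 | _3 | _4 | _5 | _6 | _7 | _8 | _9 | _A | _B | _C | _D | _E | _F
0_ (0)   | U+0000 NUL | U+0001 SOH | U+0002 STX | U+0003 ETX | U+0004 EOT | U+0005 ENQ | U+0006 ACK | U+0007 BEL | U+0008 BS | U+0009 HT | U+000A LF | U+000B VT | U+000C FF | U+000D CR | U+000E SO | U+000F SI
1_ (16)  | U+0010 DLE | U+0011 DC1 | U+0012 DC2 | U+0013 DC3 | U+0014 DC4 | U+0015 NAK | U+0016 SYN | U+0017 ETB | U+0018 CAN | U+0019 EM | U+001A SUB | U+001B ESC | U+001C FS | U+001D GS | U+001E RS | U+001F US
2_ (32)  | U+0020 SP | U+0021 ! | U+0022 " | U+0023 # | U+00A4 ¤ | U+0025 % | U+0026 & | U+0027 ' | U+0029 ) | U+0028 ( | U+002A * | U+002B + | U+060C ، | U+002D - | U+002E . | U+002F /
3_ (48)  | U+0030 0 | U+0031 1 | U+0032 2 | U+0033 3 | U+0034 4 | U+0035 5 | U+0036 6 | U+0037 7 | U+0038 8 | U+0039 9 | U+003A : | U+061B ؛ | U+003E > | U+003D = | U+003C < | U+061F ؟
4_ (64)  | U+0040 @ | U+0621 ء | U+0622 آ | U+0623 أ | U+0624 ؤ | U+0625 إ | U+0626 ئ | U+0627 ا | U+0628 ب | U+0629 ة | U+062A ت | U+062B ث | U+062C ج | U+062D ح | U+062E خ | U+062F د
5_ (80)  | U+0630 ذ | U+0631 ر | U+0632 ز | U+0633 س | U+0634 ش | U+0635 ص | U+0636 ض | U+0637 ط | U+0638 ظ | U+0639 ع | U+063A غ | U+005D ] | U+005C \ | U+005B [ | U+005E ^ | U+005F _
6_ (96)  | U+0640 ـ | U+0641 ف | U+0642 ق | U+0643 ك | U+0644 ل | U+0645 م | U+0646 ن | U+0647 ه | U+0648 و | U+0649 ى | U+064A ي | U+064B ً | U+064C ٌ | U+064D ٍ | U+064E َ | U+064F ُ
7_ (112) | U+0650 ِ | U+0651 ّ | U+0652 ْ | -- | -- | -- | -- | -- | -- | -- | -- | U+007D } | U+007C | | U+007B { | U+007E ~ | U+007F DEL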
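
The table above can be expressed as a small decoder. The following is a minimal sketch rather than a reference implementation: the names `build_asmo449_table` and `decode_asmo449` are hypothetical, and the mapping is derived only from the code points listed in the table (ISO 646 layout, Arabic letters at 0x41 to 0x5A and 0x60 to 0x72, Arabic punctuation, mirrored brackets, and 0x73 to 0x7A left undefined).

```python
def build_asmo449_table():
    """Map each defined ASMO 449 code (0x00-0x7F) to a Unicode character."""
    # Start from the ISO 646 / ASCII layout that ASMO 449 is derived from.
    table = {code: chr(code) for code in range(0x80)}

    # 0x41-0x5A hold the Arabic letters U+0621..U+063A.
    for code in range(0x41, 0x5B):
        table[code] = chr(0x0621 + (code - 0x41))
    # 0x60-0x72 hold tatweel, the remaining letters and the marks U+0640..U+0652.
    for code in range(0x60, 0x73):
        table[code] = chr(0x0640 + (code - 0x60))
    # 0x73-0x7A are undefined in the base character set.
    for code in range(0x73, 0x7B):
        del table[code]

    # Currency sign and the Arabic comma, semicolon and question mark.
    table[0x24] = "\u00A4"
    table[0x2C] = "\u060C"
    table[0x3B] = "\u061B"
    table[0x3F] = "\u061F"

    # Symmetrical punctuation appears reversed for right-to-left writing.
    for left, right in ((0x28, 0x29), (0x3C, 0x3E), (0x5B, 0x5D), (0x7B, 0x7D)):
        table[left], table[right] = chr(right), chr(left)
    return table


ASMO449_TO_UNICODE = build_asmo449_table()


def decode_asmo449(data: bytes) -> str:
    """Decode ASMO 449 bytes to a Unicode string, rejecting undefined codes."""
    chars = []
    for offset, code in enumerate(data):
        if code not in ASMO449_TO_UNICODE:
            raise ValueError(f"undefined ASMO 449 code 0x{code:02X} at offset {offset}")
        chars.append(ASMO449_TO_UNICODE[code])
    return "".join(chars)
```

For example, `decode_asmo449(b"\x47\x48")` yields the two letters U+0627 and U+0628. The sketch performs no shaping or reordering; it only translates code points.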

There is a variant, sometimes named ASMO 449+,[4] which adds the characters NBS at 0x75, "ﹳ" at 0x76, "لآ" at 0x77, "لأ" at 0x78, "لإ" at 0x79 and "لا" at 0x7A.
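
The additions of this variant can be sketched as an overlay on a base mapping. The names `ASMO_449_PLUS_EXTRAS` and `extend_table` are hypothetical. Two assumptions are made here: NBS is mapped to U+00A0 (no-break space), and the four lam-alef ligatures are represented as two-character sequences, as they are written in the sentence above, rather than as presentation-form code points.

```python
# Hypothetical overlay for the ASMO 449+ additions at 0x75-0x7A.
ASMO_449_PLUS_EXTRAS = {
    0x75: "\u00A0",        # NBS, assumed to be the no-break space
    0x76: "\uFE73",        # Arabic tail fragment
    0x77: "\u0644\u0622",  # lam + alef with madda above
    0x78: "\u0644\u0623",  # lam + alef with hamza above
    0x79: "\u0644\u0625",  # lam + alef with hamza below
    0x7A: "\u0644\u0627",  # lam + alef
}


def extend_table(base: dict, extras: dict = ASMO_449_PLUS_EXTRAS) -> dict:
    """Return a copy of `base` with the variant's extra code points added."""
    merged = dict(base)
    merged.update(extras)
    return merged
```

Because the overlay is kept separate, a caller can choose between the base set and the variant without duplicating the shared positions.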

Relationship with other character sets

ASMO 449 is a 7-bit character set. Although some encodings allocate this 7-bit character set in the upper part of the 8-bit character set, it should not be confused with ASMO 708. In the character sets that allocate ASMO 449 (or some variant of it) in the upper part of the 8-bit character set, the existence of apparently repeated characters is due to the fact that the characters in the lower part are for left-to-right script while the characters in the upper part are for right-to-left script. When ASMO 449 (or some variant of it) is allocated to the upper part of the 8-bit character set, it has Arabic digits.

  • Al-Arabi[4] adds the characters NBS in 0xF5, "-" in 0xF6, "÷" in 0xF7, "×" in 0xF8, "«" in 0xF9 and "»" in 0xFA, and replaces "ـ" with "`"; this character set is sometimes referred to as Code Page 768 (not an official IBM code page).
  • DEC's DEC/8/ASMO[4] has the same repertoire and the same sequence of Arabic characters but places them at different code points.
  • HP's Arabic-8[4] is also based on ASMO 449.
  • Apple's MacArabic adds French, German and Spanish characters in their typical code points from MacRoman, and adds letters for Persian and Urdu.
  • Apple's MacFarsi replaces the Arabic digits from MacArabic with Persian ones.
  • The Code Table 7[5] from MARC-8 allocates ASMO 449 in the lower part of the 8-bit character set and assigns the Arabic Extension (ISO 11822 / IR 224) to the upper part.
  • Microsoft's Code page 709,[4] for MS-DOS, adds French and German characters in their typical code points from code page 437.
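
The idea of placing the 7-bit set in the upper half of an 8-bit set can be illustrated with a short sketch. It assumes a plain offset of 0x80, which matches the Al-Arabi additions at 0xF5 to 0xFA lining up with 0x75 to 0x7A, but it is not a description of any particular encoding listed above: several of them move, add or replace characters. The function name `split_8bit_code` is hypothetical.

```python
def split_8bit_code(code: int):
    """Classify an 8-bit code as lower-half or upper-half.

    For upper-half codes, also return the corresponding 7-bit position,
    assuming the 7-bit set was shifted up by 0x80 without other changes.
    """
    if not 0 <= code <= 0xFF:
        raise ValueError("code must fit in 8 bits")
    if code < 0x80:
        return ("lower", code)
    # Clearing the high bit recovers the 7-bit position.
    return ("upper", code & 0x7F)
```

Under that assumption, `split_8bit_code(0xF5)` gives `("upper", 0x75)`, the position at which ASMO 449+ places NBS.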

References

  1. Le codage informatique de l’écriture arabe : d’ASMO 449 à Unicode et ISO/CEI 10646
  2. "7-bit Arabic Code for Information Interchange, Arab standard ASMO-449, ISO 9036" (PDF). Archived from the original (PDF) on 2017-02-21. Retrieved 2017-02-20.
  3. ISO 9036:1987
  4. Printronix ACA Emulation Programmer’s Reference Manual
  5. Code Table 7

Arab Standardization and Metrology Organization

The Arab Organization for Standardization and Metrology (French: Organisation arabe de normalisation et de métrologie, Spanish: Organización Arabe de Unificación de Normas y Metrologia), also known as Arab Organization for Standardization and Measures, was founded in 1965 as a specialized agency under the Arab League by the Council of Arab Economic Unity.

The organization's functions included offering technical advice to Arab states on systems of weights and measures; providing professional training and research on industrial production quality, metrology, test and inspection methods; and seeking standardization of technical terms and product specifications between member nations. Its first general committee was held on March 25, 1968. The organization was merged in the 1990s with other organizations to form the Arab Industrial Development and Mining Organization.

Code page

In computing, a code page is a character encoding and as such it is a specific association of a set of printable characters and control characters with unique numbers.

The term "code page" originated from IBM's EBCDIC-based mainframe systems, but Microsoft, SAP, and Oracle Corporation are among the few vendors which use this term. The majority of vendors identify their own character sets by a name. In the case when there is a plethora of character sets (like in IBM), identifying character sets through a number is a convenient way to distinguish them. Originally, the code page numbers referred to the page numbers in the IBM standard character set manual, a condition which has not held for a long time. Vendors that use a code page system allocate their own code page number to a character encoding, even if it is better known by another name; for example, UTF-8 has been assigned page numbers 1208 at IBM, 65001 at Microsoft, and 4110 at SAP.
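
Since each vendor assigns its own numbers, a code page number is only meaningful together with the vendor that issued it. A minimal sketch of such a lookup, using only the three UTF-8 assignments mentioned above (the names `CODE_PAGE_REGISTRY` and `encoding_name` are hypothetical):

```python
# Keys are (vendor, code page number); values are the common encoding name.
CODE_PAGE_REGISTRY = {
    ("IBM", 1208): "UTF-8",
    ("Microsoft", 65001): "UTF-8",
    ("SAP", 4110): "UTF-8",
}


def encoding_name(vendor: str, number: int):
    """Return the encoding name for a vendor-specific code page, or None."""
    return CODE_PAGE_REGISTRY.get((vendor, number))
```

Keying on the pair rather than the number alone avoids treating one vendor's number as if it were valid in another vendor's numbering.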

Hewlett-Packard uses a similar concept in its HP-UX operating system and its Printer Command Language (PCL) protocol for printers (whether HP printers or not). The terminology, however, is different: what others call a character set, HP calls a symbol set, and what IBM or Microsoft call a code page, HP calls a symbol set code. HP developed a series of symbol sets, each with an associated symbol set code, to encode both its own character sets and other vendors’ character sets.

The multitude of character sets leads many vendors to recommend Unicode.

ISO/IEC 646

ISO/IEC 646 is the name of a set of ISO standards, described as Information technology — ISO 7-bit coded character set for information interchange and developed in cooperation with ASCII at least since 1964. Since its first edition in 1967 it has specified a 7-bit character code from which several national standards are derived.

ISO/IEC 646 was also ratified by ECMA as ECMA-6. The first version of ECMA-6 was published in 1965, based on work that ECMA's Technical Committee TC1 had carried out since December 1960.

Characters in the ISO/IEC 646 Basic Character Set are invariant characters. Because this invariant portion of ISO/IEC 646, shared by all countries, specifies only the letters of the ISO basic Latin alphabet, countries using additional letters had to create national variants of ISO 646 in order to use their native scripts. Since transmission and storage of 8-bit codes was not standard at the time, the national characters had to fit within the constraints of 7 bits, meaning that some characters that appear in ASCII do not appear in other national variants of ISO 646.
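
The distinction between invariant and national-use positions can be made concrete with a small check. This is a sketch under an assumption: the twelve positions listed below are the ones commonly given as open to national use in ISO/IEC 646 and are not enumerated in the text above. The names `NATIONAL_USE_POSITIONS` and `is_iso646_invariant` are hypothetical.

```python
# Assumed national-use positions (shown with their ASCII occupants):
# # $ @ [ \ ] ^ ` { | } ~
NATIONAL_USE_POSITIONS = frozenset(
    {0x23, 0x24, 0x40, 0x5B, 0x5C, 0x5D, 0x5E, 0x60, 0x7B, 0x7C, 0x7D, 0x7E}
)


def is_iso646_invariant(text: str) -> bool:
    """Return True if every character is a 7-bit code outside the national-use positions."""
    return all(
        ord(ch) < 0x80 and ord(ch) not in NATIONAL_USE_POSITIONS
        for ch in text
    )
```

Text that passes this check should read the same under any national variant, whereas a string such as `"x[0]"` may display differently because brackets occupy national-use positions.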

ISO/IEC 8859-6

ISO/IEC 8859-6:1999, Information technology — 8-bit single-byte coded graphic character sets — Part 6: Latin/Arabic alphabet, is part of the ISO/IEC 8859 series of ASCII-based standard character encodings, first edition published in 1987. It is informally referred to as Latin/Arabic. It was designed to cover Arabic. Only nominal letters are encoded, no preshaped forms of the letters, so shaping processing is required for display. It does not include the extra letters needed to write most Arabic-script languages other than Arabic itself (such as Persian, Urdu, etc.).

ISO-8859-6 is the IANA preferred charset name for this standard when supplemented with the C0 and C1 control codes from ISO/IEC 6429. The text is in logical order, so BiDi processing is required for display. Nominally ISO-8859-6 (code page 28596) is for "visual order", and ISO-8859-6-I (code page 38596) is for logical order. But in practice, and required for HTML and XML documents, ISO-8859-6 also stands for logical order text. There is also ISO-8859-6-E which supposedly requires directionality to be explicitly specified with special control characters; this latter variant is in practice unused. IBM has assigned code page 1089 to ISO 8859-6. It is an emulation for their AIX operating system.
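
As an illustration, Python's standard library ships a codec for this encoding under the name `iso8859_6`. The wrapper function `decode_latin_arabic` below is hypothetical and only shows the byte-to-character translation; it performs no shaping or BiDi reordering, which are left to the display layer.

```python
def decode_latin_arabic(data: bytes) -> str:
    """Decode ISO 8859-6 bytes with Python's built-in codec.

    The result keeps the characters in the same (logical) order as the bytes;
    undefined byte values raise UnicodeDecodeError.
    """
    return data.decode("iso8859_6")


# 0xC7 and 0xC8 are the letters U+0627 and U+0628; 0xBF is the Arabic question mark.
sample = decode_latin_arabic(bytes([0xC7, 0xC8, 0xBF]))
```

In this sample, the byte 0xC7 decodes to the same letter (U+0627) that ASMO 449 encodes at 0x47.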

Unicode is preferred over ISO-8859-6 in modern applications, especially on the Internet, where UTF-8 is the dominant encoding for web pages (see also Arabic script in Unicode for complete coverage; ISO-8859-6 and Windows-1256 do not cover the extra letters). 0.1% of all web pages use ISO-8859-6.


This page is based on a Wikipedia article.
Text is available under the CC BY-SA 3.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.