Code page 936 (Microsoft Windows)

Windows code page 936 (abbreviated MS936, Windows-936 or, ambiguously, CP936)[1] is Microsoft's character encoding for Simplified Chinese and one of the four DBCSs (double-byte character sets) for East Asian languages. Originally, Windows-936 covered GB 2312 (in its EUC-CN form), but it was expanded to cover most of GBK with the release of Windows 95.

IBM's Code page 936[2] is a different encoding for Simplified Chinese. International Components for Unicode (ICU), however, does not include an IBM-936 codec and uses the Windows code page for the "cp936" label.[1] IBM's code page for GBK coverage is Code page 1386 (CP1386 or IBM-1386), which is defined as a combination of the single-byte Code page 1114 and the double-byte Code page 1385.[3]

Windows-936 was superseded by code page 54936 (GB 18030), but as of 2014 it was still in widespread use. The Windows command prompt uses CP936 as the default code page on Simplified Chinese installations, even though part of GB 18030 was made mandatory for all software products sold in China. In 2002, the IANA Internet name GBK was registered with Windows-936's mapping,[4][5] making it the de facto GBK definition on the Internet.

The concepts of "Windows-936", "GBK",[a] "GB2312" and "EUC-CN" are sometimes confused in various software products. Code pages MS936 and 1386 are not identical to GBK because a code page encodes characters, whereas GBK only defines code points. In addition, the Euro sign (€), encoded as 0x80 in both Windows-936 and IBM-1386, is not defined in GBK. On the other hand, 95 characters defined in GBK were initially not encoded into Windows-936.

This is partly resolved in later versions of Windows: as of Windows 7, all GBK characters outside the Unicode BMP Private Use Area can be displayed using code page 936, but encoding the 95 characters was still not supported as of 2014. Nevertheless, "CP936" and "GBK" are often used interchangeably because of the popularity of Microsoft products on the Chinese market when GBK was published.
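The single-byte Euro sign at 0x80 sits just outside GBK's double-byte structure, so it can be added without colliding with any GBK sequence. As a rough illustration (not Microsoft's implementation), the Python sketch below splits a Windows-936 byte stream into per-character byte sequences. It assumes the commonly documented GBK layout: lead bytes 0x81 to 0xFE followed by trail bytes 0x40 to 0xFE, excluding 0x7F. The function names are hypothetical, and the sketch checks structure only, not whether a given two-byte sequence is actually assigned a character.

```python
from typing import List


def classify_first_byte(b: int) -> str:
    """Classify a byte that begins a character in a Windows-936 stream."""
    if b <= 0x7F:
        return "ascii"
    if b == 0x80:
        # Single-byte Euro sign: a Windows-936 extension, not defined in GBK.
        return "euro"
    if b <= 0xFE:
        # Assumed GBK lead-byte range 0x81-0xFE.
        return "lead"
    return "invalid"  # 0xFF is not used


def is_trail_byte(b: int) -> bool:
    """Assumed GBK trail-byte range: 0x40-0xFE except 0x7F."""
    return 0x40 <= b <= 0xFE and b != 0x7F


def split_cp936(data: bytes) -> List[bytes]:
    """Split a byte string into one byte sequence per character (no decoding)."""
    units: List[bytes] = []
    i = 0
    while i < len(data):
        kind = classify_first_byte(data[i])
        if kind in ("ascii", "euro"):
            units.append(data[i:i + 1])
            i += 1
        elif kind == "lead" and i + 1 < len(data) and is_trail_byte(data[i + 1]):
            units.append(data[i:i + 2])
            i += 2
        else:
            raise ValueError(f"malformed sequence at offset {i}")
    return units


if __name__ == "__main__":
    # "A", the Euro extension byte, then the two-byte sequence D6 D0.
    print(split_cp936(b"A\x80\xd6\xd0"))
```

A decoder built on this structure still needs the actual mapping table, since a structurally valid two-byte sequence may be unassigned.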

Because GBK superseded GB 2312 long ago, many users also treat those two terms as virtually equivalent, so "Windows-936", "GBK" and "GB 2312" are widely taken to mean the same thing even though they differ significantly. When most modern Windows-based software products offer "GB 2312" as a character encoding option, they do not mean strict EUC-CN / GB 2312 support but partial support for GBK via Windows-936. This can be observed in products such as Microsoft Internet Explorer and Notepad++.
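The practical difference between the labels can be seen in any runtime that keeps them separate. The following Python sketch relies on CPython's bundled "gb2312", "gbk" and "cp936" codecs; it illustrates CPython's behaviour only and is not a statement about how Windows itself maps these names. The helper name `encodable` is hypothetical. The traditional character 龍 is used as an example of a character that GBK covers but GB 2312 does not.

```python
import codecs


def encodable(text: str, encoding: str) -> bool:
    """Return True if every character of text can be encoded with encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


if __name__ == "__main__":
    # In CPython, the "cp936" label is an alias for the "gbk" codec.
    print(codecs.lookup("cp936").name)

    # A character in GB 2312 gets the same bytes under both codecs.
    print("中".encode("gb2312") == "中".encode("gbk"))

    # A character outside GB 2312 is rejected by the strict codec
    # but accepted by the GBK one.
    print(encodable("龍", "gb2312"), encodable("龍", "gbk"))
```

Software that labels its option "GB 2312" while actually using a GBK or Windows-936 table will therefore accept text that a strict GB 2312 implementation rejects.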

Notes

  1. ^ GBK 1.0

References

  1. ^ a b "windows-936-2000 (alias cp936)". ICU Demonstration - Converter Explorer. International Components for Unicode.
  2. ^ "Coded character set identifiers - CCSID 936". IBM Globalization. IBM. Archived from the original on 2014-12-01.
  3. ^ "Coded character set identifiers - CCSID 1386". IBM. Archived from the original on 2014-11-29.
  4. ^ "Character Sets". Retrieved 3 October 2016.
  5. ^ Application of IANA Charset Registration for GBK

Code page 936

Code page 936 may refer to one of two character encodings for Simplified Chinese:

Code page 936 (IBM), a combination of code pages 903 and 928

Code page 936 (Microsoft Windows), largely equivalent to code page 1386


This page is based on a Wikipedia article written by authors (here).
Text is available under the CC BY-SA 3.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.