ISO 11940

ISO 11940 is an ISO standard for the transliteration of Thai characters, published in 1998, updated in September 2003, and confirmed in 2008. An extension to this standard, ISO 11940-2, defines a simplified transcription based on it.

Consonants

Thai
ISO k k̄h ḳ̄h kh k̛h ḳh ng c c̄h ch s c̣h
 
Thai  
ISO ṭ̄h ṯh t̛h d t t̄h th ṭh n
 
Thai  
ISO b p p̄h ph f p̣h m
 
Thai
ISO y r v l ł w ṣ̄ s̛̄ x

The transliteration of the pure consonants is derived from their usual pronunciation as an initial consonant. An unmarked h is used to form digraphs denoting aspirated consonants. High and low pairs of consonants are systematically differentiated by applying a macron to the high class consonant. Further differentiation of consonants with identical phonetic function is obtained by leaving the most frequent unmarked, marking the second commonest by a dot below, marking the third commonest by a horn, and marking the fourth commonest by underlining. The use of a dot below has a similar effect to the Indological practice of distinguishing retroflex consonants by a dot below, but there are subtle differences – it is the transliterations of ธ tho thong and ศ so sala that are dotted below, not those of the corresponding retroflex consonants. The transliterations of consonants should be entered in the order base letter, macron if any, and then dot below, horn or "macron below".

Only three consonants have the horn in their transliteration, ฅ kho khon, ฒ tho phuthao and ษ so ruesi, and only one consonant has an underline, ฑ tho nang montho.
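
The entry order described above can be sketched in Python. The helper name compose_consonant and its parameters are illustrative assumptions, not part of the standard; only the ordering (base letter, macron if any, then dot below, horn or macron below, followed by the unmarked h of an aspirate digraph) is taken from the text.

```python
# Combining marks named in the text, given as Unicode code points.
MACRON = "\u0304"  # COMBINING MACRON, marks the high-class member of a pair
SECONDARY = {
    "dot_below": "\u0323",     # second commonest consonant of a group
    "horn": "\u031b",          # third commonest
    "macron_below": "\u0331",  # fourth commonest ("underlining")
}


def compose_consonant(base, high_class=False, secondary=None, aspirated=False):
    """Hypothetical helper: build a consonant transliteration in entry order."""
    out = base
    if high_class:
        out += MACRON                 # macron comes first
    if secondary is not None:
        out += SECONDARY[secondary]   # then dot below, horn or macron below
    if aspirated:
        out += "h"                    # unmarked h closes the digraph
    return out


# k, macron, dot below, h: the third entry of the consonant table
print(ascii(compose_consonant("k", high_class=True, secondary="dot_below", aspirated=True)))
```

Keeping the marks in a fixed entry order gives each consonant one expected code point sequence before any Unicode normalisation is applied.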

Vowels

Thai –ั  ำ –ิ –ี –ึ –ื –ุ –ู ฤๅ ฦๅ
ISO a ā å i ī ụ̄ u ū e æ o ı v ł łɨ y w x

The letter å is the only precomposed character specified in the output of transliteration.

Lakkhangyao (ๅ) has been shown only in combination with the vowel letters ฤ and ฦ. The standard simply lists ฤ and ฦ with the consonants and lakkhangyao with the vowels. An isolated lakkhangyao would also be transliterated by a small letter "i" with stroke (ɨ), but this should not occur in Thai, Pāli, or Sanskrit.

The transliterations of ว wo waen and อ o ang have been included here because of their use as complete vowel symbols, but their transliteration does not depend on how they are being used and the standard simply lists them with the consonants.

Compound vowel symbols are transliterated in accordance with their constituents.

Other combining marks

Thai –่ –้ –๊ –๋ –็ –์ –๎ –ํ –ฺ
ISO –̀ –̂ –́ –̌ –̆ –̒ ~ –̊ –̥

Note that yamakkan (–๎) is represented by a spacing tilde, not a superscript tilde.

Punctuation and Digits

Thai
ISO « ǂ § ǀ ǁ » 0 1 2 3 4 5 6 7 8 9

ISO 11940:1998 distinguishes the abbreviation symbol paiyannoi (ฯ) from the sentence terminator angkhandiao (ฯ), even though neither the national character standard TIS 620-2533 nor Unicode Version 5.0 distinguishes them. Paiyannoi is transliterated as ǂ and angkhandiao is transliterated as ǀ. Note that paiyannoi, angkhandiao and angkhankhu (๚) are transliterated by the letters used for click consonants, not by double dagger, vertical bars or dandas.
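
Because the click letters are easy to confuse with their look-alikes on screen, a short Python check of the code points can help. This is only a sketch using the standard unicodedata module; the dictionary and function names are assumptions made for illustration.

```python
import unicodedata

# Letters the text gives for the three punctuation marks (click-consonant letters).
SPECIFIED = {
    "paiyannoi": "\u01c2",
    "angkhandiao": "\u01c0",
    "angkhankhu": "\u01c1",
}

# Visually similar characters that the text says are not used.
LOOKALIKES = ["\u2021", "\u007c", "\u0964", "\u0965"]


def describe(ch):
    """Return the code point and Unicode character name of a single character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


def is_letter(ch):
    """True when the Unicode general category of ch is a letter category."""
    return unicodedata.category(ch).startswith("L")


for name, ch in SPECIFIED.items():
    print(name, describe(ch), is_letter(ch))
for ch in LOOKALIKES:
    print(describe(ch), is_letter(ch))
```

The click letters have a letter general category, whereas the double dagger, the vertical line and the dandas do not, so a back-transliteration tool can tell them apart by code point even when a font draws them alike.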

Character Sequencing

In general characters are transliterated from left to right and, where characters have the same horizontal position, from top to bottom. The vertical sequencing is in fact simply specified as tone marks and thanthakhat (–์) preceding any other marks above or below the consonant. The standard denies at the end of Section 4.2 that the combination of sara u (◌ุ, ◌ู) and nikkhahit (◌ํ) can occur and then gives an example of it when specifying the transliteration of nikkhahit, but does not show the transliteration of the combination. The effect of these rules is that, except for nikkhahit, all the non-vowel marks attached to a consonant in Thai are attached to the consonant in the Roman transliteration.

The standard concedes that attempting to transpose preposed vowels and consonants may be comforting to those used to the Roman alphabet, but recommends that preposed vowels not be transposed.

For example, ภาษาไทย (RTGS: Phasa Thai) should be transliterated to p̣hās̛̄āịthy and เชียงใหม่ (RTGS: Chiang Mai) to echīyngıh̄m̀.
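
The two examples can be reproduced with a character-by-character lookup in stored order, with no transposition of preposed vowels. The following Python sketch is not a full implementation: the table covers only the characters in these two words, the function name is hypothetical, and the code points on the right-hand side are an assumed reading of the examples in the entry order described under Consonants.

```python
# Partial, illustrative table: only the characters in the two example words.
TABLE = {
    "\u0e20": "p\u0323h",       # pho samphao: p, dot below, h
    "\u0e32": "a\u0304",        # sara aa: a, macron
    "\u0e29": "s\u0304\u031b",  # so ruesi: s, macron, horn
    "\u0e44": "i\u0323",        # sara ai maimalai: i, dot below
    "\u0e17": "th",             # tho thahan
    "\u0e22": "y",              # yo yak
    "\u0e40": "e",              # sara e (preposed vowel)
    "\u0e0a": "ch",             # cho chang
    "\u0e35": "i\u0304",        # sara ii: i, macron
    "\u0e07": "ng",             # ngo ngu
    "\u0e43": "\u0131",         # sara ai maimuan: dotless i
    "\u0e2b": "h\u0304",        # ho hip: h, macron
    "\u0e21": "m",              # mo ma
    "\u0e48": "\u0300",         # mai ek: combining grave accent
}


def transliterate(text):
    """Map each stored character left to right; unknown characters pass through."""
    return "".join(TABLE.get(ch, ch) for ch in text)


print(transliterate("ภาษาไทย"))
print(transliterate("เชียงใหม่"))
```

Because the preposed vowels stay where they are stored, the mapping is a plain left-to-right substitution. The sketch does not implement the vertical sequencing rule for several marks on one consonant.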

Variations

Causes

The standard specifies the order in which the accents should be typed, but not all input systems will record accents in the order in which they are typed. Unicode specifies two canonical normalisation forms (NFC and NFD) for letters with multiple accents, and transliterated text is highly likely to be stored in one of these forms. This complicates automatic back-transliteration. As Unicode-compliant processes must handle such variations correctly, the transliterations on this page have been chosen for ease of display; present-day rendering systems may display equivalent forms differently.
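
The reordering can be observed with Python's standard unicodedata module. In this sketch the sample strings use the entry order described under Consonants (base letter, macron, then horn or dot below); the function name canonically_equal is an assumption made for illustration.

```python
import unicodedata


def canonically_equal(a, b):
    """Compare two strings after bringing both to the same normalisation form."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)


typed = "s\u0304\u031b"  # s, macron, horn: entry order from the text
nfd = unicodedata.normalize("NFD", typed)

# Canonical ordering sorts marks by combining class, so the horn (class 216)
# moves in front of the macron (class 230).
print([f"U+{ord(c):04X}" for c in typed])
print([f"U+{ord(c):04X}" for c in nfd])

# NFC additionally composes k + dot below into one precomposed letter,
# leaving the macron as a separate combining mark.
nfc = unicodedata.normalize("NFC", "k\u0304\u0323")
print([f"U+{ord(c):04X}" for c in nfc])

print(canonically_equal(typed, nfd))
```

A back-transliteration tool that normalises its input first, or compares strings through a function like the one above, does not depend on the order in which the accents happened to be stored.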

Many fonts display novel combinations of consonants and accents badly. For example, the Institute of the Estonian Language publishes on the web an explanation of the application of the standard to Thai, and with one exception this seems to comply with the standard. The exception is that, except for the macron, accents over consonants are actually offset to the right, giving the impression that they have been entered as the corresponding non-combining characters. The standard specifies the transliterations in code points, but someone working from this free explanation could easily deduce that the spacing forms of the tone accents should be used.

ICU (CLDR 1.4.1)

The ICU implementation, recorded in Version 1.4.1 of the Common Locale Data Repository sponsored by Unicode,[1] uses a prime instead of a horn in the transliteration of consonants. This affects the transliteration of ฅ kho khon, ฒ tho phuthao and ษ so ruesi. ฏ to patak is also transliterated differently, as rather than .

This implementation transliterates ำ as  instead of å to avoid ambiguity with the hypothetical Thai script sequence ะํ (sara a, nikkhahit). The ICU implementation transliterates ฺ phinthu as ˌ instead of –̥ to avoid problems with Unicode normalisation. This has the side effect of improving legibility when applied to an underdotted consonant.

The ICU implementation transliterates ฯ paiyannoi as ‡ (double dagger) and angkhankhu as || (two ASCII vertical bars). As the ICU implementation uses Unicode, it cannot reliably distinguish angkhandiao from paiyannoi without a semantic analysis, and makes no such attempt.

The character sequencing of the ICU implementation is different. It transposes preposed vowels with the following consonant, and processes the marks on a consonant in the order in which they are stored in memory. (Most Thai input methods ensure that the marks are stored in bottom to top order.) It does not transpose preposed vowels with complete consonant clusters; consonant clusters cannot be identified with complete accuracy, and transposing vowels with clusters would require an additional symbol to permit reliable conversion back to the Thai script.

For example, under this implementation ภาษาไทย transliterates to p̣hās̄ʹāthịy and เชียงใหม่ to cheīyngh̄ım̀.
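
The transposition step can be illustrated separately from the mapping to Latin letters. The Python sketch below is not the ICU rule set; it is a minimal illustration, with hypothetical function names, of swapping each preposed vowel with the single consonant that follows it, which yields the character order reflected in the two transliterations above.

```python
# The five preposed vowels of the Thai script (U+0E40 to U+0E44).
PREPOSED_VOWELS = set("\u0e40\u0e41\u0e42\u0e43\u0e44")


def is_consonant(ch):
    """True for the Thai consonant letters, U+0E01 to U+0E2E."""
    return "\u0e01" <= ch <= "\u0e2e"


def transpose_preposed(text):
    """Swap each preposed vowel with the one consonant that follows it."""
    chars = list(text)
    i = 0
    while i < len(chars) - 1:
        if chars[i] in PREPOSED_VOWELS and is_consonant(chars[i + 1]):
            chars[i], chars[i + 1] = chars[i + 1], chars[i]
            i += 2  # skip the pair that was just swapped
        else:
            i += 1
    return "".join(chars)


print(transpose_preposed("ไทย"))
print(transpose_preposed("เชียงใหม่"))
```

Only one consonant is moved, never a whole cluster, which is why the second word comes out with the vowel after ห rather than after หม.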

Finally, this implementation generates transliterations in Unicode Normalisation Form C (NFC).

References

  1. http://unicode.org/Public/cldr/1.4.1/core.zip, files transforms/ThaiLogical-Latin.xml and transforms/Thai-ThaiLogical.xml (used by ICU's transliterators "Thai-Latin" and "Latin-Thai")

ISO 11940-2

ISO 11940-2 is an ISO standard for a simplified transcription of the Thai language into Latin characters.

The full standard ISO 11940-2:2007 includes pronunciation rules and conversion tables of Thai consonants and vowels. It is a sequel to ISO 11940, describing a way to transform its transliteration into a broad transcription.

List of ISO romanizations

List of ISO standards for transliterations and transcriptions (or romanizations):

ISO 9 — Cyrillic

ISO 233 — Arabic

ISO 259 — Hebrew

ISO 843 — Greek

ISO 3602 — Japanese (1989, last reviewed 2013)

ISO 7098 — Chinese

ISO 9984 — Georgian

ISO 9985 — Armenian

ISO 11940 — Thai

ISO 11940-2 — Thai (simplified)

ISO 11941 — Korean (different systems for North and South Korea – withdrawn in 2013)

ISO 15919 — Indic scripts

List of International Organization for Standardization standards, 11000-11999

This is a list of published International Organization for Standardization (ISO) standards and other deliverables. For a complete and up-to-date list of all the ISO standards, see the ISO catalogue. The standards are protected by copyright and most of them must be purchased. However, about 300 of the standards produced by ISO and IEC's Joint Technical Committee 1 (JTC1) have been made freely and publicly available.

List of Latin-script letters

This is a list of letters of the Latin script. The definition of a Latin-script letter for this list is a character encoded in the Unicode Standard that has a script property of 'Latin' and the general category of 'Letter'. An overview of the distribution of Latin-script letters in Unicode is given in Latin script in Unicode.

Romanization

Romanization or romanisation, in linguistics, is the conversion of writing from a different writing system to the Roman (Latin) script, or a system for doing so. Methods of romanization include transliteration, for representing written text, and transcription, for representing the spoken word, and combinations of both. Transcription methods can be subdivided into phonemic transcription, which records the phonemes or units of semantic meaning in speech, and more strict phonetic transcription, which records speech sounds with precision.

Romanization of Thai

There are many systems for the romanization of the Thai language, i.e. representing the language in Latin script. These include systems of transliteration, and transcription.

The system most often seen in public spaces is the Royal Thai General System of Transcription (RTGS), the official scheme promulgated by the Royal Institute of Thailand. It is based on spoken Thai, but disregards tone, vowel length and a few minor sound distinctions.

The international standard ISO 11940 is a transliteration system, preserving all aspects of written Thai by adding diacritics to the Roman letters.

Its extension ISO 11940-2 defines a simplified transcription reflecting the spoken language. It is almost identical to RTGS.

Libraries in English-speaking countries use the ALA-LC Romanization.

In practice, non-standard and inconsistent romanizations are often used, especially for proper nouns and personal names. This is reflected, for example, in the name Suvarnabhumi Airport, which is spelled based on direct transliteration of the name's Sanskrit root.

Language learning books often use their own proprietary systems, none of which are used in Thai public space.

Royal Thai General System of Transcription

The Royal Thai General System of Transcription (RTGS) is the official system for rendering Thai words in the Latin alphabet. It was published by the Royal Institute of Thailand. It is used in road signs and government publications and is the closest method to a standard of transcription for Thai, but its use, even by the government, is inconsistent. The system is almost identical to the one that is defined by ISO 11940-2.

Stupa

A stupa (Sanskrit: "heap") is a mound-like or hemispherical structure containing relics (such as śarīra – typically the remains of Buddhist monks or nuns) that is used as a place of meditation. A related architectural term is a chaitya, which is a prayer hall or temple containing a stupa.

In Buddhism, circumambulation or pradakhshina has been an important ritual and devotional practice since the earliest times, and stupas always have a pradakhshina path around them.

Thai language

Thai, Central Thai or Ayutthaya or Siamese (Thai: ภาษาไทย), is the sole official and national language of Thailand and the first language of the Central Thai people and vast majority of Thai of Chinese origin. It is a member of the Tai group of the Kra–Dai language family. Over half of Thai vocabulary is derived from or borrowed from Pali, Sanskrit, Mon and Old Khmer. It is a tonal and analytic language, similar to Chinese and Vietnamese.

Thai has a complex orthography and system of relational markers. Spoken Thai is mutually intelligible with Lao and Isan, fellow Southwestern Tai languages, to a degree that allows speakers to communicate effectively while each speaks their own language. These languages are written with slightly different scripts but are linguistically similar and effectively form a dialect continuum.

Thai script

The Thai script (Thai: อักษรไทย; RTGS: akson thai; [ʔàksɔ̌ːn tʰāj]) is the abugida used to write Thai, Southern Thai and many other languages spoken in Thailand. The Thai alphabet itself (as used to write Thai) has 44 consonant symbols (Thai: พยัญชนะ, phayanchana), 15 vowel symbols (Thai: สระ, sara) that combine into at least 28 vowel forms, and four tone diacritics (Thai: วรรณยุกต์ or วรรณยุต, wannayuk or wannayut) to create characters mostly representing syllables.

Although commonly referred to as the "Thai alphabet", the script is in fact not a true alphabet but an abugida, a writing system in which the full characters represent consonants with diacritical marks for vowels; the absence of a vowel diacritic gives an implied 'a' or 'o'. Consonants are written horizontally from left to right, with vowels arranged above, below, to the left, or to the right of the corresponding consonant, or in a combination of positions.

Thai has its own set of Thai numerals that are based on the Hindu-Arabic numeral system (Thai: เลขไทย, lek thai), but the standard western Hindu-Arabic numerals (Thai: เลขฮินดูอารบิก, lek hindu arabik) are mainly used except for government documents and the license plates of military vehicles.

Wade–Giles

Wade–Giles, sometimes abbreviated Wade, is a romanization system for Mandarin Chinese. It developed from a system produced by Thomas Wade during the mid-19th century, and was given completed form with Herbert A. Giles's Chinese–English Dictionary of 1892.

Wade–Giles was the system of transcription in the English-speaking world for most of the 20th century. Wade–Giles is based on the Beijing dialect, whereas romanization systems based on the Nanking dialect were in common use until the late 19th century. Both were used in postal romanizations (still used in some place-names). In mainland China it has been mostly replaced by the Hanyu Pinyin romanization system, with exceptions for some proper nouns. Taiwan has kept the Wade–Giles romanization of some geographical names (for example Kaohsiung) and many personal names (for example Chiang Ching-kuo).

Ł

Ł or ł, described in English as L with stroke, is a letter of the West Slavic (Polish, Kashubian, and Sorbian), Łacinka (Latin Belarusian), Łatynka (Latin Ukrainian), Wymysorys, Navajo, Dene Suline, Inupiaq, Zuni, Hupa, and Dogrib alphabets, several proposed alphabets for the Venetian language, and the ISO 11940 romanization of the Thai alphabet. In Slavic languages, it represents the continuation of Proto-Slavic non-palatal l (dark L), except in Polish, Kashubian, and Sorbian, where it evolved further into /w/. In most non-European languages, it represents a voiceless alveolar lateral fricative or similar sound.

This page is based on a Wikipedia article.
Text is available under the CC BY-SA 3.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.