The Uyghur or Uighur language (/ˈwiːɡər/ ئۇيغۇر تىلى, Уйғур тили, Uyghur tili, Uyƣur tili or ئۇيغۇرچە, Уйғурчә, Uyghurche, Uyƣurqə), formerly known as Eastern Turki, is a Turkic language with 10 to 25 million speakers, spoken primarily by the Uyghur people in the Xinjiang Uyghur Autonomous Region of Western China. Significant communities of Uyghur-speakers are located in Kazakhstan and Uzbekistan, and various other countries have Uyghur-speaking expatriate communities. Uyghur is an official language of the Xinjiang Uyghur Autonomous Region, and is widely used in both social and official spheres, as well as in print, radio, and television, and is used as a common language by other ethnic minorities in Xinjiang.
Uyghur belongs to the Karluk branch of the Turkic language family, which also includes languages such as Uzbek. Like many other Turkic languages, Uyghur displays vowel harmony and agglutination, lacks noun classes or grammatical gender, and is a left-branching language with subject–object–verb word order. More distinctly Uyghur processes include, especially in northern dialects, vowel reduction and umlauting. In addition to influence of other Turkic languages, Uyghur has historically been influenced strongly by Persian and Arabic, and more recently by Mandarin Chinese and Russian.
The modified Arabic-derived writing system is the most common and the only standard in China, although other writing systems are used for auxiliary and historical purposes. Unlike most Arabic-derived scripts, the Uyghur Arabic alphabet has mandatory marking of all vowels due to modifications to the original Perso-Arabic script made in the 20th century. Two Latin and one Cyrillic alphabet are also used, though to a much lesser extent. The Arabic and Latin alphabets both have 32 characters.
Kagan Arik wrote that Modern Uyghur is not descended from Old Uyghur; rather, it is a descendant of the Karluk language spoken by the Kara-Khanid Khanate. According to Gerard Clauson, Western Yugur is considered to be the true descendant of Old Uyghur and is also called "Neo-Uyghur", whereas Modern Uyghur is descended from the Xākānī language described by Mahmud al-Kashgari in Dīwānu l-Luġat al-Turk. According to Frederik Coene, Modern Uyghur and Western Yugur belong to entirely different branches of the Turkic language family, respectively the southeastern Turkic languages and the northeastern Turkic languages. The Western Yugur language, although geographically close, is more closely related to the Siberian Turkic languages. Robert Dankoff wrote that the Turkic language spoken in Kashgar and used in Kara-Khanid works was Karluk, not (Old) Uyghur.
Robert Barkley Shaw wrote, "In the Turkish of Káshghar and Yarkand (which some European linguists have called Uïghur, a name unknown to the inhabitants of those towns, who know their tongue simply as Túrki), ... This would seem in many case to be a misnomer as applied to the modern language of Kashghar". Sven Hedin wrote, "In these cases it would be particularly inappropriate to normalize to the East Turkish literary language, because by so doing one would obliterate traces of national elements which have no immediate connection with the Kaschgar Turks, but on the contrary are possibly derived from the ancient Uigurs".
Probably around 1077, Mahmud al-Kashgari, a scholar of the Turkic languages from Kashgar in modern-day Xinjiang, published Dīwān ul-Lughat al-Turk (English: Compendium of the Turkic Dialects; Uyghur: تۈركى تىللار دىۋانى Türki Tillar Diwani), a Turkic language dictionary and description of the geographic distribution of many Turkic languages. The book, described by scholars as an "extraordinary work," documents the rich literary tradition of Turkic languages; it contains folk tales (including descriptions of the functions of shamans) and didactic poetry (propounding "moral standards and good behaviour"), besides poems and poetry cycles on topics such as hunting and love, and numerous other language materials. Other Kara-Khanid writers wrote works in the Turki Karluk Khaqani language. Yusuf Khass Hajib wrote the Kutadgu Bilig. Ahmad bin Mahmud Yukenaki (also spelled Ahmed bin Mahmud Yükneki or Edip Ahmet Yükneki) wrote the Hibat al-ḥaqāyiq (هبة الحقايق; also transliterated Hibet-ül hakayik or Atebetü'l-hakayik).
Middle Turkic languages, through the influence of Perso-Arabic after the 13th century, developed into the Chagatai language, a literary language used all across Central Asia until the early 20th century. After Chagatai fell into extinction, the standard versions of Uyghur and Uzbek were developed from dialects in the Chagatai-speaking region, and both show abundant Chagatai influence. Uyghur today shows considerable Persian influence inherited from Chagatai, including numerous Persian loanwords.
Modern Uyghur religious literature includes the Taẕkirah, biographies of Islamic religious figures and saints. The Taẕkirah is a genre of literature written about Sufi Muslim saints in Altishahr. Written sometime in the period between 1700 and 1849, the Eastern Turkic (modern Uyghur) Taẕkirah of the Four Sacrificed Imams provides an account of the Muslim Karakhanid war against the Khotanese Buddhists. It tells the story of four Imams from the city of Mada'in (possibly in modern-day Iraq) who travelled to help the Islamic conquest of Khotan, Yarkand, and Kashgar by Yusuf Qadir Khan, the Qarakhanid leader. The shrines of Sufi saints are revered in Altishahr as one of Islam's essential components, and the tazkirah literature reinforced the sacredness of the shrines. The tazkirahs promise hellfire to anyone who does not believe in the stories of the saints. The Tazkirah of the Four Sacrificed Imams states, "And those who doubt Their Holinesses the Imams will leave this world without faith, and on Judgement Day their faces will be black ...". Shaw translated extracts from the Tazkiratu'l-Bughra on the Muslim Turki war against the "infidel" Khotan. The Turki-language Tadhkirah i Khwajagan was written by M. Sadiq Kashghari. Historical works like the Tārīkh-i amniyya and Tārīkh-i ḥamīdi were written by Musa Sayrami.
Shaw and Christian missionaries such as George W. Hunter, Johannes Avetaranian, Magnus Bäcklund, Nils Fredrik Höijer, Father Hendricks, Josef Mässrur, Anna Mässrur, Albert Andersson, Gustaf Ahlbert, Stina Mårtensson, John Törnquist, Gösta Raquette, Oskar Hermannson, the convert to Christianity Nur Luke, Harold Whitaker, and Turkologist Gunnar Jarring studied the Uyghur language and wrote works on it, calling it "Eastern Turki". Shaw wrote in his book that it was the Europeans of his time who called the language "Uighur", while the native inhabitants of Yarkand and Kashgar did not call it by that name but called it "Turki"; he considered the name "Uighur" a misnomer when referring to Kashgar's language. A Turkish convert to Christianity, Johannes Avetaranian, went to China to spread Christianity to the Uyghurs. Yaqup Istipan, Wu'erkaixi, and Alimujiang Yimiti are other Uyghurs who converted to Christianity.
The Bible was translated into the Kashgari dialect of Turki (Uyghur).
The historical term "Uyghur" was appropriated for the language that had been known as Eastern Turki by government officials in the Soviet Union in 1922 and in Xinjiang in 1934. Sergey Malov was behind the idea of renaming Turki to Uyghur. The use of the term Uyghur has led to anachronisms when describing the history of the people. James Millward deliberately avoided the term Uyghur in one of his books. The name Khāqāniyya was given to the Qarluks who inhabited Kāshghar and Bālāsāghūn; the inhabitants were not Uighur, but their language has been retroactively labelled as Uighur by scholars. The Qarakhanids called their own language the "Turk" or "Kashgar" language and did not use Uighur to describe it. Uighur was used to describe the language of non-Muslims, but Chinese scholars have anachronistically called a Qarakhanid work written by Kashgari "Uighur". The name "Altishahri-Jungharian Uyghur" was used by the Soviet-educated Uyghur Qadir Haji in 1927.
The Uyghur language belongs to the Karluk Turkic (Qarluq) branch of the Turkic language family. It is closely related to Äynu, Lop, Ili Turki, the extinct language Chagatay (the East Karluk languages), and more distantly to Uzbek (which is West Karluk).
Early linguistic scholarly studies of Uyghur include Julius Klaproth's 1812 Dissertation on language and script of the Uighurs (Abhandlung über die Sprache und Schrift der Uiguren) which was disputed by Isaak Jakob Schmidt. In this period, Klaproth correctly asserted that Uyghur was a Turkic language, while Schmidt believed that Uyghur should be classified with Tangut languages.
It is widely accepted that Uyghur has three main dialects, all based on their geographical distribution. Each of these main dialects has a number of sub-dialects, all of which are mutually intelligible to some extent.
The Central dialects are spoken by 90% of the Uyghur-speaking population, while the other two branches of dialects are spoken only by a relatively small minority.
Uyghur is spoken by about 8-11 million people in total. In addition to being spoken primarily in the Xinjiang Uyghur Autonomous Region of Western China, mainly by the Uyghur people, Uyghur was also spoken by some 300,000 people in Kazakhstan in 1993, some 90,000 in Kyrgyzstan and Uzbekistan in 1998, 3,000 in Afghanistan and 1,000 in Mongolia, both in 1982. Smaller communities also exist in Albania, Australia, Belgium, Canada, Germany, Indonesia, Pakistan, Saudi Arabia, Sweden, Taiwan, Tajikistan, Turkey, United Kingdom and the United States (New York City).
The Uyghurs are one of the 56 recognized ethnic groups in China, and Uyghur is an official language of Xinjiang Uyghur Autonomous Region, along with Standard Chinese. As a result, Uyghur can be heard in most social domains in Xinjiang, and also in schools, government and courts. Of the other ethnic minorities in Xinjiang, those populous enough to have their own autonomous prefectures, such as the Kazakhs and the Kyrgyz, have access to schools and government services in their native language. Smaller minorities, however, do not have a choice and must attend Uyghur-medium schools. These include the Xibe, Tajiks, Daurs, and Russians. In some instances Uyghur parents decide to enroll their children at Mandarin schools rather than Uyghur schools because of the better quality of education offered, with the result that many Uyghur children have more trouble learning their native language than Mandarin.
The vowels of the Uyghur language are, in their alphabetical order (in the Latin script), ⟨a⟩, ⟨e⟩, ⟨ë⟩, ⟨i⟩, ⟨o⟩, ⟨ö⟩, ⟨u⟩, ⟨ü⟩. There are no diphthongs in Uyghur; when two vowels come together, which occurs in some loanwords, each vowel retains its individual sound. Current Uyghur orthographies disregard vowel length distinctions.
The Uyghur vowel system is characterised by the oppositions front vs. back, high vs. low and unrounded vs. rounded.
The Uyghur vowel system may be subcategorized on the basis of height, backness and roundness. It has been argued, within a lexical phonology framework, that /e/ has a back counterpart /ɤ/, and modern Uyghur lacks a clear differentiation between /i/ and /ɯ/.
| ||Front unrounded||Front rounded||Back unrounded||Back rounded|
|Close||ɪ, i||y, ʏ||(ɨ), (ɯ)||ʊ, u|
|Open||ɛ, æ||ø||ʌ, ɑ||o, ɔ|
Uyghur vowels are by default short, but some phonologists have argued that long vowels also exist because of historical vowel assimilation (above) and through loanwords. Underlyingly long vowels would resist vowel reduction and devoicing, introduce non-final stress, and be analyzed as |Vj| or |Vr| before a few suffixes. However, the conditions in which they are actually pronounced as distinct from their short counterparts have not been fully researched.
The high vowels undergo some tensing when they occur adjacent to alveolars (s, z, r, l), palatals (j), dentals (t̪, d̪, n̪), and post-alveolar affricates (t͡ʃ, d͡ʒ), e.g. chiraq [t͡ʃʰˈiraq] 'lamp', jenubiy [d͡ʒɛnʊˈbiː] 'southern', yüz [jyz] 'face; hundred', suda [suːˈda] 'in/at (the) water'.
Both [i] and [ɯ] undergo apicalisation after alveodental continuants in unstressed syllables, e.g. siler [sɪ̯læː(r)] 'you (plural)', ziyan [zɪ̯ˈjɑːn] 'harm'. They are medialised after /χ/ or before /l/, e.g. til [tʰɨl] 'tongue', xizmet [χɨzˈmɛt] 'work; job; service'. After velars, uvulars and /f/ they are realised as [e], e.g. giram [ɡeˈrʌm] 'gramme', xelqi [χɛlˈqʰe] 'his [etc.] nation', Finn [fen] 'Finn'. Between two syllables that contain a rounded back vowel each, they are realised as back, e.g. qolimu [qʰɔˈlɯmʊ] 'also his [etc.] arm'.
Any vowel undergoes laxing and backing when it occurs in uvular (/q/, /ʁ/, /χ/) and laryngeal (glottal) (/ɦ/, /ʔ/) environments, e.g. qiz [qʰɤz] 'girl', qëtiq [qʰɤˈtɯq] 'yogurt', qeghez [qʰæˈʁæz] 'paper', qum [qʰʊm] 'sand', qolay [qʰɔˈlʌɪ] 'convenient', qan [qʰɑn] 'blood', ëghiz [ʔeˈʁez] 'mouth', hisab [ɦɤˈsʌp] 'number', hës [ɦɤs] 'hunch', hemrah [ɦæmˈrʌh] 'partner', höl [ɦœɫ] 'wet', hujum [ɦuˈd͡ʒʊm] 'assault', halqa [ɦɑlˈqʰɑ] 'ring'.
Lowering tends to apply to the non-high vowels when a syllable-final liquid assimilates to them, e.g. kör [cʰøː] 'look!', boldi [bɔlˈdɪ] 'he [etc.] became', ders [dæːs] 'lesson', tar [tʰɑː(r)] 'narrow'.
Official Uyghur orthographies do not mark vowel length, and also do not distinguish between /ɪ/ (e.g., بىلىم /bɪlɪm/ 'knowledge') and back /ɯ/ (e.g., تىلىم /tɯlɯm/ 'my language'); these two sounds are in complementary distribution, but phonological analyses claim that they play a role in vowel harmony and are separate phonemes. /e/ only occurs in words of non-Turkic origin and as the result of vowel raising.
Uyghur has systematic vowel reduction (or vowel raising) as well as vowel harmony. Words usually agree in vowel backness, but compounds, loans, and some other exceptions often break vowel harmony. Suffixes surface with the rightmost [back] value in the stem, and /e, ɪ/ are transparent (as they do not contrast for backness). Uyghur also has rounding harmony.
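The backness rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: words are given in the lowercase Latin spelling with ⟨ë⟩ and ⟨i⟩ standing for the transparent vowels /e, ɪ/, the function names are hypothetical, and stems containing only transparent vowels are reported as undetermined because the text does not say how they behave. Rounding harmony and vowel reduction are not modelled.

```python
# Vowel classes in the Latin spelling used in this article (assumed input format).
BACK = set("aou")        # back harmonic vowels
FRONT = set("eöü")       # front harmonic vowels (<e> is the open front vowel)
TRANSPARENT = set("ëi")  # /e, ɪ/: skipped when determining backness


def harmonic_vowels(word):
    """Return the vowels of the word that carry a backness value, in order."""
    return [ch for ch in word.lower() if ch in BACK or ch in FRONT]


def stem_backness(word):
    """Return 'back' or 'front' from the rightmost harmonic vowel, or None
    if the word contains only transparent vowels (or no vowels at all)."""
    vowels = harmonic_vowels(word)
    if not vowels:
        return None
    return "back" if vowels[-1] in BACK else "front"


def is_harmonic(word):
    """True if all harmonic vowels of the word agree in backness."""
    classes = {ch in BACK for ch in harmonic_vowels(word)}
    return len(classes) <= 1


for w in ["qolay", "siler", "hemrah", "til"]:
    print(w, stem_backness(w), is_harmonic(w))
```

With the example words quoted elsewhere in this article, qolay comes out as back and siler as front, hemrah is flagged as breaking harmony (its rightmost value is back), and til is left undetermined.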
Uyghur voiceless stops are aspirated word-initially and intervocalically. The pairs /p, b/, /t, d/, /k, ɡ/, and /q, ʁ/ alternate, with the voiced member devoicing in syllable-final position, except in word-initial syllables. This devoicing process is usually reflected in the official orthography, but an exception has been recently made for certain Perso-Arabic loans. Voiceless phonemes do not become voiced in standard Uyghur.
Suffixes display a slightly different type of consonant alternation. The phonemes /ɡ/ and /ʁ/ anywhere in a suffix alternate as governed by vowel harmony, where /ɡ/ occurs with front vowels and /ʁ/ with back ones. Devoicing of a suffix-initial consonant can occur only in the cases of /d/ → [t], /ɡ/ → [k], and /ʁ/ → [q], when the preceding consonant is voiceless. Lastly, the rule that /g/ must occur with front vowels and /ʁ/ with back vowels can be broken when either [k] or [q] in suffix-initial position becomes assimilated by the other due to the preceding consonant being such.
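As a rough illustration of how the two conditions interact, the sketch below picks one of four variants of a suffix whose initial consonant is /ɡ/ or /ʁ/. The dative suffix and its spellings -gha, -qa, -ge, -ke are brought in here as an illustrative assumption; they are not listed in the text above. The helper names are hypothetical, input is assumed to be lowercase Latin spelling, and the assimilation exception mentioned in the last sentence is not modelled.

```python
BACK = set("aou")    # back harmonic vowels
FRONT = set("eöü")   # front harmonic vowels; <ë> and <i> are transparent
VOICELESS = set("ptkqsxfh")


def stem_backness(stem):
    """Backness of the rightmost harmonic vowel, or None if there is none."""
    for ch in reversed(stem.lower()):
        if ch in BACK:
            return "back"
        if ch in FRONT:
            return "front"
    return None


def ends_voiceless(stem):
    """True if the stem ends in a voiceless consonant (digraphs checked first)."""
    s = stem.lower()
    if s.endswith(("gh", "zh", "ng")):   # voiced sounds spelled with two letters
        return False
    if s.endswith(("ch", "sh")):         # voiceless sounds spelled with two letters
        return True
    return bool(s) and s[-1] in VOICELESS


def dative(stem):
    """Attach the (assumed) dative suffix: the back/front choice follows vowel
    harmony, the voiced/voiceless choice follows the preceding consonant."""
    backness = stem_backness(stem)
    if backness is None:
        raise ValueError(f"cannot determine backness of {stem!r}")
    if backness == "back":
        return stem + ("qa" if ends_voiceless(stem) else "gha")
    return stem + ("ke" if ends_voiceless(stem) else "ge")


for w in ["qum", "yüz", "chiraq", "ders"]:
    print(w, "->", dative(w))
```

Under these assumptions the sketch yields qumgha, yüzge, chiraqqa and derske for stems quoted elsewhere in this article.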
Loan phonemes have influenced Uyghur to various degrees. /d͡ʒ/ and /χ/ were borrowed from Arabic and have been nativized, while /ʒ/ from Persian less so. /f/ only exists in very recent Russian and Chinese loans, since Perso-Arabic (and older Russian and Chinese) /f/ became Uyghur /p/. Perso-Arabic loans have also made the contrast between /k, ɡ/ and /q, ʁ/ phonemic; in native words these occur as allophones, the former set near front vowels and the latter near back vowels. Some speakers of Uyghur distinguish /v/ from /w/ in Russian loans, but this is not represented in most orthographies. Other phonemes occur natively only in limited contexts, namely /h/ only in a few interjections, /d/, /ɡ/, and /ʁ/ rarely initially, and /z/ only morpheme-finally. Therefore, the pairs */t͡ʃ, d͡ʒ/, */ʃ, ʒ/, and */s, z/ do not alternate.
The primary syllable structure of Uyghur is CV(C)(C): syllables are usually CV or CVC, but CVCC can also occur in some words. When syllable-coda clusters occur, CC tends to become CVC for some speakers, especially if the first consonant is not a sonorant. In Uyghur, any consonant phoneme can occur as the syllable onset or coda, except for /ʔ/, which only occurs in the onset, and /ŋ/, which never occurs word-initially. In general, Uyghur phonology tends to simplify phonemic consonant clusters by means of elision and epenthesis.
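The CV(C)(C) template can be checked mechanically once a written syllable is reduced to a consonant/vowel skeleton. The following sketch assumes lowercase Latin spelling in which ch, sh, zh, gh and ng each stand for a single consonant. The function names are hypothetical, and a syllable-initial /ʔ/ that is not written in the Latin spelling is not accounted for.

```python
import re
import unicodedata

VOWELS = set("aeëioöuü")
# Digraphs are listed first so that each is consumed as a single consonant.
TOKEN = re.compile(r"ch|sh|zh|gh|ng|.")
TEMPLATE = re.compile(r"CVC{0,2}")  # CV, CVC or CVCC


def skeleton(word):
    """Map a written form to a string of C and V symbols."""
    text = unicodedata.normalize("NFC", word.lower())
    return "".join("V" if tok in VOWELS else "C" for tok in TOKEN.findall(text))


def fits_template(syllable):
    """True if a single written syllable has the shape CV(C)(C)."""
    return TEMPLATE.fullmatch(skeleton(syllable)) is not None


for s in ["su", "yüz", "ders", "chiraq"]:
    print(s, skeleton(s), fits_template(s))
```

The last item, chiraq, is a two-syllable word, so its skeleton CVCVC does not match the single-syllable template. Splitting words into syllables is deliberately left out of the sketch.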
The Karluk language started to be written with the Perso-Arabic script (Kona Yëziq) in the 10th century upon the conversion of the Kara-Khanids to Islam. This script was reformed in the 20th century, with modifications to represent all Modern Uyghur sounds, including short vowels, and to eliminate Arabic letters representing sounds not found in Modern Uyghur. Unlike many other modern Turkic languages, Uyghur is primarily written using an Arabic alphabet (with four additional letters such as che, pe, zhe and ga), although a Cyrillic alphabet and two Latin alphabets are also in use to a much lesser extent. Unusually for an alphabet based on the Persian one, all vowels are written. (Among the Arabic family of alphabets, only a few, such as Kurdish, distinguish all vowels.)
The four alphabets in use today can be seen below.
|No.||IPA||Arabic||Cyrillic||Latin (Yengi Yëziq)||Latin (ULY)||No.||IPA||Arabic||Cyrillic||Latin (Yengi Yëziq)||Latin (ULY)|
|1||/ɑ/||ئا||А а||A a||A a||17||/q/||ق||Қ қ||Ⱪ ⱪ||Q q|
|2||/ɛ/ ~ /æ/||ئە||Ә ә||Ə ə||E e||18||/k/||ك||К к||K k||K k|
|3||/b/||ب||Б б||B b||B b||19||/ɡ/||گ||Г г||G g||G g|
|4||/p/||پ||П п||P p||P p||20||/ŋ/||ڭ||Ң ң||Ng ng||Ng ng|
|5||/t/||ت||Т т||T t||T t||21||/l/||ل||Л л||L l||L l|
|6||/dʒ/||ج||Җ җ||J j||J j||22||/m/||م||М м||M m||M m|
|7||/tʃ/||چ||Ч ч||Q q||Ch ch||23||/n/||ن||Н н||N n||N n|
|8||/χ/||خ||Х х||H h||X x||24||/h/||ھ||Һ һ||Ⱨ ⱨ||H h|
|9||/d/||د||Д д||D d||D d||25||/o/||ئو||О о||O o||O o|
|10||/r/||ر||Р р||R r||R r||26||/u/||ئۇ||У у||U u||U u|
|11||/z/||ز||З з||Z z||Z z||27||/ø/||ئۆ||Ө ө||Ɵ ɵ||Ö ö|
|12||/ʒ/||ژ||Ж ж||Ⱬ ⱬ||Zh zh||28||/y/||ئۈ||Ү ү||Ü ü||Ü ü|
|13||/s/||س||С с||S s||S s||29||/v/~/w/||ۋ||В в||V v||W w|
|14||/ʃ/||ش||Ш ш||X x||Sh sh||30||/e/||ئې||Е е||E e||Ë ë|
|15||/ʁ/||غ||Ғ ғ||Ƣ ƣ||Gh gh||31||/ɪ/ ~ /i/||ئى||И и||I i||I i|
|16||/f/||ف||Ф ф||F f||F f||32||/j/||ي||Й й||Y y||Y y|
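Because the alphabets in the table correspond letter for letter, conversion between them can be sketched as a lookup. The example below maps the digraph-based Latin spelling (the last Latin column in each half of the table) to Cyrillic, matching two-letter sequences before single letters. It is a minimal sketch with a hypothetical function name. It handles lowercase only, and it does not resolve the ambiguity where an n followed by a separate g would be misread as the digraph ng.

```python
import re

# Letter correspondences taken from the table above (lowercase forms only).
LATIN_TO_CYRILLIC = {
    "ch": "ч", "sh": "ш", "zh": "ж", "gh": "ғ", "ng": "ң",
    "a": "а", "e": "ә", "b": "б", "p": "п", "t": "т", "j": "җ",
    "x": "х", "d": "д", "r": "р", "z": "з", "s": "с", "f": "ф",
    "q": "қ", "k": "к", "g": "г", "l": "л", "m": "м", "n": "н",
    "h": "һ", "o": "о", "u": "у", "ö": "ө", "ü": "ү", "w": "в",
    "ë": "е", "i": "и", "y": "й",
}

# Longest keys first, so that digraphs win over their component letters.
_PATTERN = re.compile(
    "|".join(sorted(map(re.escape, LATIN_TO_CYRILLIC), key=len, reverse=True))
)


def latin_to_cyrillic(text):
    """Transliterate lowercase Latin-script Uyghur into Cyrillic.
    Characters without a table entry (spaces, punctuation) pass through."""
    return _PATTERN.sub(lambda m: LATIN_TO_CYRILLIC[m.group(0)], text.lower())


print(latin_to_cyrillic("uyghur tili"))
print(latin_to_cyrillic("uyghurche"))
```

The two sample inputs are the language's own names as given in the opening paragraph, so the output can be compared against the Cyrillic forms quoted there.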
Uyghur is an agglutinative language with a subject–object–verb word order. Nouns are inflected for number and case, but not for gender or definiteness, unlike in many other languages. There are two numbers, singular and plural, and six cases: nominative, accusative, dative, locative, ablative and genitive. Verbs are conjugated for tense (present and past), voice (causative and passive), aspect (continuous) and mood (e.g. ability). Verbs may be negated as well.
The core lexicon of the Uyghur language is of Turkic stock, but owing to various kinds of language contact throughout its history, it has adopted many loanwords. Kazakh, Uzbek and Chagatai are all Turkic languages which have had a strong influence on Uyghur. Many words of Arabic origin have come into the language through Persian and Tajik, which in turn came through Uzbek and, to a greater extent, Chagatai. Many words of Arabic origin have also entered the language directly through Islamic literature after the introduction of Islam around the 10th century.
Chinese in Xinjiang and Russian elsewhere had the greatest influence on Uyghur. Loanwords from these languages are all quite recent, although older borrowings exist as well, such as borrowings from Dungan, a Mandarin language spoken by the Dungan people of Central Asia. A number of loanwords of German origin have also reached Uyghur through Russian.
Below are some examples of loanwords which have entered the Uyghur language.
|Origin||Source word||Source (in IPA)||Uyghur word||Uyghur (in IPA)||English|
|Arabic||ساعة||/ˈsaːʕat/ (genitive case)||saet سائەت||/saʔɛt/||hour|
|Russian||доктор||[ˈdoktər]||doxtur دوختۇر||/doχtur/||doctor (medical)|
|Russian||область||[ˈobləsʲtʲ]||oblast ئوبلاست||/oblast/||oblast, region|
|Russian||телевизор||[tʲɪlʲɪˈvʲizər]||tëlëwizor تېلېۋىزور||/televizor/||television set|
|Chinese||凉粉 liángfěn||[li̯ɑŋ˧˥fən˨˩]||lempung لەڭپۇڭ||/lɛmpuŋ/||agar-agar jelly|
|Chinese||豆腐 dòufu||[tou̯˥˩fu˩]||dufu دۇفۇ||/dufu/||bean curd/tofu|