Tajik alphabet

The Tajik language has been written in three alphabets over the course of its history: an adaptation of the Perso-Arabic script (specifically the Persian alphabet), an adaptation of the Latin script, and an adaptation of the Cyrillic script. Any script used specifically for Tajik may be referred to as the Tajik alphabet, which is written as алифбои тоҷикӣ in Cyrillic characters, الفبای تاجیکی‎ with Perso-Arabic script, and alifboji toçikī in Latin script.

The use of a specific alphabet generally corresponds with stages in history, with Arabic being used first, followed by Latin for a short period and then Cyrillic, which remains the most widely used alphabet in Tajikistan. The Bukhori dialect spoken by Bukharan Jews traditionally used the Hebrew alphabet but more often today is written using the Cyrillic variant.

The coat of arms of the Tajik Autonomous Soviet Socialist Republic circa 1929. "Tajik Autonomous Soviet Socialist Republic" is written (from top to bottom) in Tajik Latin, Tajik Arabic, and Russian Cyrillic.
Another version of the 1929 coat of arms without Tajik Latin. The Tajik Arabic reads جمهوریت اجتماعی شوروی مختار تاجیکستان

Political context

As in many post-Soviet states, the change of writing system and the debates surrounding it are closely intertwined with political themes. Although it has not been used since the adoption of Cyrillic, the Latin script is supported by those who wish to bring the country closer to Uzbekistan, which has adopted the Latin-based Uzbek alphabet.[1] The Persian alphabet is supported by the devoutly religious, by Islamists, and by those who wish to bring the country closer to Iran and to its Persian heritage. As the de facto standard, the Cyrillic alphabet is generally supported by those who wish to maintain the status quo and not distance the country from Russia.


As a result of the influence of Islam in the region, Tajik was written in the Persian alphabet up to the 1920s. Until that time, the language was not thought of as separate and was simply considered a dialect of the Persian language. The Soviets began by simplifying the Persian alphabet in 1923, before moving to a Latin-based system in 1927.[2] The Latin script was introduced by the Soviet Union as part of an effort to increase literacy and to distance the population, largely illiterate at that time, from Islamic Central Asia. There were also practical considerations. The regular Persian alphabet, being an abjad, does not provide sufficient letters for representing the vowel system of Tajik. In addition, the abjad is more difficult to learn, since each letter has different forms depending on its position in the word.[3]

The Decree on Romanisation made this law in April 1928.[4] The Latin variant for Tajik was based on the work of Turcophone scholars who aimed to produce a unified Turkic alphabet,[5] even though Tajik is not a Turkic language. The literacy campaign was successful, with near-universal literacy being achieved by the 1950s.

As part of the "russification" of Central Asia, the Cyrillic script was introduced in the late 1930s.[1][2][3][4][5] The alphabet remained Cyrillic until the end of the 1980s with the disintegration of the Soviet Union. In 1989, with the growth in Tajik nationalism, a law was enacted declaring Tajik the state language. In addition, the law officially equated Tajik with Persian, placing the word Farsi (the endonym for the Persian language) after Tajik. The law also called for a gradual reintroduction of the Perso-Arabic alphabet.[6][7][8][9][10][11][12][13][14][15][16][17]

The Persian alphabet was introduced into education and public life, although the banning of the Islamic Renaissance Party in 1993 slowed its adoption. In 1999, the word Farsi was removed from the state-language law.[6] As of 2004, the de facto standard in use was the Cyrillic alphabet,[7] and as of 1996 only a very small part of the population could read the Persian alphabet.[8]


The letters of the major versions of the Tajik alphabet are presented below, along with their phonetic values. There is also a comparative table below.

Persian alphabet

A variant of the Persian alphabet (technically an abjad) is used to write Tajik. In the Tajik version, as with all other versions of the Arabic script, with the exception of ا‎ (alef), vowels are not given unique letters, but rather optionally indicated with diacritic marks.

The Tajik alphabet in Persian
ذ د خ ح چ ج ث ت پ ب ا
/z/ /d/ /χ/ /h/ /tʃ/ /dʒ/ /s/ /t/ /p/ /b/ /ɔː/
غ ع ظ ط ض ص ش س ژ ز ر
/ʁ/ /ʔ/ /z/ /t/ /z/ /s/ /ʃ/ /s/ /ʒ/ /z/ /ɾ/
ی ه و ن م ل گ ک ف ق
/j/ /h/ /v/ /n/ /m/ /l/ /ɡ/ /k/ /f/ /q/
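The remark that each letter of the abjad takes different forms depending on its position in the word can be inspected with Unicode data. The following is a minimal sketch, not part of any Tajik standard: the helper name `positional_forms` is hypothetical, and the approach assumes that the compatibility decompositions reported by Python's `unicodedata` module for the Arabic Presentation Forms blocks are an acceptable stand-in for the shapes a font would draw.

```python
import unicodedata

# Tags used in Unicode compatibility decompositions for positional shapes.
POSITION_TAGS = {"<isolated>", "<final>", "<initial>", "<medial>"}

def positional_forms(letter):
    """Return {position: presentation-form character} for one Arabic-script letter."""
    target = f"{ord(letter):04X}"
    forms = {}
    # Arabic Presentation Forms-A (U+FB50..U+FDFF) and Forms-B (U+FE70..U+FEFF).
    for cp in list(range(0xFB50, 0xFE00)) + list(range(0xFE70, 0xFF00)):
        parts = unicodedata.decomposition(chr(cp)).split()
        # Keep single-letter forms only; ligatures decompose to several code points.
        if len(parts) == 2 and parts[0] in POSITION_TAGS and parts[1] == target:
            forms[parts[0].strip("<>")] = chr(cp)
    return forms

for name, letter in [("be", "\u0628"), ("alef", "\u0627")]:
    print(name, sorted(positional_forms(letter)))
```

In this sketch the letter ب is reported with four shapes, while ا, which does not join to a following letter, has only an isolated and a final shape.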


The front page of Kommunisti Isfara from 15 May 1936.

The Latin script was introduced after the Russian Revolution of 1917 in order to facilitate an increase in literacy and to distance the language from Islamic influence. Only lowercase letters were found in the first versions of the Latin variant, between 1926 and 1929. A slightly different version was used by Jews speaking the Bukhori dialect; it included three extra characters for phonemes not found in the other dialects: ů, ə̧, and .[9] (Note that c and ç are switched relative to their usage in the Turkish alphabet, which has formed the basis for other Latin scripts in the former Soviet Union.)

The Tajik alphabet in Latin
A a B ʙ C c Ç ç D d E e F f G g Ƣ ƣ H h I i
/æ/ /b/ /tʃ/ /dʒ/ /d/ /eː/ /f/ /ɡ/ /ʁ/ /h/ /i/
Ī ī J j K k L l M m N n O o P p Q q R r S s
/ˈi/ /j/ /k/ /l/ /m/ /n/ /ɔː/ /p/ /q/ /ɾ/ /s/
Ş ş T t U u Ū ū V v X x Z z Ƶ ƶ ʼ
/ʃ/ /t/ /u/ /ɵː/ /v/ /χ/ /z/ /ʒ/ /ʔ/

The unusual character Ƣ is called Gha and represents the phoneme /ɣ/. The character is found in the Common Turkic Alphabet in which most non-Slavic languages of the Soviet Union were written until the late 1930s. The Latin alphabet is not used today, although its adoption is advocated by certain groups.[10]


The Cyrillic script was introduced in the Tajik Soviet Socialist Republic in the late 1930s, replacing the Latin script that had been used since the October Revolution. After 1939, materials published in Persian in the Persian alphabet were banned from the country.[11] The alphabet below was supplemented by the letters Щ and Ы in 1952.

Text detail from the reverse of the 1 ruble note. The ruble was replaced in 2000 as a result of increasing inflation.
The Tajik alphabet in Cyrillic
А а Б б В в Г г Ғ ғ Д д Е е Ё ё Ж ж З з И и Ӣ ӣ
/æ/ /b/ /v/ /ɡ/ /ʁ/ /d/ /eː/ /jɔː/ /ʒ/ /z/ /i/ /ˈi/
Й й К к Қ қ Л л М м Н н О о П п Р р С с Т т У у
/j/ /k/ /q/ /l/ /m/ /n/ /ɔː/ /p/ /ɾ/ /s/ /t/ /u/
Ӯ ӯ Ф ф Х х Ҳ ҳ Ч ч Ҷ ҷ Ш ш Ъ ъ Э э Ю ю Я я
/ɵː/ /f/ /χ/ /h/ /tʃ/ /dʒ/ /ʃ/ /ʔ/ /eː/ /ju/ /jæ/

In addition to these thirty-five letters, the letters ц, щ, and ы can be found in loanwords, although they were officially dropped in the 1998 reform, along with the letter ь. The 1998 reform also changed the order of the alphabet, which now has the characters with diacritics following their unaltered partners, e.g. г, ғ and к, қ,[12] leading to the present order: а б в г ғ д е ё ж з и ӣ й к қ л м н о п р с т у ӯ ф х ҳ ч ҷ ш ъ э ю я. In 2010 it was suggested that the letters е, ё, ю, and я might be dropped as well.[13] The letters е and э have the same function, except that э is used at the beginning of a word (e.g. Эрон, "Iran").
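Because the post-1998 order places each letter with a diacritic directly after its unaltered partner, sorting Tajik words by raw code point does not reproduce it. The sketch below shows one way to sort by the order quoted above; the names `TAJIK_ORDER` and `tajik_sort_key` are hypothetical, and placing unknown characters after all Tajik letters is an assumption of this example rather than a rule from the reform.

```python
import unicodedata

# The present alphabetical order as quoted in the text (35 letters).
TAJIK_ORDER = unicodedata.normalize("NFC", "абвгғдеёжзиӣйкқлмнопрстуӯфхҳчҷшъэюя")
RANK = {letter: index for index, letter in enumerate(TAJIK_ORDER)}

def tajik_sort_key(word):
    """Map a word to a list of alphabet positions for use as a sort key."""
    normalized = unicodedata.normalize("NFC", word).lower()
    # Assumption: characters outside the alphabet sort after every Tajik letter.
    return [RANK.get(ch, len(RANK) + ord(ch)) for ch in normalized]

words = ["қадам", "ҳофиз", "кадом", "ғор", "санг", "хондан", "барг", "дуд"]
print(sorted(words, key=tajik_sort_key))
print(sorted(words))  # plain code-point order, for comparison
```

With the custom key, ғор sorts between барг and дуд; in plain code-point order it lands after хондан, because ғ is encoded after the basic Cyrillic letters.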

The alphabet includes a number of letters not found in the Russian alphabet:

Description Г with bar И with macron К with descender У with macron Х with descender Ч with descender
Letter Ғ Ӣ Қ Ӯ Ҳ Ҷ
Phoneme /ʁ/ /ˈi/ /q/ /ɵː/ /h/ /dʒ/
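The six letters in this table can be recovered by comparing the two alphabets as sets. The following sketch uses the 35-letter order quoted in this article and the standard 33-letter Russian alphabet; the function name `extra_letters` is hypothetical, and the descriptive names printed come from the Unicode character database rather than from any Tajik source.

```python
import unicodedata

TAJIK = set(unicodedata.normalize("NFC", "абвгғдеёжзиӣйкқлмнопрстуӯфхҳчҷшъэюя"))
RUSSIAN = set(unicodedata.normalize("NFC", "абвгдеёжзийклмнопрстуфхцчшщъыьэюя"))

def extra_letters():
    """Letters of the Tajik Cyrillic alphabet that are absent from Russian."""
    return sorted(TAJIK - RUSSIAN)

for letter in extra_letters():
    print(letter.upper(), letter, unicodedata.name(letter))

# Letters of the Russian alphabet that are absent from the present Tajik order.
print(sorted(RUSSIAN - TAJIK))
```

The second set difference corresponds to the four letters described above as dropped in the 1998 reform.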

During the period when the Cyrillicization took place, Ӷ ӷ also appeared a few times in the table of the Tajik Cyrillic alphabet.[14]

Transliteration standards

The transliteration standards for the Tajik alphabet in Cyrillic into the Latin alphabet are as follows:

Cyrillic IPA ISO 9 (1995) (note 1) KNAB (1981) (note 2) WWS (1996) (note 3) ALA-LC (note 4) Allworth (note 5) BGN/PCGN (note 6)
А а /æ/ a a a a a a
Б б /b/ b b b b b b
В в /v/ v v v v v v
Г г /ɡ/ g g g g g g
Ғ ғ /ʁ/ ġ gh gh gh gh
Д д /d/ d d d d d d
Е е /jeː, eː/ e e, ye e e ye‐, ‐e‐ e
Ё ё /jɔː/ ë yo ë ë yo yo
Ж ж /ʒ/ ž zh zh ž zh zh
З з /z/ z z z z z z
И и /i/ i i i i i i
Ӣ ӣ /i/ ī ī ī ī ī í
Й й /j/ j y ĭ j y y
К к /k/ k k k k k k
Қ қ /q/ ķ q q ķ q q
Л л /l/ l l l l l l
М м /m/ m m m m m m
Н н /n/ n n n n n n
О о /ɔː/ o o o o o o
П п /p/ p p p p p p
Р р /r/ r r r r r r
С с /s/ s s s s s s
Т т /t/ t t t t t t
У у /u/ u u u u u u
Ӯ ӯ /ɵː/ ū ū ū ū ū ŭ
Ф ф /f/ f f f f f f
Х х /χ/ h kh kh x kh kh
Ҳ ҳ /h/ h x h h
Ч ч /tʃ/ č ch ch č ch ch
Ҷ ҷ /dʒ/ ç j j č̦ j j
Ш ш /ʃ/ š sh sh š sh sh
Ъ ъ /ʔ/ ' ' ' ' " '
Э э /eː/ è è, e ė è e ė
Ю ю /ju/ û yu i͡u ju yu yu
Я я /jæ/ â ya i͡a ja ya ya

Notes to the table above:

  1. ISO 9 — The International Organization for Standardization ISO 9 specification.
  2. KNAB — From the placenames database of the Institute of the Estonian Language.
  3. WWS — From World’s Writing Systems, Bernard Comrie (ed.)
  4. ALA-LC — The standard of the Library of Congress and the American Library Association.
  5. Edward Allworth, ed. Nationalities of the Soviet East. Publications and Writing Systems (NY: Columbia University Press, 1971)
  6. BGN/PCGN — The standard of the United States Board on Geographic Names and the Permanent Committee on Geographical Names for British Official Use.
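A transliteration table of this kind translates directly into a character map. The sketch below is an illustration under stated assumptions, not a reference implementation of ISO 9: the function name `to_iso9` is hypothetical, the values follow the ISO 9 column as printed above, and the value for ҳ (ḩ) is taken from the ISO 9 sample transliteration given elsewhere in this article, because the table row for Ҳ is incomplete in this copy.

```python
import unicodedata

# ISO 9 column of the table above, lowercase only. The entry for "ҳ" is an
# assumption taken from the article's ISO 9 sample text, not from the table.
_PAIRS = [
    ("а", "a"), ("б", "b"), ("в", "v"), ("г", "g"), ("ғ", "ġ"), ("д", "d"),
    ("е", "e"), ("ё", "ë"), ("ж", "ž"), ("з", "z"), ("и", "i"), ("ӣ", "ī"),
    ("й", "j"), ("к", "k"), ("қ", "ķ"), ("л", "l"), ("м", "m"), ("н", "n"),
    ("о", "o"), ("п", "p"), ("р", "r"), ("с", "s"), ("т", "t"), ("у", "u"),
    ("ӯ", "ū"), ("ф", "f"), ("х", "h"), ("ҳ", "ḩ"), ("ч", "č"), ("ҷ", "ç"),
    ("ш", "š"), ("ъ", "'"), ("э", "è"), ("ю", "û"), ("я", "â"),
]
ISO9 = {unicodedata.normalize("NFC", c): l for c, l in _PAIRS}

def to_iso9(text):
    """Transliterate Tajik Cyrillic text letter by letter, preserving case."""
    out = []
    for ch in unicodedata.normalize("NFC", text):
        latin = ISO9.get(ch.lower())
        if latin is None:
            out.append(ch)                  # spaces, punctuation, other scripts
        elif ch.isupper():
            out.append(latin.capitalize())
        else:
            out.append(latin)
    return unicodedata.normalize("NFC", "".join(out))

print(to_iso9("Тамоми одамон озод ба дунё меоянд"))
```

Because every ISO 9 value here is a single character, the mapping is reversible in principle, which is one reason one-to-one schemes are preferred for cataloguing even though their output looks less familiar than the digraph-based columns.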


The Hebrew alphabet is, like the Persian alphabet, an abjad. It is used for the Jewish Bukhori dialect, primarily in Samarkand and Bukhara.[18][19] After 1940, when Jewish schools were closed in Central Asia, the Hebrew alphabet fell into disuse outside Hebrew liturgy, and Bukharian Jewish publications such as books and newspapers began to appear in the Tajik Cyrillic alphabet. Today, many older Bukharian Jews who speak Bukharian and went to Tajik or Russian schools in Central Asia know only the Tajik Cyrillic alphabet when reading and writing Bukharian and Tajik.

The Tajik alphabet in Hebrew
ג׳ צ׳ ץ׳ ג גּ בּ ב אֵי אִי אוּ אוֹ אָ אַ
/dʒ/ /tʃ/ /ʁ/ /ɡ/ /b/ /v/ /e/ /i/ /u/ /ɵ/ /ɔ/ /a/
מ ם ל כּ ךּ כ ך י ט ח ז׳ ז ו ה ד
/m/ /l/ /k/ /χ/ /j/ /t/ /ħ/ /ʒ/ /z/ /v/ /h/ /d/
ת שׁ ר ק צ ץ פּ ףּ פ ף ע ס נ ן
/t/ /ʃ/ /r/ /q/ /s/ /p/ /f/ /ʔ/ /s/ /n/

Sample text: דר מוקאבילי זולם איתיפאק נמאייד. מראם נאםה פרוגרמי פירקהי יאש בוכארייאן.‎ – Дар муқобили зулм иттифоқ намоед. Муромнома – пруграми фирқаи ёш бухориён.[15]


Tajik Cyrillic, Tajik Latin and Persian alphabet

Cyrillic: Тамоми одамон озод ба дунё меоянд ва аз лиҳози манзилату ҳуқуқ бо ҳам баробаранд. Ҳама соҳиби ақлу виҷдонанд, бояд нисбат ба якдигар бародарвор муносабат намоянд.

Latin: Tamomi odamon ozod ba dunjo meojand va az lihozi manzilatu huquq bo ham barobarand. Hama sohibi aqlu viçdonand, bojad nisbat ba jakdigar barodarvor munosabat namojand.

Persian: تمام آدمان آزاد به دنیا می‌آیند و از لحاظ منزلت و حقوق با هم برابرند. همه صاحب عقل و وجدانند، باید نسبت به یکدیگر برادروار مناسبت نمایند.

Hebrew: תמאם אדמאן אזאד בה דניא מיאינד ואז לחאז מנזלת וחקוק בא הם בראברנד. המה צאחב עקל וג׳דאננד، באיד נסבת בה יכדיגר בראדרואר מנאסבת נמאינד.

English translation: All human beings are born free and equal in dignity and rights. They are endowed with reason and conscience and should act towards one another in a spirit of brotherhood.

For reference, the Persian script variant transliterated letter-for-letter into the Latin script appears as follows:

tmạm ậdmạn ậzạd bh dnyạ my̱ ậynd w ạz lḥạẓ mnzlt w ḥqwq bạ hm brạbrnd. hmh ṣḥb ʿql w wjdạnnd, bạyd nsbt bh ykdygr brạdrwạr mnạsbt nmạynd.

And the ISO 9 transliteration of the Cyrillic text:

Tamomi odamon ozod ba dunë meoând va az liḩozi manzilatu ḩuķuķ bo ḩam barobarand. Ḩama soḩibi aķlu viçdonand, boâd nisbat ba âkdigar barodarvor munosabat namoând.

Tajik Cyrillic and Persian alphabet

Vowel-pointed Persian includes the vowels that are not usually written.

Cyrillic: Баниодам аъзои як пайкаранд, ки дар офариниш зи як гавҳаранд. Чу узве ба дард оварад рӯзгор, дигар узвҳоро намонад қарор. Саъдӣ
Vowel-pointed Persian: بَنی‌آدَم اَعضایِ یَک پَیکَرَند، که دَر آفَرینِش زِ یَک گَوهَرَند. چو عُضوی به دَرد آوَرَد روزگار، دِگَر عُضوها را نَمانَد قَرار. سعدی
Persian: بنی‌آدم اعضای یک پیکرند، که در آفرینش ز یک گوهرند. چو عضوی به درد آورد روزگار، دگر عضوها را نماند قرار. سعدی
Vowel-pointed Hebrew: בַּנִי־אָדַם אַעְזָאי יַךּ פַּיְכַּרַנְד, כִּה דַר אָפַרִינְשׁ זִ יַךּ גַוְהַרַנְד. צ׳וּ עֻזְוֵי בַּה דַרְד אָוַרַד רוֹזְגָאר דִגַּר עֻזְוְהָא רָא נַמָאיַנְד קַרָאר סַעְדִי.
Hebrew: בני־אדם אעזאי יך פיכרנד, כה דר אפרינש ז יך גוהרנד. צ׳ו עזוי בה דרד אורד רוזגאר דגר עזוהא רא נמאינד קראר סעדי.

Cyrillic: Мурда будам, зинда шудам; гиря будам, ханда шудам. Давлати ишқ омаду ман давлати поянда шудам. Мавлавӣ
Vowel-pointed Persian: مُرده بُدَم، زِنده شُدَم؛ گِریه بُدَم، خَنده شُدَم. دَولَتِ عِشق آمَد و مَن دَولَتِ پایَنده شُدَم. مولوی
Persian: مرده بدم، زنده شدم؛ گریه بدم، خنده شدم. دولت عشق آمد و من دولت پاینده شدم. مولوی
Vowel-pointed Hebrew: מֻרְדַה בֻּדַם זִנְדַה שֻׁדַם; גִּרְיַה בֻּדַם, כַנְדַה שֻׁדַם. דַוְלַתִ עִשְק אָמַד וּמַן דַוְלַתִ פָּאיַנְדַה שֻׁדַם. מַוְלַוִי
Hebrew: מרדה בדם זנדה שדם; גריה בדם, כנדה שדם. דולת עשק אמד ומן דולת פאינדה שדם. מולוי

Comparative table

Advertisement in Cyrillic for the admission of the graduate students by the research institutes of the Tajik Academy of Sciences.
A biscriptal sign incorporating an English word, "Zenith", written in the Latin script, and Tajik written in Cyrillic.
An illustration from Kommunisti Isfara, a newspaper published in Isfara in northern Tajikistan, inviting citizens to vote in an election on 29 December 1939. The word for "December", Dekabr, is clearly visible.

A table comparing the different writing systems used for the Tajik alphabet. The Latin here is based on the 1929 standard, the Cyrillic on the revised 1998 standard, and Persian letters are given in their stand-alone forms.

Cyrillic Latin Persian Phonetic value (IPA) Example
А а A a اَ، ـَ، ـَه /a/ санг = سنگ
Б б B b/ʙ ب /b/ барг = برگ
В в V v و /v/ номвар = نامور
Г г G g گ /ɡ/ санг = سنگ
Ғ ғ Ƣ ƣ غ /ʁ/ ғор = غار, Бағдод = بغداد
Д д D d د /d/ модар = مادر, Бағдод = بغداد
Е е E e ای، ـی /e/ шер = شیر, меравам = می‌روم
Ё ё Jo jo یا /jɔ/ дарё = دریا, осиёб = آسیاب
Ж ж Ƶ ƶ ژ /ʒ/ жола = ژاله, каждум = کژدم
З з Z z ﺯ، ﺫ، ﺽ، ﻅ /z/ баъз = بعض, назар = نظر, заҳоб = ذهاب, замин = زمین
И и I i اِ، ـِ، ـِه؛ اِیـ، ـِیـ /i/ ихтиёр = اختیار
Ӣ ӣ Ī ī ـِی /ˈi/ зебоӣ = زیبائی
Й й J j ی /j/ май = می
К к K k ک /k/ кадом = کَدام
Қ қ Q q ق /q/ қадам = قدم
Л л L l ل /l/ лола = لاله
М м M m م /m/ мурдагӣ = مردگی
Н н N n ن /n/ нон = نان
О о O o آ، ـا /ɔ/ орзу = آرزو
П п P p پ /p/ панҷ = پنج
Р р R r ر /ɾ/ ранг = رنگ
С с S s ﺱ، ﺙ، ﺹ /s/ сар = سر, субҳ = صبح, сурайё = ثریا
Т т T t ﺕ، ﻁ /t/ тоҷик = تاجیک, талаб = طلب
У у U u اُ، ـُ؛ اُو، ـُو /u/ дуд = دُود
Ӯ ӯ Ū ū او، ـو /ɵ/ хӯрдан = خوردن, ӯ = او
Ф ф F f ف /f/ фурӯғ = فروغ
Х х X x خ /χ/ хондан = خواندَن
Ҳ ҳ H h ح، ه /h/ ҳофиз = حافظ
Ч ч C c چ /tʃ/ чӣ = چی
Ҷ ҷ Ç ç ج /dʒ/ ҷанг = جنگ
Ш ш Ş ş ش /ʃ/ шаб = شب
ъ ' ء; ﻉ /ʔ/ таъриф = تعریف
Э э E e ای، ـی /e/ Эрон = ایران
Ю ю Ju ju یُ, یُو /ju/ июн = ایون
Я я Ja ja یَ, یَه /ja/ ягонагӣ = یگانگی
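The Cyrillic and Latin columns of this table can also be read in the opposite direction. The sketch below converts 1929-style Latin text to Cyrillic; it is a simplified illustration with hypothetical names (`latin_to_cyrillic`, `SINGLE`, `DIGRAPH`). It assumes that the sequences jo, ju and ja always stand for ё, ю and я, and that a word-initial e corresponds to э, following the statement in this article that э is used at the beginning of a word. Real texts may contain cases these two rules do not cover.

```python
import unicodedata

def _nfc(s):
    return unicodedata.normalize("NFC", s)

# Single-letter correspondences read off the comparative table (lowercase).
SINGLE = {_nfc(k): _nfc(v) for k, v in {
    "a": "а", "b": "б", "ʙ": "б", "c": "ч", "ç": "ҷ", "d": "д", "e": "е",
    "f": "ф", "g": "г", "ƣ": "ғ", "h": "ҳ", "i": "и", "ī": "ӣ", "j": "й",
    "k": "к", "l": "л", "m": "м", "n": "н", "o": "о", "p": "п", "q": "қ",
    "r": "р", "s": "с", "ş": "ш", "t": "т", "u": "у", "ū": "ӯ", "v": "в",
    "x": "х", "z": "з", "ƶ": "ж", "ʼ": "ъ", "'": "ъ",
}.items()}
# Assumption: these Latin pairs always map to one iotated Cyrillic letter.
DIGRAPH = {"jo": "ё", "ju": "ю", "ja": "я"}

def latin_to_cyrillic(text):
    """Convert 1929-style Tajik Latin text to Cyrillic (simplified sketch)."""
    text = _nfc(text)
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        pair = text[i:i + 2].lower()
        word_start = i == 0 or not text[i - 1].isalpha()
        if pair in DIGRAPH:
            cyr, step = DIGRAPH[pair], 2
        elif ch.lower() == "e" and word_start:
            cyr, step = "э", 1              # word-initial e is written э
        elif ch.lower() in SINGLE:
            cyr, step = SINGLE[ch.lower()], 1
        else:
            cyr, step = ch, 1               # punctuation, spaces, digits
        out.append(cyr.upper() if ch.isupper() else cyr)
        i += step
    return _nfc("".join(out))

print(latin_to_cyrillic("Tamomi odamon ozod ba dunjo meojand"))
print(latin_to_cyrillic("Eron"))
```

The digraphs are checked before single letters so that j is only read as й when it is not followed by o, u or a.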


  1. ^ Schlyter, B. N. (2003) Sociolinguistic Changes in Transformed Central Asian Societies
  2. ^ Keller, S. (2001) To Moscow, Not Mecca: The Soviet Campaign Against Islam in Central Asia, 1917-1941
  3. ^ Dickens, M. (1988) Soviet Language Policy in Central Asia
  4. ^ Khudonazar, A. (2004) "The Other" in Berkeley Program in Soviet and Post-Soviet Studies, 1 November 2004.
  5. ^ Perry, J. R. (2005) A Tajik Persian Reference Grammar (Boston : Brill) p. 34
  6. ^ Siddikzoda, S. "Tajik Language: Farsi or not Farsi?" in Media Insight Central Asia #27, August 2002
  7. ^ UNHCHR – Committee for the Elimination of Racial Discrimination – Summary Record of the 1659th Meeting : Tajikistan. 17 August 2004. CERD/C/SR.1659
  8. ^ Library of Congress Country Study – Tajikistan
  9. ^ Perry, J. R. (2005) A Tajik Persian Reference Grammar (Boston : Brill) p. 35
  10. ^ Schlyter, B. N. (2003) Sociolinguistic Changes in Transformed Central Asian Societies
  11. ^ Perry, J. R. (1996) "Tajik literature: Seventy years is longer than the millennium" in World Literature Today, Vol. 70 Issue 3, p. 571
  12. ^ Perry, J. R. (2005) A Tajik Persian Reference Grammar (Boston : Brill) p. 36
  13. ^ Судьба «русских букв» в таджикском алфавите будет решаться
  14. ^ Ido, S. (2005) Tajik (München : Lincom GmbH) p. 8
  15. ^ Rzehak, L. (2001) Vom Persischen zum Tadschikischen. Sprachliches Handeln und Sprachplanung in Transoxanien zwischen Tradition, Moderne und Sowjetunion (1900-1956) (Wiesbaden : Reichert)
  16. ^ IBM – International Components for Unicode – ICU Transform Demonstration


  1. ^ ed. Hämmerle 2008, p. 76.
  2. ^ Cavendish 2006, p. 656.
  3. ^ Landau & Kellner-Heinkele 2001, p. 125.
  4. ^ ed. Buyers 2003, p. 132.
  5. ^ Borjian 2005.
  6. ^ ed. Ehteshami 2002, p. 219.
  7. ^ ed. Malik 1996, p. 274.
  8. ^ Banuazizi & Weiner 1994, p. 33.
  9. ^ Westerlund & Svanberg 1999, p. 186.
  10. ^ ed. Gillespie & Henry 1995, p. 172.
  11. ^ Badan 2001, p. 137.
  12. ^ Winrow 1995, p. 47.
  13. ^ Parsons 1993, p. 8.
  14. ^ RFE/RL, inc, RFE/RL Research Institute 1990, p. 22.
  15. ^ Middle East Institute (Washington, D.C.) 1990, p. 10.
  16. ^ Ochsenwald & Fisher 2010, p. 416.
  17. ^ Gall 2009, p. 785.
  18. ^ Gitelman, Zvi Y (2001). A Century of Ambivalence: The Jews of Russia and the Soviet Union, 1881 to the Present. Indiana University Press. p. 203. ISBN 9780253214188.
  19. ^ Изд-во Академии наук СССР (1975). "Вопросы языкознания". Вопросы языкознания: 39.

See also

Aimaq dialect

Aimaq or Aimaqi (Aimaq: ایماقی‎) is the dominant eastern Persian ethnolect spoken by the Aimaq people in central northwest Afghanistan (west of the Hazarajat), eastern Iran, and Tajikistan. It is close to the Khorasani and Dari varieties of Persian. The Aimaq people are thought to have a 5–15% literacy rate.

Dari language

Darī (Dari: دری‎ [daˈɾiː]) or Dari Persian (فارسی دری Fārsī-ye Darī [fɒːɾsije daˈɾiː]) or synonymously Farsi (فارسی Fārsī [fɒːɾsiː]) is a variety of the Persian language spoken in Afghanistan. Dari is the term officially recognized and promoted since 1964 by the Afghan government for the Persian language; hence, it is also known as Afghan Persian in many Western sources. This has resulted in a naming dispute. Many Persian speakers in Afghanistan prefer and use the name "Farsi" and say the term Dari has been forced on them by the dominant Pashtun ethnic group as an attempt to distance Afghans from their cultural, linguistic, and historical ties to the Persian-speaking nations, which include Iran and Tajikistan.

As defined in the Constitution of Afghanistan, it is one of the two official languages of Afghanistan; the other is Pashto. Dari is the most widely spoken language in Afghanistan and the native language of approximately 25–50% of the population, serving as the country's lingua franca. The Iranian and Afghan types of Persian are mutually intelligible, with differences found primarily in the vocabulary and phonology.

By way of Early New Persian, Dari Persian, like Iranian Persian and Tajik, is a continuation of Middle Persian, the official religious and literary language of the Sassanian Empire (224–651 CE), itself a continuation of Old Persian, the language of the Achaemenids (550–330 BC). In historical usage, Dari refers to the Middle Persian court language of the Sassanids.

Hazaragi dialect

Hazaragi (Persian: هزارگی‎, Hazaragi: آزرگی‎, Azaragi) is an eastern variety of Persian that is spoken by the Hazara people, primarily in the Hazarajat region of central Afghanistan, as well as in other Hazara-populated areas of Afghanistan. It is also spoken by the Hazaras of Pakistan and Iran and by the Hazara diaspora living elsewhere. It is mutually intelligible with Dari, one of the two official languages of Afghanistan.

Latinisation in the Soviet Union

In the USSR, latinisation (Russian: латиниза́ция latinizatsiya) was the name of the campaign during the 1920s–1930s which aimed to replace traditional writing systems for all languages of the Soviet Union with systems that would use the Latin script or to create Latin-script based systems for languages that, at the time, did not have a writing system.

Persian language

Persian, also known by its endonym Farsi (فارسی, fārsi, [fɒːɾˈsiː]), is one of the Western Iranian languages within the Indo-Iranian branch of the Indo-European language family. It is a pluricentric language primarily spoken in Iran, Afghanistan (officially known as Dari since 1958) and Tajikistan (officially known as Tajiki since the Soviet era), as well as in Uzbekistan and some other regions which historically were Persianate societies and considered part of Greater Iran. It is written right to left in the Persian alphabet, a modified variant of the Arabic script.

The Persian language is classified as a continuation of Middle Persian, the official religious and literary language of the Sasanian Empire, itself a continuation of Old Persian, the language of the Achaemenid Empire. Its grammar is similar to that of many contemporary European languages. A Persian-speaking person may be referred to as Persophone.

There are approximately 110 million Persian speakers worldwide, with the language holding official status in Iran, Afghanistan, and Tajikistan. For centuries, Persian has also been a prestigious cultural language in other regions of Western Asia, Central Asia, and South Asia by the various empires based in the regions.

Persian has had a considerable (mainly lexical) influence on neighboring languages, particularly the Turkic languages in Central Asia, Caucasus, and Anatolia, neighboring Iranian languages, as well as Armenian, Georgian, and Indo-Aryan languages, especially Urdu (a register of Hindustani). It also exerted some influence on Arabic, particularly Bahrani Arabic, while borrowing much vocabulary from it after the Arab conquest of Iran.

With a long history of literature in the form of Middle Persian before Islam, Persian was the first language in the Muslim world to break through Arabic's monopoly on writing, and the writing of poetry in Persian was established as a court tradition in many eastern courts. Some of the famous works of Persian literature are the Shahnameh of Ferdowsi, the works of Rumi, the Rubaiyat of Omar Khayyam, the Panj Ganj of Nizami Ganjavi, the Divān of Hafez and the two miscellanea of prose and verse by Saadi Shirazi, the Gulistan and the Bustan.

Persian phonology

The Persian language has between six and eight vowel phonemes and twenty-three consonant phonemes. It features contrastive stress and syllable-final consonant clusters.

Romanization of Persian

Romanization of Persian or Latinization of Persian is the representation of the Persian language (Farsi, Dari and Tajik) with the Latin script. Several different romanization schemes exist, each with its own set of rules driven by its own set of ideological goals.


Samarkand

Samarkand (Uzbek: Samarqand; Persian: سمرقند‎; Russian: Самарканд), alternatively Samarqand, is a city in southeastern Uzbekistan and one of the oldest continuously inhabited cities in Central Asia. There is evidence of human activity in the area of the city from the late Paleolithic era, though there is no direct evidence of when Samarkand was founded; some theories propose that it was founded between the 8th and 7th centuries BC. Prospering from its location on the Silk Road between China and the Mediterranean, at times Samarkand was one of the greatest cities of Central Asia.

By the time of the Achaemenid Empire of Persia, it was the capital of the Sogdian satrapy. The city was taken by Alexander the Great in 329 BC, when it was known by its Greek name of Marakanda. The city was ruled by a succession of Iranian and Turkic rulers until the Mongols under Genghis Khan conquered Samarkand in 1220. Today, Samarkand is the capital of Samarqand Region and Uzbekistan's second largest city.

The city is noted for being an Islamic centre for scholarly study. In the 14th century it became the capital of the empire of Timur (Tamerlane) and is the site of his mausoleum (the Gur-e Amir). The Bibi-Khanym Mosque, rebuilt during the Soviet era, remains one of the city's most notable landmarks. Samarkand's Registan square was the ancient centre of the city, and is bound by three monumental religious buildings. The city has carefully preserved the traditions of ancient crafts: embroidery, gold embroidery, silk weaving, engraving on copper, ceramics, carving and painting on wood. In 2001, UNESCO added the city to its World Heritage List as Samarkand – Crossroads of Cultures.

Modern-day Samarkand is divided into two parts: the old city, and the new city developed during the days of the Russian Empire and Soviet Union. The old city includes historical monuments, shops and old private houses, while the new city includes administrative buildings along with cultural centres and educational institutions.

Tajik grammar

This article describes the grammar of the standard Tajik language as spoken and written in Tajikistan. In general, the grammar of the Tajik language fits the analytical type. Little remains of the case system, and grammatical relationships are primarily expressed via clitics, word order and other analytical constructions. Like other modern varieties of Persian, Tajik grammar is almost identical to the classic Persian grammar, although there are differences in some verb tenses.

Tajik language

Tajik or Tajiki (Tajik: забо́ни тоҷикӣ́, zaboni tojikī [zaˈbɔni tɔdʒiˈki]), or synonymously Farsi, also called Tajiki Persian (Tajik: форси́и тоҷикӣ́, forsii tojikī, [fɔrˈsiji tɔdʒiˈki]), is the variety of Persian spoken in Tajikistan and Uzbekistan, and it is closely related to Dari Persian. Since the beginning of the twentieth century and the independence of Tajikistan from the Soviet Union, Tajik has been considered by a number of writers and researchers to be a variety of Persian (Halimov 1974: 30–31, Oafforov 1979: 33). The popularity of this conception of Tajik as a variety of Persian was such that, during the period in which Tajik intellectuals were trying to establish Tajik as a language separate from the Persian language, Sadriddin Ayni, who was a prominent intellectual and educator, made a statement that Tajik was not a bastardized dialect of Persian. The issue of whether Tajik and Persian are to be considered two dialects of a single language or two discrete languages has political sides to it (see Perry 1996).

Tajik is the official language of Tajikistan. In Afghanistan (where the Tajik minority forms the principal part of the wider Persophone population), this language is less influenced by Turkic languages, is regarded as a form of Dari, and as such has co-official language status. The Tajik of Tajikistan has diverged from Persian as spoken in Afghanistan and Iran due to political borders, geographical isolation, the standardization process, and the influence of Russian and neighboring Turkic languages. The standard language is based on the northwestern dialects of Tajik (the region of the old major city of Samarqand), which have been somewhat influenced by the neighboring Uzbek language as a result of geographical proximity. Tajik also retains numerous archaic elements in its vocabulary, pronunciation, and grammar that have been lost elsewhere in the Persophone world, in part due to its relative isolation in the mountains of Central Asia.


Tajiks

Tajiks (Persian: تاجيک‎: Tājīk, Tajik: Тоҷик) are a Persian-speaking Iranian ethnic group native to Afghanistan, Tajikistan, and Uzbekistan. Tajiks are the largest ethnicity in Tajikistan and the second largest in Afghanistan, which is home to over half of the global Tajik population. They speak varieties of Persian, a Western Iranian language. In Tajikistan, since the 1939 Soviet census, its small Pamiri and Yaghnobi ethnic groups are included as Tajiks. In China, the term is used to refer to its Pamiri ethnic groups, the Tajiks of Xinjiang, who speak the Eastern Iranian Pamiri languages. In Afghanistan, the Pamiris are counted as a separate ethnic group.

As a self-designation, the literary New Persian term Tajik, which originally had some previous pejorative usage as a label for eastern Persians or Iranians, has become acceptable during the last several decades, particularly as a result of Soviet administration in Central Asia. Alternative names for the Tajiks are Eastern Persian, Fārsīwān (Persian-speaker), and Dīhgān (cf. Tajik: Деҳқон), which translates to "farmer or settled villager", in a wider sense "settled" in contrast to "nomadic", and was later used to describe a class of land-owning magnates as "Persian of noble blood" in contrast to Arabs, Turks and Romans during the Sassanid and early Islamic period.

Yaghnobi language

The Yaghnobi language is a living Eastern Iranian language (the other living members being Pashto, Ossetic and the Pamir languages). Yaghnobi is spoken in the upper valley of the Yaghnob River in the Zarafshan area of Tajikistan by the Yaghnobi people. It is considered to be a direct descendant of Sogdian and has often been called Neo-Sogdian in academic literature.

There are some 12,500 Yaghnobi speakers. They are divided into several communities. The principal group lives in the Zafarobod area. There are also resettlers in the Yaghnob Valley. Some communities live in the villages of Zumand and Kůkteppa and in Dushanbe or in its vicinity.

Most Yaghnobi speakers are bilingual in the West Iranian Tajik. Yaghnobi is mostly used for daily family communication, and Tajik is used by Yaghnobi-speakers for business and formal transactions. A single Russian ethnographer was told by nearby Tajiks, long hostile to the Yaghnobis, who were late to adopt Islam, that the Yaghnobis used their language as a "secret" mode of communication to confuse the Tajiks. The account led to the belief by some, especially those reliant solely on Russian sources, that Yaghnobi or some derivative of it was used as a code for nefarious purposes.

There are two main dialects: a western and an eastern one. They differ primarily in phonetics. For example, historical *θ corresponds to t in the western dialects and s in the eastern: met – mes 'day' from Sogdian mēθ ⟨myθ⟩. Western ay corresponds to Eastern e: wayš – weš 'grass' from Sogdian wayš or wēš ⟨wyš⟩. The early Sogdian group θr (later ṣ̌) is reflected as sar in the east but tir in the west: saráy – tiráy 'three' from Sogdian θrē/θray or ṣ̌ē/ṣ̌ay ⟨δry⟩. There are also some differences in verbal endings and in the lexicon. In between the two main dialects is a transitional dialect that shares some features of both other dialects.


This page is based on a Wikipedia article written by its authors.
Text is available under the CC BY-SA 3.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.