Languages of Asia

There is a wide variety of languages spoken throughout Asia, comprising different language families and some unrelated isolates. The major language families spoken on the continent include Altaic, Austroasiatic, Austronesian, Caucasian, Dravidian, Indo-European (including its Indo-Aryan branch), Afroasiatic, Siberian, Sino-Tibetan and Tai-Kadai. Many of these languages, though not all, have a long tradition of writing.

Language families of Asia
Of the many language families of Asia, Indo-European (purple, blue, and medium green) and Sino-Tibetan (chartreuse and pink) dominate numerically, while Altaic families (grey, bright green, and maroon) occupy large areas geographically. The Indo-Aryan branch of Indo-European includes Sindhi, spoken in Pakistan and India. Regionally dominant families are Japonic in Japan, Austronesian in the Malay Archipelago (dark red), Kadai and Mon–Khmer in Southeast Asia (azure and peach), Dravidian in South India (khaki), Turkic in Central Asia (grey), and Semitic in the Middle East (orange).

Language groups

Ethnolinguistic distribution in Central/Southwest Asia of the Altaic, Caucasian, Afroasiatic (Hamito-Semitic) and Indo-European families.

The major families in terms of numbers of speakers are Indo-European (chiefly its Indo-Aryan branch) and Dravidian in South Asia, and Sino-Tibetan in East Asia. Several other families are regionally dominant.


Sino-Tibetan includes Chinese, Tibetan, Burmese, Karen and numerous languages of the Tibetan Plateau, southern China, Burma, and Northeast India.


The Indo-European languages are primarily represented by the Indo-Iranian branch. The family includes both Indic languages (Hindi, Urdu, Bengali, Punjabi, Sindhi, Kashmiri, Marathi, Gujarati, Sinhalese and other languages spoken primarily in South Asia) and Iranian (Persian, Kurdish, Pashto, Balochi and other languages spoken primarily in Iran, Anatolia, Mesopotamia, Central Asia, the Caucasus and parts of South Asia). In addition, other branches of Indo-European spoken in Asia include the Slavic branch, which includes Russian in Siberia; Greek around the Black Sea; and Armenian; as well as extinct languages such as Hittite of Anatolia and Tocharian of (Chinese) Turkestan.

Altaic families

A number of smaller but important language families spread across central and northern Asia have long been linked in an as-yet unproven Altaic family. These are the Turkic, Mongolic, Tungusic (including Manchu), Koreanic, and Japonic languages. Speakers of Turkish (Anatolian Turks) are believed to have adopted the language, their ancestors having originally spoken the Anatolian languages, an extinct group belonging to the Indo-European family.[1]


The Mon–Khmer languages (also known as Austroasiatic) are the oldest family in Asia. Languages given official status are Vietnamese and Khmer (Cambodian).


The Kra–Dai languages (also known as Tai-Kadai) are found in southern China, Northeast India and Southeast Asia. Languages given official status are Thai (Siamese) and Lao.


The Austronesian languages are widespread throughout Maritime Southeast Asia and the Pacific, and include major languages such as Fijian (Fiji), Tagalog (Philippines), and Malay (Malaysia, Singapore, and Brunei). Javanese, Sundanese, and Madurese of Indonesia belong to this family as well.


The Dravidian languages of southern India and parts of Sri Lanka include Tamil, Kannada, Telugu, and Malayalam, while smaller languages such as Gondi and Brahui are spoken in central India and Pakistan respectively.


The Afroasiatic languages (in older sources Hamito-Semitic), particularly the Semitic branch, are spoken in Western Asia. The Semitic branch includes Arabic, Hebrew and Aramaic, in addition to extinct languages such as Akkadian. The Modern South Arabian languages contain a substratum influence from the Cushitic branch of Afroasiatic, which suggests that Cushitic speakers originally inhabited the Arabian Peninsula alongside Semitic speakers.[2]

Siberian families

Besides the Altaic families already mentioned (of which Tungusic is today a minor family of Siberia), there are a number of small language families and isolates spoken across northern Asia. These include the Uralic languages of western Siberia (better known for Hungarian and Finnish in Europe), the Yeniseian languages (linked to Turkic and to the Athabaskan languages of North America), Yukaghir, Nivkh of Sakhalin, Ainu of northern Japan, Chukotko-Kamchatkan in easternmost Siberia, and—just barely—Eskimo–Aleut. Some linguists have noted that the Koreanic languages share more similarities with the Paleosiberian languages than with the Altaic languages. The extinct Ruan-ruan language of Mongolia is unclassified, and does not show genetic relationships with any other known language family.

Caucasian families

Three small families are spoken in the Caucasus: Kartvelian languages, such as Georgian; Northeast Caucasian (Dagestanian languages), such as Chechen; and Northwest Caucasian, such as Circassian. The latter two may be related to each other. The extinct Hurro-Urartian languages may be related as well.

Small families of Southern Asia

Although the region is dominated by major languages and families, there are a number of minor families and isolates in South Asia and Southeast Asia. From west to east, these include:

Creoles and pidgins

The eponymous pidgin ("business") language developed with European trade in China. Of the many creoles to have developed, the most widely spoken today are Chavacano, a Spanish-based creole of the Philippines, and various Malay-based creoles such as Manado Malay, which has been influenced by Portuguese. A well-known Portuguese-based creole is Kristang, which is spoken in Malacca, a state in Malaysia.

Sign languages

A number of sign languages are used throughout Asia. These include the Japanese Sign Language family, Chinese Sign Language, Indo-Pakistani Sign Language, as well as a number of small indigenous sign languages of countries such as Nepal, Thailand, and Vietnam. Many official sign languages are part of the French Sign Language family.

Official languages

Asia and Europe are the only two continents where most countries use native languages as their official languages, though English is also widespread as an international language.

Language Native name Speakers Language Family Official Status in a Country Official Status in a Region
Abkhaz Aԥсшәа 240,000 Northwest Caucasian  Abkhazia  Georgia
Arabic العَرَبِيَّة 230,000,000 Afro-Asiatic  Qatar,  Jordan,  Saudi Arabia,  Iraq,  Yemen,  Kuwait,  Bahrain,  Syria,  Palestine(observer state),  Lebanon,  Oman,  UAE,  Israel
Armenian հայերեն 5,902,970 Indo-European  Armenia,  Nagorno-Karabakh
Assamese অসমীয়া 15,000,000 Indo-European  India (in Assam)
Azerbaijani Azərbaycanca 37,324,060 Turkic  Azerbaijan  Iran
Bangla বাংলা 230,000,000 Indo-European  Bangladesh  India (in West Bengal, Tripura, Assam, Andaman and Nicobar islands and Jharkhand)
Tamazight Tamazight 26,000,000 Amazigh  Jordan  Syria
Bodo Boro 1,984,569 Sino-Tibetan  India (in Bodoland)
Burmese မြန်မာစာ 33,000,000 Sino-Tibetan  Myanmar
Cantonese 廣東話/广东话 7,877,900 Sino-Tibetan  Hong Kong  Macau
Mandarin Chinese 普通話/普通话,國語/国语,華語/华语 1,200,000,000 Sino-Tibetan  China,  Taiwan,  Singapore,  Malaysia
Dari دری 19,600,000 Indo-European  Afghanistan
Dhivehi ދިވެހި 400,000 Indo-European  Maldives
Dzongkha རྫོང་ཁ་ 600,000 Sino-Tibetan  Bhutan
English English 301,625,412 Indo-European  Hong Kong, Philippines,  Singapore,  India,  Pakistan,  Malaysia
Filipino Wikang Tagalog 110,784,442 Austronesian  Philippines
Formosan 171,855 Austronesian  Taiwan
Georgian ქართული 4,200,000 Kartvelian  Georgia
Gujarati ગુજરાતી 50,000,000 Indo-European  India (in Gujarat, Daman and Diu and Dadra and Nagar Haveli)
Hakka Thòi-vàn Hak-fa 2,370,000 Sino-Tibetan  Taiwan
Hebrew עברית 7,000,000 Afro-Asiatic  Israel
Hindi हिन्दी 550,000,000 Indo-European  India
Indonesian Bahasa Indonesia 240,000,000 Austronesian  Indonesia  East Timor (as a working language)
Japanese 日本語 120,000,000 Japonic  Japan
Kannada ಕನ್ನಡ 51,000,000 Dravidian  India (in Karnataka)
Karen ကညီကျိး 6,000,000 Sino-Tibetan  Myanmar (in Kayin State)
Kazakh Қазақша 18,000,000 Turkic  Kazakhstan  Russia
Khmer ភាសាខ្មែរ 14,000,000 Austroasiatic  Cambodia
Korean 한국어/조선말 80,000,000 Koreanic  South Korea,  North Korea  China (in Yanbian and Changbai)
Kurdish Kurdî/کوردی 20,000,000 Indo-European  Iraq  Iran
Kyrgyz кыргызча 2,900,000 Turkic  Kyrgyzstan
Lao ພາສາລາວ 7,000,000 Tai-Kadai  Laos
Malay Bahasa Melayu/بهاس ملايو 30,000,000 Austronesian  Malaysia,  Brunei,  Singapore
Malayalam മലയാളം 33,000,000 Dravidian  India (in Kerala, Lakshadweep and Mahe)
Marathi मराठी 73,000,000 Indo-European  India (in Maharashtra and Dadra and Nagar Haveli)
Mongolian Монгол хэл/ᠮᠣᠩᠭᠣᠯ 2,000,000 Mongolic  Mongolia  China (in Inner Mongolia)
Nepali नेपाली 29,000,000 Indo-European    Nepal  India (in Sikkim and West Bengal)
Odia ଓଡ଼ିଆ 33,000,000 Indo-European  India (in Odisha and Jharkhand)
Ossetian Ирон 540,000 (50,000 in South Ossetia) Indo-European  South Ossetia  Russia (in  North Ossetia–Alania )
Pashto پښتو 45,000,000 Indo-European  Afghanistan  Pakistan
Persian فارسی 50,000,000 Indo-European  Iran
Punjabi پنجابی / ਪੰਜਾਬੀ 100,000,000 Indo-European  India (in Punjab, India, Haryana, Delhi and Chandigarh)  Pakistan (in Punjab, Pakistan)
Portuguese Português 1,200,000 Indo-European  Timor Leste  Macau
Russian Русский 260,000,000 Indo-European  Abkhazia,  Kazakhstan,  Kyrgyzstan,  Russia,  South Ossetia  Uzbekistan,  Tajikistan and  Turkmenistan (as an inter-ethnic language)
Saraiki سرائیکی 18,179,610 Indo-European  Pakistan (in Bahawalpur )
Sinhala සිංහල 18,000,000 Indo-European  Sri Lanka
Tamil தமிழ் 77,000,000 Dravidian  Sri Lanka,  Singapore  India (in Tamil Nadu, Andaman and Nicobar islands and Puducherry)
Telugu తెలుగు 79,000,000 Dravidian  India (in Andhra Pradesh, Telangana, Andaman and Nicobar islands, Puducherry)
Taiwanese Hokkien 臺語 18,570,000 Sino-Tibetan  Taiwan
Tajik тоҷикӣ 7,900,000 Indo-European  Tajikistan
Tetum Lia-Tetun 500,000 Austronesian  Timor Leste
Thai ภาษาไทย 60,000,000 Tai-Kadai  Thailand
Tulu ತುಳು 1,722,768 Dravidian  India (in Mangalore, Udupi, Kasargod, Mumbai)
Turkish Türkçe 70,000,000 Turkic  Turkey,  Cyprus,  Northern Cyprus
Turkmen Türkmençe 7,000,000 Turkic  Turkmenistan
Urdu اُردُو 62,120,540 Indo-European  Pakistan  India (in Jammu and Kashmir, Telangana, Delhi, Bihar and Uttar Pradesh)
Uzbek Oʻzbekcha/ Ўзбекча 25,000,000 Turkic  Uzbekistan
Vietnamese Tiếng Việt 80,000,000 Austroasiatic  Vietnam

References


  1. Rosser, Z.; et al. (2000). "Y-Chromosomal Diversity in Europe is Clinal and Influenced Primarily by Geography, Rather than by Language" (PDF). American Journal of Human Genetics. 67 (6): 1526–1543. doi:10.1086/316890. PMC 1287948. PMID 11078479.
  2. Blažek, Václav. "Afroasiatic Migrations: Linguistic Evidence" (PDF). Retrieved 25 September 2017.
  3. Blench, Roger. 2015. The Mijiic languages: distribution, dialects, wordlist and classification. Manuscript.

Dicamay Agta language

Dicamay Agta is an extinct Aeta language of the northern Philippines. The Dicamay Agta lived on the Dicamay River, on the western side of the Sierra Madre near Jones, Isabela. The Dicamay Agta were killed by Ilokano homesteaders sometime between 1957 and 1974 (Lobel 2013:98).

Richard Roe collected a Dicamay word list of 291 words in 1957.

Hoti language

Hoti is an extinct language of Seram, Indonesia.

Jiamao language

Jiamao (Chinese: 加茂; pinyin: Jiāmào; also 台 Tái or 塞 Sāi) is a language isolate spoken in southern Hainan, China. Jiamao speakers' autonym is tʰai1.

Kenaboi language

Kĕnaboi is an extinct unclassified language of Negeri Sembilan, Malaysia that may be a language isolate or an Austroasiatic language belonging to the Aslian branch. It is attested in what appear to be two dialects,[1][2] based on two word lists of about 250 lexical items collected around 1880 by D.F.A. Hervey that are cited in Blagden (1906).

Khazar language

Khazar, also known as Khazaric or Khazaris, was the Turkic language spoken by the Khazars, a group of semi-nomadic Turkic peoples originating from Central Asia. There are few written records of the language, and it is regarded as extinct. Scholars dispute which branch of the Turkic language family it belongs to: one view places it in the Oghur ("lir") branch, while another places it in the Common Turkic branch.

Khitan language

Khitan or Kitan (Khitai; Chinese: 契丹語, Qìdānyǔ), also known as Liao, is a now-extinct language once spoken by the Khitan people (4th to 13th century). It was the official language of the Liao Empire (907–1125) and the Qara Khitai (1124–1218).

Lelak language

Lelak is an extinct language of Malaysian Borneo. The Lelak people now speak Berawan.

Makuva language

Makuva, also known as Maku'a or Lóvaia, is an apparently extinct Austronesian language spoken at the northeast tip of East Timor near the town of Tutuala.

Makuva has been heavily influenced by neighboring East Timorese Papuan languages, to the extent that it was long thought to be a Papuan language. The ethnic population was 50 in 1981, but the younger generation uses Fataluku as their first or second language.

Mysian language

The Mysian language was spoken by Mysians inhabiting Mysia in north-west Anatolia.

Little is known about the Mysian language. Strabo noted that their language was, in a way, a mixture of the Lydian and Phrygian languages. As such, the Mysian language could be a language of the Anatolian group. However, a passage in Athenaeus suggests that the Mysian language was akin to the barely attested Paeonian language of Paeonia, north of Macedon.

A short inscription that could be in Mysian and which dates from between the 5th and 3rd centuries BC was found in Üyücek village in the Tavşanlı district of Kütahya province, and seems to include Indo-European words. However, it is uncertain whether the inscription renders a text in the Mysian language or if it is simply a Phrygian dialect from the region of Mysia.

Friedrich's reading:

ΛΙΚΕC : ΒΡΑΤΕΡΑΙC : ΠΑΤΡΙΖΙ : ΙCΚ

Latin transliteration:

likes : braterais patrizi isk

The words "braterais patrizi isk" have been proposed to mean something like "for brothers and fathers", while Likes is most probably a personal name.

Ruanruan language

Ruan-ruan (Chinese: 蠕蠕; also called Rouran) is an unclassified extinct language of Mongolia and northern China, spoken in the Rouran Khaganate from the 4th to the 6th centuries CE.

Alexander Vovin (2004, 2010) considers the Ruan-ruan language to be an extinct non-Altaic language that is not related to any modern-day language (i.e., a language isolate) and is hence unrelated to Mongolic. Vovin (2004) notes that Old Turkic had borrowed some words from an unknown non-Altaic language that may have been Ruan-ruan. The Ruan-ruan language is possibly related to the Yeniseian languages.

Sabüm language

Sabüm is an extinct aboriginal Mon–Khmer language of Malaya.

Saka language

(Eastern) Saka or Sakan was a variety of the Eastern Iranian languages, attested from the ancient Buddhist kingdoms of Khotan, Kashgar and Tumshuq in the Tarim Basin, in what is now southern Xinjiang, China. It is a Middle Iranian language. The kingdoms of Khotan and Tumshuq differed in dialect, their speech being known as Khotanese and Tumshuqese respectively.

Documents on wood and paper were written in modified Brahmi script with the addition of extra characters over time and unusual conjuncts such as ys for z. The documents date from the fourth to the eleventh century. Tumshuqese was more archaic than Khotanese, but it is much less understood because it appears in fewer manuscripts compared to Khotanese. Both dialects share features with modern Pashto and Wakhi. The language was known as "Hvatanai" in contemporary documents. Many Prakrit terms were borrowed from Khotanese into the Tocharian languages.

Sidetic language

The Sidetic language is a member of the extinct Anatolian branch of the Indo-European language family. It is known from coin legends dating to approximately the 5th to 3rd centuries BCE, found in Side on the Pamphylian coast, and from two Greek–Sidetic bilingual inscriptions from the 3rd and 2nd centuries BCE respectively. The Greek historian Arrian in his Anabasis Alexandri (mid-2nd century CE) mentions the existence of a peculiar indigenous language in the city of Side.

Sidetic was probably closely related to Lydian, Carian and Lycian.

The Sidetic script is an alphabet of the Anatolian group. It has 25 letters, only a few of which are clearly derived from Greek. It is analysed from coin legends in what is possibly Sidetic. The script is essentially undeciphered.

Tambora language

Tambora is the poorly attested non-Austronesian (Papuan) language of the Tambora culture of central Sumbawa, in what is now Indonesia, that was wiped out by the 1815 eruption of Mount Tambora. It was the westernmost known Papuan language, and was relatively unusual among such languages in being the language of a maritime trading state, though contemporary Papuan trading states were also found off Halmahera in Ternate and Tidore.

Tuoba language

Tuoba (Tabγač or Tabghach; Chinese: 拓跋) is an extinct Mongolic or Turkic language spoken by the Tuoba people in northern China around the 5th century AD during the Northern Wei dynasty.

Alexander Vovin (2007) identifies the extinct Tabγač or Tuoba language as a Mongolic language. However, Chen (2005) argues that Tuoba (Tabγač) was a Turkic language.

Turung language

The Turung language (Tailung, Tairong; Thai: ภาษาไทตุรุง, pasa tai turung) is an extinct Tai language formerly spoken in Assam. The Turung people who spoke this language now speak Assamese or Singpho languages.

The ethnic group numbers over 30,000 people, who live primarily in the Jorhat, Golaghat and Karbi Anglong districts of Assam.

Tuyuhun language

Tuyuhun (Chinese: 吐谷渾) is an extinct language once spoken by the Tuyuhun of northern China about 500 AD.

Alexander Vovin (2015) identifies the extinct Tuyuhun language as a Para-Mongolic language, meaning that Tuyuhun is related to the Mongolic languages as a sister clade but is not directly descended from the Proto-Mongolic language. The Khitan language is also a Para-Mongolic language. Tuyuhun had previously been identified by Paul Pelliot (1921) as a Mongolic language.

Zhang-Zhung language

Zhang-Zhung (Tibetan: ཞང་ཞུང་, Wylie: zhang zhung) is an extinct Sino-Tibetan language that was spoken in what is now western Tibet. It is attested in a bilingual text called A Cavern of Treasures (mDzod phug) and several shorter texts.

A small number of documents preserved in Dunhuang contain an undeciphered language that has been called Old Zhangzhung, but the identification is controversial.
