The Indo-Aryan or Indic languages are a major language family of South Asia (or the Indian subcontinent). They constitute a branch of the Indo-Iranian languages, itself a branch of the Indo-European language family. In the early 21st century, Indo-Aryan languages were spoken by more than 800 million people, primarily in India, Bangladesh, Nepal, Pakistan and Sri Lanka. Moreover, there are large immigrant and expatriate Indo-Aryan-speaking communities in Northwestern Europe, Western Asia, North America and Australia. There are about 219 known Indo-Aryan languages.
The largest in terms of speakers are Hindustani (Hindi-Urdu, about 329 million), Bengali (242 million), Punjabi (about 100 million) and other languages, with a 2005 estimate placing the total number of native speakers at nearly 900 million.
The ISO 639-2 and ISO 639-5 code for the group is inc.
Proto-Indo-Aryan, or sometimes Proto-Indic, is the reconstructed proto-language of the Indo-Aryan languages. It is intended to reconstruct the language of the pre-Vedic Indo-Aryans. Proto-Indo-Aryan is meant to be the predecessor of Old Indo-Aryan (1500–300 BCE), which is directly attested as Vedic and Mitanni-Aryan. Despite the great archaicity of Vedic, however, the other Indo-Aryan languages preserve a small number of archaic features lost in Vedic.
From the Rigvedic language, "Sanskrit" (literally "put together", meaning perfected or elaborated) developed as the prestige language of culture, science and religion, as well as the court, theatre, etc. Sanskrit is, by convention, referred to by modern scholars as 'Classical Sanskrit' in contradistinction to the so-called 'Rigvedic Sanskrit', which is largely intelligible to Sanskrit speakers.
Mitanni inscriptions show some Middle Indo-Aryan characteristics along with Old Indic ones; for example, Old Indo-Aryan sapta becomes satta (the cluster 'pt' is assimilated to Middle Indo-Aryan 'tt'). According to S.S. Misra, this language may be similar to Buddhist Hybrid Sanskrit, which might not be a mixed language but an early Middle Indo-Aryan variety occurring much before Prakrit.
Outside the learned sphere of Sanskrit, vernacular dialects (Prakrits) continued to evolve. The oldest attested Prakrits are the Buddhist and Jain canonical languages Pali and Ardhamagadhi Prakrit, respectively. By medieval times, the Prakrits had diversified into various Middle Indo-Aryan languages. Apabhraṃśa is the conventional cover term for transitional dialects connecting late Middle Indo-Aryan with early Modern Indo-Aryan, spanning roughly the 6th to 13th centuries. Some of these dialects showed considerable literary production; the Śravakacāra of Devasena (dated to the 930s) is now considered to be the first Hindi book.
The next major milestone occurred with the Muslim conquests in the Indian subcontinent in the 13th–16th centuries. Under the flourishing Turco-Mongol Mughal Empire, Persian became very influential as the language of prestige of the Islamic courts due to the adoption of the foreign language by the Mughal emperors. However, Persian was soon displaced by Hindustani. This Indo-Aryan language combines Persian, Arabic, and Turkic elements in its vocabulary with the grammar of the local dialects.
The Indo-Aryan languages of North India and Pakistan form a dialect continuum. What is called "Hindi" in India is frequently Standard Hindi, the Sanskritized version of the colloquial Hindustani spoken in the Delhi area since the Mughals. However, the term Hindi is also used for most of the central Indic dialects from Bihar to Rajasthan. The spoken New Indo-Aryan dialects from Assam in the east to the borders of Afghanistan in the west form a linguistic continuum across the plains of North India, Pakistan and Bangladesh.
In the Central Zone Hindi-speaking areas, for a long time the prestige dialect was Braj Bhasha, but this was replaced in the 19th century by the Khariboli-based Hindustani. Hindustani was strongly influenced by Sanskrit and Persian, with these influences leading to the emergence of Modern Standard Hindi and Modern Standard Urdu as registers of the Hindustani language. This state of affairs continued until the division of the British Indian Empire in 1947, when Hindi became the official language in India and Urdu became official in Pakistan. Despite the different scripts, the fundamental grammar remains identical; the difference is more sociolinguistic than purely linguistic. Today Hindustani is widely understood or spoken as a second or third language throughout South Asia and is one of the most widely known languages in the world in terms of number of speakers.
Some theonyms, proper names and other terminology of the Mitanni exhibit an Indo-Aryan superstrate, suggesting that an Indo-Aryan elite imposed itself over the Hurrians in the course of the Indo-Aryan expansion. In a treaty between the Hittites and the Mitanni, the deities Mitra, Varuna, Indra, and the Ashvins (Nasatya) are invoked. Kikkuli's horse training text includes technical terms such as aika (eka, one), tera (tri, three), panza (pancha, five), satta (sapta, seven), na (nava, nine), vartana (vartana, turn, round in the horse race). The numeral aika "one" is of particular importance because it places the superstrate in the vicinity of Indo-Aryan proper as opposed to Indo-Iranian in general or early Iranian (which has "aiva").
Another text has babru (babhru, brown), parita (palita, grey), and pinkara (pingala, red). Their chief festival was the celebration of the solstice (vishuva) which was common in most cultures in the ancient world. The Mitanni warriors were called marya, the term for warrior in Sanskrit as well; note mišta-nnu (= miẓḍha, ≈ Sanskrit mīḍha) "payment (for catching a fugitive)" (M. Mayrhofer, Etymologisches Wörterbuch des Altindoarischen, Heidelberg, 1986–2000; Vol. II:358).
Sanskritic interpretations of Mitanni royal names render Artashumara (artaššumara) as Ṛtasmara "who thinks of Ṛta" (Mayrhofer II 780), Biridashva (biridašṷa, biriiašṷa) as Prītāśva "Whose Horse is Dear" (Mayrhofer II 182), Priyamazda (priiamazda) as Priyamedha "whose wisdom is dear" (Mayrhofer II 189, II 378), Citrarata as Citraratha "Whose Chariot is Shining" (Mayrhofer I 553), Indaruda/Endaruta as Indrota "helped by Indra" (Mayrhofer I 134), Shativaza (šattiṷaza) as Sātivāja "Winning the Race Prize" (Mayrhofer II 540, 696), Šubandhu as Subandhu "Having Good Relatives" (a name in Palestine, Mayrhofer II 209, 735), Tushratta (tṷišeratta, tušratta, etc.) as *tṷaiašaratha, Vedic Tvastar "Whose Chariot is Vehement" (Mayrhofer, Etym. Wb., I 686, I 736).
Domari is an Indo-Aryan language spoken by older Dom people scattered across the Middle East and North Africa. The language is reported to be spoken as far north as Azerbaijan and as far south as central Sudan, in Turkey, Iran, Afghanistan, Pakistan, India, Iraq, Palestine, Israel, Jordan, Egypt, Sudan, Libya, Tunisia, Algeria, Morocco, Syria and Lebanon. Based on the systematicity of sound changes, we know with a fair degree of certainty that the names Domari and Romani derive from the Indo-Aryan word ḍom.
The Romani language is usually included in the Western Indo-Aryan languages. Romani—spoken mainly in various parts of Europe—is conservative in maintaining almost intact the Middle Indo-Aryan present-tense person concord markers, and in maintaining consonantal endings for nominal case—both features that have been eroded in most other modern languages of Central India. It shares an innovative pattern of past-tense person concord with the languages of the Northwest, such as Kashmiri and Shina. This is believed to be further proof that Romani originated in the Central region, then migrated to the Northwest.
There are no known historical documents about the early phases of the Romani language.
Linguistic evaluation carried out in the nineteenth century by Pott (1845) and Miklosich (1882–1888) showed that the Romani language is to be classed as a New Indo-Aryan language (NIA), not Middle Indo-Aryan (MIA), establishing that the ancestors of the Romani could not have left India significantly earlier than AD 1000.
The principal argument favouring a migration during or after the transition period to NIA is the loss of the old system of nominal case, and its reduction to just a two-way case system, nominative vs. oblique. A secondary argument concerns the system of gender differentiation. Romani has only two genders (masculine and feminine). Middle Indo-Aryan languages generally had three genders (masculine, feminine and neuter), and some modern Indo-Aryan languages retain this old system even today.
It is argued that loss of the neuter gender did not occur until the transition to NIA. Most of the neuter nouns became masculine, while a few became feminine: for example, the neuter अग्नि (agni) in Prakrit became the feminine आग (āg) in Hindi and jag in Romani. The parallels in grammatical gender evolution between Romani and other NIA languages have been cited as evidence that the forerunner of Romani remained on the Indian subcontinent until a later period, perhaps even as late as the tenth century.
There can be no definitive enumeration of Indic languages because their dialects merge into one another. The major ones are illustrated here; for the details, see the dedicated articles.
The classification follows Masica (1991) and Kausen (2006).
Ethnologue lists the following languages under the Western Zone that are not already covered in other subgroups:
Parya - 4,000 speakers
Parya historically belonged to the Central Zone but lost intelligibility with other languages of the group due to geographic distance and numerous grammatical and lexical innovations.
These languages derive from Magadhan Apabhraṃśa Prakrit.
This group of languages developed from Maharashtri Prakrit. It is not clear if Dakhini (Deccani, Southern Urdu) is part of Hindustani along with Standard Urdu, or a separate Persian-influenced development from Marathi.
The Insular Indic languages share several characteristics that set them apart significantly from the continental languages.
The following languages are related to each other, but otherwise unclassified within Indo-Aryan:
The following other poorly attested languages are listed as unclassified within the Indo-Aryan family by Ethnologue 17:
Also Degaru, Mina, Bhalay and Gowlan are all names for the Gowli caste, rather than a language.
The normative system of New Indo-Aryan stops consists of five points of articulation: labial, dental, "retroflex", palatal, and velar, which is the same as that of Sanskrit. The "retroflex" position may involve retroflexion, or curling the tongue to make the contact with the underside of the tip, or merely retraction. The point of contact may be alveolar or postalveolar, and the distinctive quality may arise more from the shaping than from the position of the tongue. Palatal stops have affricated release and are traditionally included as involving a distinctive tongue position (blade in contact with hard palate). Although they are widely transcribed as [tʃ], Masica (1991:94) claims [cʃ] to be a more accurate rendering.
Moving away from the normative system, some languages and dialects have alveolar affricates [ts] instead of palatal, though some among them retain [tʃ] in certain positions: before front vowels (esp. /i/), before /j/, or when geminated. Alveolar as an additional point of articulation occurs in Marathi and Konkani where dialect mixture and other factors upset the aforementioned complementation to produce minimal environments, in some West Pahari dialects through internal developments (*t̪ɾ, t̪ > /tʃ/), and in Kashmiri. The addition of a retroflex affricate to this in some Dardic languages raises the maximum number of stop positions to seven (barring borrowed /q/), while a reduction of the inventory involves *ts > /s/, which has happened in Assamese, Chittagonian, Sinhala (though there have been other sources of a secondary /ts/), and Southern Mewari.
Further reductions in the number of stop articulations are in Assamese and Romani, which have lost the characteristic dental/retroflex contrast, and in Chittagonian, which may lose its labial and velar articulations through spirantization in many positions (> [f, x]).
|Stop series||Languages|
|/p/, /t̪/, /ʈ/, /tʃ/, /k/||Hindi, Punjabi, Dogri, Sindhi, Gujarati, Bihari, Maithili, Sinhala, Odia, Standard Bengali, dialects of Rajasthani (except Lamani, NW. Marwari, S. Mewari)|
|/p/, /t̪/, /ʈ/, /ts/, /k/||Nepali, dialects of Rajasthani (Lamani and NW. Marwari), Northern Lahnda's Kagani, Kumauni, many West Pahari dialects (not Chamba Mandeali, Jaunsari, or Sirmauri)|
|/p/, /t̪/, /ʈ/, /ts/, /tʃ/, /k/||Marathi, Konkani, certain W. Pahari dialects (Bhadrawahi, Bhalesi, Padari, Simla, Satlej, maybe Kulu), Kashmiri|
|/p/, /t̪/, /ʈ/, /ts/, /tʃ/, /tʂ/, /k/||Shina, Bashkarik, Gawarbati, Phalura, Kalasha, Khowar, Shumashti, Kanyawali, Pashai|
|/p/, /t̪/, /ʈ/, /k/||Rajasthani's S. Mewari|
|/p/, /t̪/, /t/, /ts/, /tɕ/, /k/||E. and N. dialects of Bengali (Dhaka, Mymensing, Rajshahi)|
|/p/, /t/, /k/||Assamese|
|/p/, /t/, /tʃ/, /k/||Romani|
|/t̪/, /ʈ/, /k/ (with /i/ and /u/)||Sylheti|
Sanskrit was noted as having five nasal-stop articulations corresponding to its oral stops, and among modern languages and dialects Dogri, Kacchi, Kalasha, Rudhari, Shina, Saurashtri, and Sindhi have been analyzed as having this full complement of phonemic nasals /m/ /n/ /ɳ/ /ɲ/ /ŋ/, with the last two generally as the result of the loss of the stop from a homorganic nasal + stop cluster ([ɲj] > [ɲ] and [ŋɡ] > [ŋ]), though there are other sources as well.
The following are consonant systems of major and representative New Indo-Aryan languages, as presented in Masica (1991:106–107), though here they are in IPA. Parentheses indicate those consonants found only in loanwords; square brackets indicate those with "very low functional load". The arrangement is roughly geographical.
In the context of South Asia, the choice between the appellations "language" and "dialect" is a difficult one, and any distinction made using these terms is obscured by their ambiguity. In one general colloquial sense, a language is a "developed" dialect: one that is standardised, has a written tradition and enjoys social prestige. As there are degrees of development, the boundary between a language and a dialect thus defined is not clear-cut, and there is a large middle ground where assignment is contestable. There is a second meaning of these terms, in which the distinction is drawn on the basis of linguistic similarity. Though seemingly a "proper" linguistics sense of the terms, it is still problematic: methods that have been proposed for quantifying difference (for example, based on mutual intelligibility) have not been seriously applied in practice; and any relationship established in this framework is relative.
... Hindustani is the basis for both languages ...
Bhatri is an Indic language spoken in Chhattisgarh, India.
Bihari languages
Bihari is the western group of Eastern Indo-Aryan languages, mainly spoken in the Indian states of Bihar, Jharkhand, West Bengal and Uttar Pradesh and also in Nepal.
Despite the large number of speakers of these languages, only Maithili has been constitutionally recognised in India; it gained constitutional status via the 92nd amendment to the Constitution of India in 2003 (receiving assent in 2004).
Both Maithili and Bhojpuri have constitutional recognition in Nepal.
In Bihar, Hindi is the language used for educational and official matters. These languages were legally absorbed under the overarching label Hindi in the 1961 Census. Such state and national politics are creating conditions for language endangerment. After independence, Hindi was given sole official status through the Bihar Official Language Act, 1950. Hindi was displaced as the sole official language of Bihar in 1981, when Urdu was accorded the status of second official language.
Central Indo-Aryan languages
The Central Indo-Aryan languages are a group of related language varieties spoken across northern and central India. These language varieties form the central part of the Indo-Aryan language family, itself a part of the Indo-European language family. They historically form a dialect continuum that descends from the Madhya Prakrits. Located in the Hindi Belt, the Central Zone language varieties also include the Khariboli dialect, the primary dialect spoken in Delhi and the basis of modern Hindi and Urdu. This dialect developed over centuries into the medieval Hindustani language, from which Modern Standard Hindi and Modern Standard Urdu are derived. Both Hindi and Urdu are standardizations of the Hindustani language that was historically spoken in Delhi and used as a lingua franca across Northern India. With regard to the Indo-Aryan language family, the coherence of this language group depends on the classification being used; here only Eastern and Western Hindi will be considered.
Chinali language
Chinali is an unclassified language of India. Many speakers are well educated, and say that their language is "closely related" to Sanskrit. Speakers are distributed throughout Lahul (or Lahaul) Valley.
It is spoken by more than 750 people.
Danwar language
Danwar (also rendered Danuwar, Denwar, Dhanvar, Dhanwar) is a language spoken in parts of Nepal by an Indo-Aryan ethnic group of fifty thousand. It is close to Bote-Darai but otherwise unclassified within the Indo-Aryan languages.
A variety called Danwar Rai is distinct and may be a separate language. It is not related to the Rai languages of the Tibeto-Burman family.
Halbic languages
The Halbic languages are Indic varieties transitional between Odia and Marathi. They are:
The Indo-Aryan peoples or the Indic peoples are a diverse Indo-European-speaking ethnolinguistic group of speakers of Indo-Aryan languages. There are over one billion native speakers of Indo-Aryan languages, most of them native to the Indian subcontinent and presently found all across South Asia, where they form the majority.
Kamar language
Kamar is an Indic language spoken by a tribal people of central India. It is spoken in two districts, one in Madhya Pradesh and one in Chhattisgarh.
Kharia Thar language
Kharia Thar is an Indic language spoken by the Hill Kharia culture of India.
Kochila Tharu
Kochila Tharu, also called Morangiya, Septari or Saptariya Tharu, Madhya-Purbiya Tharu, and Mid-Eastern Tharu, is a diverse group of language varieties in the Tharu group of the Indo-Aryan languages. The several names of the varieties refer to the regions where they dominate. It is one of the largest subgroupings of Tharu. It is spoken mainly in Nepal and India, with approximately 250,000 speakers as of 2003. In addition to language, cultural markers around attire and customs connect individuals into the ethnic identity Kochila.
Heavily concentrated in the eastern area of Terai, speakers of Kochila Tharu live in linguistically diverse regions and are generally multilingual (with the exception of some elderly female speakers). A 2013 survey by SIL International found that the language was being taught to children as their first language and used conversationally between multiple generations of speakers, characteristics of a "vigorous" language as defined by the Ethnologue Expanded Graded Intergenerational Disruption Scale (EGIDS).
Kurmukar language
Kurmukar is a language of India that is related to, and perhaps a dialect of, Bengali.
Magadhi Prakrit
Magadhi Prakrit (Māgadhī) was a vernacular Middle Indo-Aryan language, replacing earlier Vedic Sanskrit in parts of the Indian subcontinent. It was spoken in present-day Assam, Odisha, Bengal, Bihar, and eastern Uttar Pradesh, and was used in some Prakrit dramas to represent vernacular dialogue. It is believed to be the language spoken by the important religious figures Gautama Buddha and Mahavira and was also the language of the courts of the Magadha mahajanapada and the Maurya Empire; some of the Edicts of Ashoka were composed in it.
Magadhi Prakrit later evolved into the Eastern Indo-Aryan languages, including the Bengali–Assamese languages (Assamese, Bengali, Chakma, Chittagonian, Rohingya, Sylheti and others), Bihari languages (Bhojpuri, Magahi, Maithili and others), and Odia, among others. Out of all of its offshoots, Bengali is the most spoken, with over 240 million speakers, followed by Odia and Maithili (both with over 40 million speakers) as well as Bhojpuri (with over 30 million speakers).
Middle Indo-Aryan languages
The Middle Indo-Aryan languages (or Middle Indic languages, sometimes conflated with the Prakrits, which are a stage of Middle Indic) are a historical group of languages of the Indo-Aryan family. They are the descendants of Old Indo-Aryan (attested in Vedic Sanskrit) and the predecessors of the modern Indo-Aryan languages, such as Hindustani (Hindi-Urdu), Odia, Assamese, Bengali and Punjabi.
The Middle Indo-Aryan (MIA) stage in the evolution of Indo-Aryan languages is thought to have spanned more than a millennium between 600 BCE and 1000 CE, and is often divided into three major subdivisions.
The early stage is represented by the Ardhamagadhi of the Edicts of Ashoka (c. 250 BCE) and Jain Agamas, and by the Pali of the Tripitakas.
The middle stage is represented by the various literary Prakrits, especially the Shauraseni language and Maharashtri and Magadhi Prakrits. The term Prakrit is also often applied to Middle Indo-Aryan languages (prākṛta literally means "natural" as opposed to saṃskṛta, which literally means "constructed" or "refined"). Modern scholars such as Michael C. Shapiro follow this classification by including all Middle Indo-Aryan languages under the rubric of "Prakrits", while others emphasise the independent development of these languages, often separated from Sanskrit by social and geographic differences.
The late stage is represented by the Apabhraṃśas of the 6th century and later that preceded early Modern Indo-Aryan languages (such as Braj Bhasha).
Mirgan language
Mirgan, or Panika, is an Indic language of eastern India.
Nahari language
Nahari is an Indo-Aryan language spoken in the states of Chhattisgarh and Odisha in India.
Northern Indo-Aryan languages
The Northern Indo-Aryan languages, also known as Pahāṛi languages, are a group of Indo-Aryan languages spoken in the lower ranges of the Himalayas, from Nepal in the east, through the Indian states of Uttarakhand and Himachal Pradesh. The name Pahari (not to be confused with the various other languages with that name) is Grierson's term.
Schwa deletion in Indo-Aryan languages
Schwa deletion, or schwa syncope, is a phenomenon that sometimes occurs in Hindi, Urdu, Bengali, Kashmiri, Punjabi, Gujarati, and several other Indo-Aryan languages with schwas that are implicit in their written scripts. Languages like Marathi and Maithili, which have come under increased influence from other languages through contact with them, also show a similar phenomenon. Some schwas are obligatorily deleted in pronunciation even if the script suggests otherwise.
Schwa deletion is important for intelligibility and unaccented speech. It also presents a challenge to non-native speakers and speech synthesis software because the scripts, including Devanagari, do not indicate when schwas should be deleted.
For example, the Sanskrit word "Rāma" (IPA: [ɽaːmɐ], राम) is pronounced "Rām" (IPA: [raːm], राम्) in Hindi. The schwa (ə) sound at the end of the word is deleted in Hindi. However, in both cases, the word is written राम.
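The word-final case described above (Rāma pronounced Rām) can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions, not a description of how any real speech synthesis system works: the input is assumed to be a romanized form in which the inherent schwa is written "a", the function name delete_final_schwa and the vowel inventory are hypothetical, and only a final schwa after a consonant is handled. Medial schwa deletion, which depends on context, is not modelled.

```python
import unicodedata

# Romanized vowel letters (an assumed, simplified inventory for this sketch).
VOWELS = set("aāiīuūeoṛ")


def delete_final_schwa(word: str) -> str:
    """Drop a word-final inherent schwa (romanized as 'a') after a consonant.

    Illustrative only: words shorter than three letters are left alone,
    and medial schwa deletion is not handled at all.
    """
    # Use precomposed characters so that "ā" is a single code point.
    word = unicodedata.normalize("NFC", word)
    if len(word) >= 3 and word.endswith("a") and word[-2] not in VOWELS:
        return word[:-1]
    return word


for romanized in ["rāma", "kamala", "rājā", "na"]:
    print(romanized, "->", delete_final_schwa(romanized))
```

In this sketch "rāma" becomes "rām", while a word ending in a long vowel such as "rājā" is left unchanged, since only the short inherent vowel is a deletion candidate.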
The schwa is not deleted in ancient languages such as Sanskrit or Pali.
Varhadi dialect
Varhadi is a dialect of Marathi spoken in the Vidarbha region of Maharashtra and by Marathi people of adjoining parts of Madhya Pradesh, Chhattisgarh and Telangana in India.
Wagdi language
Wagdi (Vaghri) is one of the Bhil languages of India spoken mainly in Dungarpur and Banswara districts of Southern Rajasthan. Wagdi has been characterized as a dialect of Bhili.
The dialects of Wagdi are Aspur, Kherwara, Sagwara and Adivasi Wagdi.