Proto-Indo-European (PIE) is the linguistic reconstruction of the hypothetical common ancestor of the Indo-European languages, the most widely spoken language family in the world. Far more work has gone into reconstructing PIE than any other proto-language, and it is by far the best understood of all proto-languages of its age. The vast majority of linguistic work during the 19th century was devoted to the reconstruction of PIE or its daughter proto-languages (e.g. Proto-Germanic), and most of the modern techniques of linguistic reconstruction such as the comparative method were developed as a result. These methods supply all of the knowledge concerning PIE since there is no written record of the language.
PIE is estimated to have been spoken as a single language from 4,500 B.C.E. to 2,500 B.C.E. during the Neolithic Age, though estimates vary by more than a thousand years. According to the prevailing Kurgan hypothesis, the original homeland of the Proto-Indo-Europeans may have been in the Pontic–Caspian steppe of Eastern Europe. The linguistic reconstruction of PIE has also provided insight into the culture and religion of its speakers. As Proto-Indo-Europeans became isolated from each other through the Indo-European migrations, the dialects of PIE spoken by the various groups diverged by undergoing certain sound laws and shifts in morphology to transform into the known ancient and modern Indo-European languages.
PIE had an elaborate system of morphology that included inflectional suffixes as well as ablaut (vowel alternations, as preserved for example in English sing, sang, sung) and accent. PIE nominals and pronouns had a complex system of declension, and verbs similarly had a complex system of conjugation. The PIE phonology, particles, numerals, and copula are also well reconstructed. Today, the most widely spoken daughter languages of PIE are Spanish, English, Hindustani (Hindi and Urdu), Portuguese, Bengali, Russian, Punjabi, German, Persian, French, Italian and Marathi.
The comparative method follows the Neogrammarian rule: the Indo-European sound laws apply without exception. The method compares languages and uses the sound laws to find a common ancestor. For example, compare the pairs of words in Italian and English: piede and foot, padre and father, pesce and fish. Since there is a consistent correspondence of the initial consonants that emerges far too frequently to be coincidental, one can assume that these languages stem from a common parent-language.
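The correspondence check described above can be sketched in a few lines of Python. This is a toy illustration rather than a real implementation of the comparative method: the function name `initial_correspondences` is hypothetical, the word list contains only the three pairs from the text, and the first letter of each spelling stands in for the initial consonant.

```python
from collections import Counter

# Cognate pairs taken from the paragraph above: (Italian, English).
PAIRS = [("piede", "foot"), ("padre", "father"), ("pesce", "fish")]


def initial_correspondences(pairs):
    """Count how often each pairing of initial letters occurs in the word list."""
    return Counter((a[0], b[0]) for a, b in pairs)


counts = initial_correspondences(PAIRS)
for (ita, eng), n in counts.most_common():
    print(f"Italian {ita}- : English {eng}-  ({n} of {len(PAIRS)} pairs)")
```

With a realistic word list, a pairing that recurs far more often than chance would allow is the kind of evidence the method relies on; three pairs are enough only to show the idea.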
Many consider William Jones, an Anglo-Welsh philologist and puisne judge in Bengal, to have begun Indo-European studies when he postulated the common ancestry of Sanskrit, Latin, and Greek. Although his name is closely associated with this observation, he was not the first to make it. In the 1500s, European visitors to the Indian subcontinent became aware of similarities between Indo-Iranian languages and European languages, and as early as 1653 Marcus Zuerius van Boxhorn had published a proposal for a proto-language ("Scythian") for the following language families: Germanic, Romance, Greek, Baltic, Slavic, Celtic, and Iranian. In a memoir sent to the Académie des Inscriptions et Belles-Lettres in 1767 Gaston-Laurent Coeurdoux, a French Jesuit who spent all his life in India, had specifically demonstrated the analogy between Sanskrit and European languages.
In 1818 Rasmus Christian Rask elaborated the set of correspondences to include other Indo-European languages, such as Sanskrit and Greek, and the full range of consonants involved. In 1816 Franz Bopp published On the System of Conjugation in Sanskrit in which he investigated a common origin of Sanskrit, Persian, Greek, Latin, and German. In 1833 he began publishing the Comparative Grammar of Sanskrit, Zend, Greek, Latin, Lithuanian, Old Slavic, Gothic, and German.
In 1822 Jacob Grimm formulated what became known as Grimm's law as a general rule in his Deutsche Grammatik. Grimm showed correlations between the Germanic and other Indo-European languages and demonstrated that sound change affects an entire language systematically, and not just some words. From the 1870s the Neogrammarians proposed that sound laws have no exceptions, as shown in Verner's law, published in 1876, which resolved apparent exceptions to Grimm's law by exploring the role that accent (stress) had played in language change.
August Schleicher's A Compendium of the Comparative Grammar of the Indo-European, Sanskrit, Greek and Latin Languages (1874–77) represented an early attempt to reconstruct the proto-Indo-European language.
By the early 1900s Indo-Europeanists had developed well-defined descriptions of PIE which scholars still accept today. Major developments since then include the discovery of the Anatolian and Tocharian languages and the acceptance of the laryngeal theory. This theory aims to produce greater regularity in the linguistic reconstruction of Proto-Indo-European phonology than in the reconstruction generated by the comparative method.
Julius Pokorny's Indogermanisches etymologisches Wörterbuch ("Indo-European Etymological Dictionary", 1959) gave a detailed, though conservative, overview of the lexical knowledge accumulated up until that time. Kuryłowicz's 1956 Apophonie gave a better understanding of Indo-European ablaut. From the 1960s, knowledge of Anatolian became robust enough to establish its relationship to PIE.
Multiple hypotheses have been suggested about when, where, and by whom PIE was spoken. The Kurgan hypothesis, first put forward by Marija Gimbutas, is the most popular of these. It proposes that Kurgans from the Pontic–Caspian steppe north of the Black Sea were the original speakers of PIE.
According to the theory, PIE became widespread because its speakers, the Kurgans, were able to migrate into a vast area of Europe and Asia, thanks to technologies such as the domestication of the horse, herding, and the use of wheeled vehicles.
|Subfamily||Description||Modern descendants|
|Proto-Anatolian||All now extinct, the best attested being the Hittite language.||None|
|Proto-Tocharian||An extinct branch known from manuscripts dating from the 6th to the 8th century AD, which were found in north-west China.||None|
|Proto-Italic||This included many languages, but only descendants of Latin survive.||Portuguese and Galician, Spanish, Catalan, French, Italian, Romanian, Aromanian, Rhaeto-Romance, Gallo-Italic|
|Proto-Celtic||The ancestor language of all known Celtic languages. These languages were once spoken across Europe, but modern Celtic languages are mostly confined to the north-western edge of Europe.||Irish, Scottish Gaelic, Welsh, Breton, Cornish, Manx|
|Proto-Germanic||The reconstructed proto-language of the Germanic languages. It developed into three branches: West Germanic, East Germanic (now extinct), and North Germanic.||English, German, Afrikaans, Dutch, Norwegian, Danish, Swedish, Frisian, Icelandic, Faroese|
|Proto-Balto-Slavic||Branched into the Baltic languages and the Slavic languages.||Baltic: Latvian and Lithuanian; Slavic: Russian, Ukrainian, Belarusian, Polish, Czech, Slovak, Serbo-Croatian, Bulgarian, Slovenian, Macedonian|
|Proto-Indo-Iranian||Branched into the Indo-Aryan, Iranian and Nuristani languages.||Nuristani; Indic: Hindustani, Bengali, Punjabi, Dardic; Iranic: Persian, Pashto, Balochi, Kurdish, Zaza|
|Proto-Armenian||||Eastern Armenian, Western Armenian|
|Proto-Greek||||Modern Greek, Romeyka, Tsakonian|
|Proto-Albanian||Albanian is the only modern representative of a distinct branch of the Indo-European language family.||Albanian|
The Paleo-Balkan languages, which occur in or near the Balkan peninsula, do not appear to be members of any of the subfamilies of PIE but are so poorly attested that proper classification of them is not possible.
Proto-Indo-European phonology has been reconstructed in some detail. Notable features of the most widely accepted (but not uncontroversial) reconstruction include three series of stop consonants reconstructed as voiceless, voiced, and breathy voiced; sonorant consonants that could be used syllabically; three so-called laryngeal consonants, whose exact pronunciation is not well established but which are believed to have existed in part based on their visible effects on adjacent sounds; the fricative /s/; and a five-vowel system of which /e/ and /o/ were the most frequently occurring vowels.
The Proto-Indo-European accent is reconstructed today as having had variable lexical stress, which could appear on any syllable and whose position often varied among different members of a paradigm (e.g. between singular and plural of a verbal paradigm). Stressed syllables received a higher pitch; therefore it is often said that PIE had pitch accent. The location of the stress is associated with ablaut variations, especially between normal-grade vowels (/e/ and /o/) and zero-grade (i.e. lack of a vowel), but is not entirely predictable from them.
The accent is best preserved in Vedic Sanskrit and (in the case of nouns) Ancient Greek, and indirectly attested in a number of phenomena in other IE languages. To account for mismatches between the accent of Vedic Sanskrit and Ancient Greek, as well as a few other phenomena, a few historical linguists prefer to reconstruct PIE as a tone language where each morpheme had an inherent tone; the sequence of tones in a word then evolved, according to that hypothesis, into the placement of lexical stress in different ways in different IE branches.
Proto-Indo-European roots were affix-lacking morphemes which carried the core lexical meaning of a word and were used to derive related words (e.g., "-friend-" in the English words "befriend", "friends", and "friend" by itself). Proto-Indo-European was a fusional language, in which inflectional morphemes signalled the grammatical relationships between words. This dependence on inflectional morphemes means that roots in PIE, unlike those found in English, were rarely found by themselves. A root plus a suffix formed a word stem, and a word stem plus a desinence (usually an ending) formed a word.
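The layering of root, stem, and word described above can be modelled as a small data structure. The sketch below is illustrative only: the class name `Word` and its fields are hypothetical, and the sample form *bʰéreti ("he/she carries") is a commonly cited reconstruction brought in as an example, not a form taken from the text above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Word:
    root: str    # carries the core lexical meaning
    suffix: str  # root + suffix = stem
    ending: str  # stem + desinence (ending) = word

    @property
    def stem(self) -> str:
        return self.root + self.suffix

    @property
    def form(self) -> str:
        # The asterisk marks a reconstructed (unattested) form.
        return "*" + self.stem + self.ending


# Assumed example: root *bʰer- "carry", thematic suffix *-e-, ending *-ti.
bhereti = Word(root="bʰér", suffix="e", ending="ti")
print(bhereti.stem)
print(bhereti.form)
```

Keeping the three parts separate mirrors the point made in the paragraph: the root on its own is not a usable word, and a full form exists only once a suffix and an ending are attached.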
Many morphemes in Proto-Indo-European had short e as their inherent vowel; the Indo-European ablaut is the change of this short e to short o, long e (ē), long o (ō), or no vowel. This variation in vowels occurred both within inflectional morphology (e.g., different grammatical forms of a noun or verb may have different vowels) and derivational morphology (e.g., a verb and an associated abstract verbal noun may have different vowels).
Categories that PIE distinguished through ablaut were often also identifiable by contrasting endings, but the loss of these endings in some later Indo-European languages has led them to use ablaut alone to identify grammatical categories, as in the Modern English words sing, sang, sung.
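The vowel alternation described above can be made concrete with a short sketch. The assumptions are flagged in the comments: the grade labels, the function name `ablaut_grades`, and the `V` placeholder convention are all hypothetical, and the root *ped- "foot" is used purely as an illustration. The code enumerates the logically possible grades and does not claim that every resulting form is attested for this root.

```python
# The basic short e and the four alternants named in the text.
# The grade labels are conventional names chosen for this sketch.
GRADES = {
    "e-grade": "e",
    "o-grade": "o",
    "lengthened e-grade": "ē",
    "lengthened o-grade": "ō",
    "zero-grade": "",
}


def ablaut_grades(template: str) -> dict[str, str]:
    """Fill the single vowel slot (marked 'V') of a root template with each grade."""
    if template.count("V") != 1:
        raise ValueError("template must contain exactly one vowel slot 'V'")
    return {name: template.replace("V", vowel) for name, vowel in GRADES.items()}


# Assumed example root: *ped- "foot", written with its vowel slot as 'V'.
forms = ablaut_grades("pVd")
for name, form in forms.items():
    print(f"{name:>20}: *{form}-")
```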
There were three grammatical genders: masculine, feminine, and neuter.
Proto-Indo-European pronouns are difficult to reconstruct, owing to their variety in later languages. PIE had personal pronouns in the first and second grammatical person, but not the third person, where demonstrative pronouns were used instead. The personal pronouns had their own unique forms and endings, and some had two distinct stems; this is most obvious in the first person singular where the two stems are still preserved in English I and me. There were also two varieties for the accusative, genitive and dative cases, a stressed and an enclitic form.
|Case||First person singular||First person plural||Second person singular||Second person plural|
|Accusative||*h₁mé, *h₁me||*nsmé, *nōs||*twé||*usmé, *wōs|
|Genitive||*h₁méne, *h₁moi||*ns(er)o-, *nos||*tewe, *toi||*yus(er)o-, *wos|
|Dative||*h₁méǵʰio, *h₁moi||*nsmei, *ns||*tébʰio, *toi||*usmei|
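Verbs had at least four grammatical moods: indicative, imperative, subjunctive, and optative.
Verbs had two grammatical voices: active and mediopassive.
Verbs had three grammatical persons: first, second, and third.
Verbs had three grammatical numbers: singular, dual, and plural.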
The following table shows a possible reconstruction of the PIE verb endings from Sihler, which largely represents the current consensus among Indo-Europeanists.
Proto-Indo-European numerals are generally reconstructed as follows:
|three||*trei- (full grade), *tri- (zero grade)|
|four||*kʷetwor- (o-grade), *kʷetur- (zero grade) (see also the kʷetwóres rule)|
|six||*s(w)eḱs; originally perhaps *weḱs|
|eight||*oḱtō, *oḱtou or *h₃eḱtō, *h₃eḱtou|
Rather than specifically 100, *ḱm̥tóm may originally have meant "a large number".
Proto-Indo-European particles could be used both as adverbs and postpositions, like *upo "under, below". The postpositions became prepositions in most daughter languages. Other reconstructible particles include negators (*ne, *mē), conjunctions (*kʷe "and", *wē "or" and others) and an interjection (*wai!, an expression of woe or agony).
The syntax of the older Indo-European languages has been studied in earnest since at least the late nineteenth century, by such scholars as Hermann Hirt and Berthold Delbrück. In the second half of the twentieth century, interest in the topic increased and led to reconstructions of Proto-Indo-European syntax.
Since all the early attested IE languages were inflectional, PIE is thought to have relied primarily on morphological markers, rather than word order, to signal syntactic relationships within sentences. Still, a default (unmarked) word order is thought to have existed in PIE. This was reconstructed by Jacob Wackernagel as being subject–verb–object (SVO), based on evidence in Vedic Sanskrit, and the SVO hypothesis still has some adherents, but as of 2015 the "broad consensus" among PIE scholars is that PIE would have been a subject–object–verb (SOV) language.
The SOV default word order with other orders used to express emphasis (e.g., verb–subject–object to emphasise the verb) is attested in Old Indic, Old Iranian, Old Latin and Hittite, while traces of it can be found in the enclitic personal pronouns of the Tocharian languages. A shift from OV to VO order is posited to have occurred in late PIE since many of the descendant languages have this order: modern Greek, Romance and Albanian prefer SVO, Insular Celtic has VSO as the default order, and even the Anatolian languages show some signs of this word order shift. The inconsistent order preference in Baltic, Slavic and Germanic can be attributed to contact with outside OV languages.
Many higher-level relationships between Proto-Indo-European and other language families have been hypothesised, but these are highly controversial. Among them:
The Ridley Scott film Prometheus features an android named "David" (played by Michael Fassbender) who learns Proto-Indo-European to communicate with the "Engineer", an extraterrestrial whose race may have created humans. David practices PIE by reciting Schleicher's fable and goes on to attempt communication with the Engineer through PIE. Linguist Dr Anil Biltoo created the film's reconstructed dialogue and had an onscreen role teaching David Schleicher's fable.