Historical linguistics

Historical linguistics, also called diachronic linguistics, is the scientific study of language change over time.[1] Principal concerns of historical linguistics include:[2]

  1. to describe and account for observed changes in particular languages
  2. to reconstruct the pre-history of languages and to determine their relatedness, grouping them into language families (comparative linguistics)
  3. to develop general theories about how and why language changes
  4. to describe the history of speech communities
  5. to study the history of words, i.e. etymology

History and development

Western modern historical linguistics dates from the late 18th century. It grew out of the earlier discipline of philology,[3] the study of ancient texts and documents dating back to antiquity.

At first, historical linguistics served as the cornerstone of comparative linguistics, primarily as a tool for linguistic reconstruction.[4] Scholars were concerned chiefly with establishing language families and reconstructing prehistoric proto-languages, using the comparative method and internal reconstruction.[4] The focus was initially on the well-known Indo-European languages, many of which had long written histories; scholars also studied the Uralic languages, another European language family for which less early written material exists. Since then, significant comparative work has extended beyond European languages, for example to the Austronesian languages and various families of Native American languages, among many others. Comparative linguistics is now, however, only one part of a more broadly conceived discipline of historical linguistics. For the Indo-European languages, comparative study is now a highly specialized field. Most research is being carried out on the subsequent development of these languages, in particular the development of the modern standard varieties.

Some scholars have undertaken studies attempting to establish super-families, linking, for example, Indo-European, Uralic, and other families into Nostratic. These attempts have not been widely accepted. The information necessary to establish relatedness becomes less available as time depth increases. The time depth of linguistic methods is limited by chance word resemblances and variations between language groups, but a limit of around 10,000 years is often assumed.[5] The dating of the various proto-languages is also difficult; several methods are available for dating, but only approximate results can be obtained.

Diachronic and synchronic analysis

Initially, all modern linguistics was historical in orientation. Even the study of modern dialects involved looking at their origins. Ferdinand de Saussure's distinction between synchronic and diachronic linguistics is fundamental to the present-day organization of the discipline. Primacy is accorded to synchronic linguistics, and diachronic linguistics is defined as the study of successive synchronic stages. Saussure's clear demarcation, however, has had both defenders and critics.

In linguistics, a synchronic analysis is one that views linguistic phenomena only at a given time, usually the present, though a synchronic analysis of a historical language form is also possible. This may be distinguished from a diachronic analysis, which regards a phenomenon in terms of developments through time. Diachronic analysis is the main concern of historical linguistics; most other branches of linguistics, however, are concerned with some form of synchronic analysis. The study of language change offers a valuable insight into the state of linguistic representation, and because all synchronic forms are the result of historically evolving diachronic changes, the ability to explain linguistic constructions necessitates a focus on diachronic processes.[6]

In practice, a purely synchronic linguistics is not possible for any period before the invention of the gramophone, as written records always lag behind speech in reflecting linguistic developments. Written records are difficult to date accurately before the development of the modern title page. Dating must often rely on contextual historical evidence such as inscriptions, or on modern techniques such as carbon dating, which yield dates of varying accuracy. The work of sociolinguists on linguistic variation has also shown that synchronic states are not uniform: the speech habits of older and younger speakers differ in ways that point to language change. Synchronic variation is linguistic change in progress.

Synchronic and diachronic approaches can reach quite different conclusions. For example, a Germanic strong verb like English sing – sang – sung is irregular when viewed synchronically: the native speaker's brain processes these as learned forms, whereas the derived forms of regular verbs are processed quite differently, by the application of productive rules (for example, adding -ed to the basic form of a verb as in walk – walked). This is an insight of psycholinguistics, relevant also for language didactics, both of which are synchronic disciplines. However, a diachronic analysis will show that the strong verb is the remnant of a fully regular system of internal vowel changes, in this case, namely, the Indo-European ablaut; historical linguistics seldom uses the category "irregular verb".
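The contrast between the two analyses can be sketched in a few lines of Python. This is only a toy illustration: the three-verb lexicon, the function names, and the single vowel rule are assumptions made for the example, and the rule covers only this one class of strong verbs.

```python
# Synchronic view: strong-verb past forms are stored, regular ones are derived.
STORED_PAST = {"sing": "sang", "drink": "drank", "swim": "swam"}  # toy lexicon (assumption)


def past_synchronic(verb: str) -> str:
    """Look up a stored form; otherwise apply the productive -ed rule."""
    if verb in STORED_PAST:
        return STORED_PAST[verb]
    return verb + ("d" if verb.endswith("e") else "ed")


# Diachronic view: within this verb class the past follows one vowel alternation.
def past_ablaut(verb: str) -> str:
    """Toy ablaut rule: replace the first i with a (covers only this class)."""
    return verb.replace("i", "a", 1)


for verb in STORED_PAST:
    # The stored forms coincide with the output of the single ablaut rule.
    assert past_ablaut(verb) == past_synchronic(verb)

print(past_synchronic("walk"))  # walked
print(past_ablaut("sing"))      # sang
```

Seen synchronically, the strong forms are exceptions held in a lookup table; seen diachronically, the same forms fall out of one regular alternation.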

The principal tools of research in diachronic linguistics are the comparative method and the method of internal reconstruction. Less-standard techniques, such as mass lexical comparison, are used by some linguists to overcome the limitations of the comparative method, but most linguists regard them as unreliable.

The findings of historical linguistics are often used as a basis for hypotheses about the groupings and movements of peoples, particularly in the prehistoric period. In practice, however, it is often unclear how to integrate the linguistic evidence with the archaeological or genetic evidence. For example, there are numerous theories concerning the homeland and early movements of the Proto-Indo-Europeans, each with its own interpretation of the archaeological record.

Sub-fields of study

[Figure: Classification of Indo-European languages. Red: extinct languages. White: categories or unattested proto-languages. Left half: centum languages; right half: satem languages.]

Comparative linguistics

Comparative linguistics (originally comparative philology) is a branch of historical linguistics that is concerned with comparing languages in order to establish their historical relatedness. Languages may be related by convergence through borrowing or by genetic descent; languages thus not only change over time but can also influence one another.

Genetic relatedness implies a common origin or proto-language. Comparative linguistics has the goal of constructing language families, reconstructing proto-languages, and specifying the changes that have resulted in the documented languages. To maintain a clear distinction between attested and reconstructed forms, comparative linguists prefix an asterisk to any form that is not found in surviving texts.


Etymology

Etymology is the study of the history of words: when they entered a language, from what source, and how their form and meaning have changed over time. A word may enter a language as a loanword (a word from one language adopted by speakers of another language), through derivational morphology by combining pre-existing elements in the language, by a hybrid of these two processes called phono-semantic matching, or in several other minor ways.

In languages with a long and detailed history, etymology makes use of philology, the study of how words change from culture to culture over time. Etymologists also apply the methods of comparative linguistics to reconstruct information about languages that are too old for any direct information (such as writing) to be known. By analyzing related languages with a technique known as the comparative method, linguists can make inferences about their shared parent language and its vocabulary. In that way, word roots that can be traced all the way back to the origin of, for instance, the Indo-European language family have been found. Although it originated in the philological tradition, much current etymological research is done in language families for which little or no early documentation is available, such as Uralic and Austronesian.


Dialectology

Dialectology is the scientific study of linguistic dialect, the varieties of a language that are characteristic of particular groups, based primarily on geographic distribution and their associated features. This is in contrast to variations based on social factors, which are studied in sociolinguistics, or variations based on time, which are studied in historical linguistics. Dialectology treats such topics as divergence of two local dialects from a common ancestor and synchronic variation.

Dialectologists are concerned with grammatical features that correspond to regional areas. Thus, they are usually dealing with populations living in specific locales for generations without moving, but also with immigrant groups bringing their languages to new settlements.


Phonology

Phonology is a sub-field of linguistics which studies the sound system of a specific language or set of languages. Whereas phonetics is about the physical production and perception of the sounds of speech, phonology describes the way sounds function within a given language or across languages.

An important part of phonology is studying which sounds are distinctive units within a language. For example, the "p" in "pin" is aspirated, but the "p" in "spin" is not. In English, these two sounds occur in complementary distribution and are not used to differentiate words, so they are considered allophones of the same phoneme. In some other languages, such as Thai and Quechua, the same difference of aspiration or non-aspiration differentiates words, and so the two sounds (or phones) are considered distinct phonemes.
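The test for complementary distribution can be sketched as a check that two phones never share an environment. The sketch below is a simplification under stated assumptions: the observation lists are invented toy data, the environment labels ("#_V" for word-initial before a vowel, "s_V" for after s before a vowel) are ad hoc, and the function names are hypothetical.

```python
from collections import defaultdict


def environments(observations):
    """Map each phone to the set of environments in which it was observed."""
    envs = defaultdict(set)
    for phone, env in observations:
        envs[phone].add(env)
    return envs


def in_complementary_distribution(observations, a, b):
    """True if phones a and b never occur in the same environment."""
    envs = environments(observations)
    return envs[a].isdisjoint(envs[b])


# Toy data (assumption): aspirated p word-initially, plain p only after s.
english_like = [("pʰ", "#_V"), ("p", "s_V"), ("pʰ", "#_V")]
# Toy data (assumption): both phones occur word-initially, so they can contrast.
thai_like = [("pʰ", "#_V"), ("p", "#_V")]

print(in_complementary_distribution(english_like, "pʰ", "p"))  # True
print(in_complementary_distribution(thai_like, "pʰ", "p"))     # False
```

Disjoint environments are consistent with an allophone analysis; overlapping environments leave room for minimal pairs and hence for separate phonemes. A real analysis would also require phonetic similarity and far more data.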

In addition to the minimal distinctive sounds (the phonemes), phonology studies how sounds alternate, such as the /p/ in English, and topics such as syllable structure, stress, accent, and intonation.

The principles of phonological theory have also been applied to the analysis of sign languages, but the phonological units do not consist of sounds. The principles of phonological analysis can be applied independently of modality because they are designed to serve as general analytical tools, not language-specific ones.


Morphology

Morphology is the study of the formal means of expression in a language; in the context of historical linguistics, it is the study of how those formal means of expression change over time. For instance, languages with complex inflectional systems tend to be subject to a simplification process. The field studies the internal structure of words as a formal means of expression.[7]

Words as units in the lexicon are the subject matter of lexicology. While words are generally accepted as being (with clitics) the smallest units of syntax, it is clear that, in most (if not all) languages, words can be related to other words by rules. The rules understood by the speaker reflect specific patterns (or regularities) in the way words are formed from smaller units and how those smaller units interact in speech. In this way, morphology is the branch of linguistics that studies patterns of word-formation within and across languages and attempts to formulate rules that model the knowledge of the speakers of those languages; in the context of historical linguistics, it studies how the means of expression change over time. See grammaticalisation.


Syntax

Syntax is the study of the principles and rules for constructing sentences in natural languages. The term syntax is also used to refer directly to the rules and principles that govern the sentence structure of any individual language, as in "the syntax of Modern Irish". Modern researchers in syntax attempt to describe languages in terms of such rules. Many professionals in this discipline attempt to find general rules that apply to all natural languages; in the context of historical linguistics, they study how characteristics of sentence structure in related languages changed over time. See grammaticalisation.

Rates of change and varieties of adaptation

Studies in historical linguistics often use the terms "conservative" or "innovative" to characterize the extent of change occurring in a particular language or dialect as compared with related varieties. In particular, a conservative variety changes relatively less than an innovative variety. These variations in plasticity are often related to the socio-economic situation of the language speakers. An example of an innovative variety would be American English, because of the vast number of speakers and the open interaction these speakers have with other language groups; the changes can be seen in the terms developed for business and marketing, and in other fields such as technology. The converse of an innovative language is a conservative language, generally characterized by its static nature and imperviousness to outside influences. Most such languages are spoken in secluded areas that lack any other primary language-speaking population, although this is not a guarantee. These descriptive terms carry no value judgment in linguistic studies, and are not used to determine any form of worthiness a language has compared to any other language.

A particularly conservative variety that preserves features that have long since vanished elsewhere is sometimes said to be "archaic". While there are few examples of archaic language in modern society, some have survived in set phrases or in nursery rhymes.

Evolutionary context

In terms of evolutionary theory, historical linguistics (as opposed to research into the origins of human language) studies Lamarckian acquired characteristics of languages.[8]

See also

Asterisk

An asterisk (*), from Late Latin asteriscus, from Ancient Greek ἀστερίσκος, asteriskos, "little star", is a typographical symbol or glyph. It is so called because it resembles a conventional image of a star.

Computer scientists and mathematicians often vocalize it as star (as, for example, in the A* search algorithm or C*-algebra). In English, an asterisk is usually five-pointed in sans-serif typefaces, six-pointed in serif typefaces, and six- or eight-pointed when handwritten. It is often used to censor offensive words, and on the Internet, to indicate a correction to a previous message.

In computer science, the asterisk is commonly used as a wildcard character, or to denote pointers, repetition, or multiplication.


Cognate

In linguistics, cognates are words that have a common etymological origin. Cognates are often inherited from a shared parent language, but they may also involve borrowings from some other language. For example, the English words dish and desk and the German word Tisch ("table") are cognates because they all come from Latin discus, which relates to their flat surfaces. Cognates may have evolved similar, different or even opposite meanings, but in most cases there are some similar sounds or letters in the words. Some words sound similar but do not come from the same root; these are called false cognates.

The word cognate derives from the Latin noun cognatus, which means "blood relative".

Comparative linguistics

Comparative linguistics (originally comparative philology) is a branch of historical linguistics that is concerned with comparing languages to establish their historical relatedness.

Genetic relatedness implies a common origin or proto-language and comparative linguistics aims to construct language families, to reconstruct proto-languages and specify the changes that have resulted in the documented languages. To maintain a clear distinction between attested and reconstructed forms, comparative linguists prefix an asterisk to any form that is not found in surviving texts. A number of methods for carrying out language classification have been developed, ranging from simple inspection to computerised hypothesis testing. Such methods have gone through a long process of development.

Compensatory lengthening

Compensatory lengthening in phonology and historical linguistics is the lengthening of a vowel sound that happens upon the loss of a following consonant, usually in the syllable coda, or of a vowel in an adjacent syllable. Lengthening triggered by consonant loss may be considered an extreme form of fusion (Crowley 1997:46). Both types may arise from speakers' attempts to preserve a word's moraic count.
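The idea that lengthening preserves a word's moraic count can be sketched with a toy mora counter. Everything here is an illustrative assumption: the syllables are invented, the vowel set is limited to five letters, vowel length is written with ː, and the counting convention (short vowel = 1 mora, long vowel = 2, each coda consonant = 1) is one simplified choice among several.

```python
VOWELS = set("aeiou")
LENGTH = "ː"  # length mark written after a long vowel


def mora_count(syllable: str) -> int:
    """Count moras: short vowel = 1, long vowel = 2, each coda consonant = 1."""
    moras = 0
    seen_vowel = False
    for ch in syllable:
        if ch in VOWELS:
            moras += 1
            seen_vowel = True
        elif ch == LENGTH:
            moras += 1
        elif seen_vowel:
            moras += 1  # consonant after the vowel, i.e. in the coda
    return moras


def drop_coda_with_lengthening(syllable: str, consonant: str) -> str:
    """Delete `consonant` directly after a vowel and lengthen that vowel."""
    out = []
    i = 0
    while i < len(syllable):
        ch = syllable[i]
        nxt = syllable[i + 1] if i + 1 < len(syllable) else ""
        if ch in VOWELS and nxt == consonant:
            out.append(ch + LENGTH)
            i += 2  # skip the lost consonant
        else:
            out.append(ch)
            i += 1
    return "".join(out)


before = "tans"  # invented syllable
after = drop_coda_with_lengthening(before, "n")
print(after)                                  # taːs
print(mora_count(before), mora_count(after))  # 3 3
```

The lost coda consonant and the added vowel length each contribute one mora under this convention, so the count is unchanged.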

Dené–Yeniseian languages

Dené–Yeniseian is a proposed language family consisting of the Yeniseian languages of central Siberia and the Na-Dené languages of northwestern North America.

Reception among experts has been largely, though not universally, favorable; thus, Dené–Yeniseian has been called "the first demonstration of a genealogical link between Old World and New World language families that meets the standards of traditional comparative-historical linguistics".

False cognate

False cognates are pairs of words that seem to be cognates because of similar sounds and meaning, but have different etymologies; they can be within the same language or from different languages. For example, the English word dog and the Mbabaram word dog have exactly the same meaning, but by complete coincidence. Likewise, English much and Spanish mucho came by their similar meanings via completely different origins. This is different from false friends, which are similar-sounding words with different meanings, but which may in fact be etymologically related. (For example: Spanish dependiente looks like dependent, but means sales assistant or clerk as well.)

Even though false cognates lack a common root, there may still be an indirect connection between them (for example by phono-semantic matching or folk etymology).


Isogloss

An isogloss, also called a heterogloss, is the geographic boundary of a certain linguistic feature, such as the pronunciation of a vowel, the meaning of a word, or the use of some morphological or syntactic feature. Major dialects are typically demarcated by bundles of isoglosses, such as the Benrath line that distinguishes High German from the other West Germanic languages and the La Spezia–Rimini Line that divides the Northern Italian dialects from Central Italian dialects. However, an individual isogloss may or may not coincide with a language border. For example, the front-rounding of /y/ cuts across France and Germany, while the /y/ is absent from Italian and Spanish words that are cognates with the /y/-containing French words.

One of the best-known isoglosses is the centum-satem isogloss.

Similar to an isogloss, an isograph is a distinguishing feature of a writing system. Both concepts are also used in historical linguistics.

Japhetic theory

In linguistics, the Japhetic theory of Soviet linguist Nikolay Yakovlevich Marr (1864–1934) postulated that the Kartvelian languages of the Caucasus area are related to the Semitic languages of the Middle East. The theory gained favor among Soviet linguists for ideological reasons, as it was thought to represent "proletarian science" as opposed to "bourgeois science".

Language change

Language change is variation over time in a language's phonological, morphological, semantic, syntactic, and other features. It is studied by historical linguistics, sociolinguistics, and evolutionary linguistics. Some commentators use the label corruption to suggest that language change constitutes a degradation in the quality of a language, especially when the change originates from human error or prescriptively discouraged usage. Modern linguistics typically does not support this concept, since from a scientific point of view such innovations cannot be judged in terms of good or bad. John Lyons notes that "any standard of evaluation applied to language-change must be based upon a recognition of the various functions a language 'is called upon' to fulfil in the society which uses it".

Language family

A language family is a group of languages related through descent from a common ancestral language or parental language, called the proto-language of that family. The term "family" reflects the tree model of language origination in historical linguistics, which makes use of a metaphor comparing languages to people in a biological family tree, or in a subsequent modification, to species in a phylogenetic tree of evolutionary taxonomy. Linguists therefore describe the daughter languages within a language family as being genetically related.

According to Ethnologue, the 7,097 living human languages are distributed in 141 different language families. A "living language" is simply one that is used as the primary form of communication of a group of people. There are also many dead and extinct languages, as well as some that are still insufficiently studied to be classified, or are even unknown outside their respective speech communities.

Membership of languages in a language family is established by comparative linguistics. Sister languages are said to have a "genetic" or "genealogical" relationship. The latter term is older. Speakers of a language family belong to a common speech community. The divergence of a proto-language into daughter languages typically occurs through geographical separation, with the original speech community gradually evolving into distinct linguistic units. Individuals belonging to other speech communities may also adopt languages from a different language family through the language shift process.

Genealogically related languages present shared retentions; that is, features of the proto-language (or reflexes of such features) that cannot be explained by chance or borrowing (convergence). Membership in a branch or group within a language family is established by shared innovations; that is, common features of those languages that are not found in the common ancestor of the entire family. For example, Germanic languages are "Germanic" in that they share vocabulary and grammatical features that are not believed to have been present in the Proto-Indo-European language. These features are believed to be innovations that took place in Proto-Germanic, a descendant of Proto-Indo-European that was the source of all Germanic languages.
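The difference between shared retentions and shared innovations can be sketched with set operations. The feature inventories below are heavily simplified assumptions chosen for illustration (a handful of labels standing in for whole grammars), and the function names are hypothetical.

```python
# Toy feature inventories; the labels are illustrative assumptions.
PROTO_INDO_EUROPEAN = {"ablaut"}
ENGLISH = {"ablaut", "grimms_law", "dental_preterite"}
GERMAN = {"ablaut", "grimms_law", "dental_preterite", "high_german_shift"}


def shared_retentions(a: set, b: set, ancestor: set) -> set:
    """Features both languages have kept from the ancestor."""
    return a & b & ancestor


def shared_innovations(a: set, b: set, ancestor: set) -> set:
    """Features both languages share that the ancestor lacked."""
    return (a & b) - ancestor


print(sorted(shared_retentions(ENGLISH, GERMAN, PROTO_INDO_EUROPEAN)))
# ['ablaut']
print(sorted(shared_innovations(ENGLISH, GERMAN, PROTO_INDO_EUROPEAN)))
# ['dental_preterite', 'grimms_law']
```

Only the second set is evidence for a subgroup: retentions show membership in the wider family, while innovations common to both languages point to an intermediate ancestor in which they arose.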

Linguistic reconstruction

Linguistic reconstruction is the practice of establishing the features of an unattested ancestor language of one or more given languages. There are two kinds of reconstruction:

Internal reconstruction uses irregularities in a single language to make inferences about an earlier stage of that language – that is, it is based on evidence from that language alone.

Comparative reconstruction, usually referred to just as reconstruction, establishes features of the ancestor of two or more related languages, belonging to the same language family, by means of the comparative method. A language reconstructed in this way is often referred to as a proto-language (the common ancestor of all the languages in a given family); examples include Proto-Indo-European and Proto-Dravidian.

Texts discussing linguistic reconstruction commonly preface reconstructed forms with an asterisk (*) to distinguish them from attested forms.

An attested word from which a root in the proto-language is reconstructed is a reflex. More generally, a reflex is the known derivative of an earlier form, which may be either attested or reconstructed. Reflexes of the same source are cognates.
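A first step of the comparative method, tabulating recurring sound correspondences across cognates, can be sketched as follows. This is a minimal sketch under loud assumptions: spelling stands in for pronunciation, only the first letter of each word is compared, and the function names are hypothetical. The three Latin and English pairs are well-known cognates.

```python
from collections import Counter

# Latin / English cognate pairs (spelling used as a rough stand-in for sound).
COGNATES = [("pater", "father"), ("pes", "foot"), ("piscis", "fish")]


def initial_correspondences(pairs):
    """Count how often each pair of word-initial segments lines up."""
    return Counter((a[0], b[0]) for a, b in pairs)


def mark_reconstructed(form: str) -> str:
    """Prefix an asterisk, the convention for unattested forms."""
    return "*" + form


table = initial_correspondences(COGNATES)
print(table[("p", "f")])        # 3
print(mark_reconstructed("p"))  # *p
```

A correspondence that recurs across many reflexes, rather than a single look-alike pair, is what licenses positing one reconstructed segment for the ancestor; real work aligns whole words and uses phonetic transcription.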

Linkage (linguistics)

In historical linguistics, a linkage is a group of related languages that is formed when a proto-language breaks up into a network of dialects that gradually differentiates into separate languages. The term was introduced by Malcolm Ross in his study of Western Oceanic languages (Ross 1988). It is contrasted with a family, which arises when the proto-language speech community separates into groups that are isolated from each other, rather than forming a network.

List of linguists

A linguist in the academic sense is a person who studies natural language (an academic discipline known as linguistics). Ambiguously, the word is sometimes also used to refer to a polyglot (one who knows several languages), or a grammarian (a scholar of grammar), but these two uses of the word are distinct (and one does not have to be a polyglot in order to be an academic linguist). The following is a list of linguists in the academic sense.


Loanword

A loanword (also loan word or loan-word) is a word adopted from one language (the donor language) and incorporated into another language without translation. This is in contrast to cognates, which are words in two or more languages that are similar because they share an etymological origin, and calques, which involve translation.


Nasalization

In phonetics, nasalization (or nasalisation) is the production of a sound while the velum is lowered, so that some air escapes through the nose during the production of the sound by the mouth. An archetypal nasal sound is [n].

In the International Phonetic Alphabet, nasalization is indicated by printing a tilde diacritic (U+0303 ◌̃ COMBINING TILDE) above the symbol for the sound to be nasalized: [ã] is the nasalized equivalent of [a], and [ṽ] is the nasalized equivalent of [v]. A subscript diacritic [ą], called an ogonek or nosinė, is sometimes seen, especially when the vowel bears tone marks that would interfere with the superscript tilde. For example, [ą̄ ą́ ą̀ ą̂ ą̌] are more legible in most fonts than [ã̄ ã́ ã̀ ã̂ ã̌].
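How the combining tilde attaches to a base symbol can be shown with Python's standard unicodedata module. The helper name nasalize is a hypothetical choice for this sketch; the normalization calls are standard-library functions.

```python
import unicodedata

COMBINING_TILDE = "\u0303"  # U+0303 COMBINING TILDE


def nasalize(symbol: str) -> str:
    """Append the combining tilde and compose to NFC where possible."""
    return unicodedata.normalize("NFC", symbol + COMBINING_TILDE)


nasal_a = nasalize("a")
print(unicodedata.name(COMBINING_TILDE))  # COMBINING TILDE
print(nasal_a == "\u00e3")                # True: composed into a single code point
# Decomposing (NFD) recovers the base letter plus the combining mark.
print(unicodedata.normalize("NFD", nasal_a) == "a" + COMBINING_TILDE)  # True
```

Comparing transcriptions in a fixed normalization form avoids treating the composed and decomposed spellings of the same nasalized vowel as different strings.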


Philology

Philology is the study of language in oral and written historical sources; it is the intersection between textual criticism, literary criticism, history, and linguistics. Philology is more commonly defined as the study of literary texts as well as oral and written records, the establishment of their authenticity and their original form, and the determination of their meaning. A person who pursues this kind of study is known as a philologist.

In older usage, especially British, philology is more general, covering comparative and historical linguistics.

Classical philology studies classical languages. Classical philology principally originated from the Library of Pergamum and the Library of Alexandria around the fourth century BCE, and was continued by Greeks and Romans throughout the Roman/Byzantine Empire. It was preserved and promoted during the Islamic Golden Age, and eventually resumed by European scholars of the Renaissance, where it was soon joined by philologies of other European (Germanic, Celtic), Eurasian (Slavistics, etc.) and Asian (Arabic, Persian, Sanskrit, Chinese, etc.) languages. Indo-European studies involves the comparative philology of all Indo-European languages.

Philology, with its focus on historical development (diachronic analysis), is contrasted with linguistics due to Ferdinand de Saussure's insistence on the importance of synchronic analysis. The contrast continued with the emergence of structuralism and Chomskyan linguistics alongside its emphasis on syntax.


Proto-language

A proto-language, in the tree model of historical linguistics, is a language, usually hypothetical or reconstructed, and usually unattested, from which a number of attested known languages are believed to have descended by evolution, forming a language family.

In the strict sense, a proto-language is the most recent common ancestor of a language family, immediately before the family started to diverge into the attested daughter languages. It is therefore equivalent to the ancestral language or parental language of a language family.

Moreover, a group of idioms (such as a dialect cluster) which are not considered separate languages (for whatever reason) can also be described as descending from a unitary proto-language.

Occasionally, the German term Ursprache (from Ur- "primordial" and Sprache "language", pronounced [ˈuːɐ̯ʃpʁaːxə]) is used instead.


Relict

A relict is a surviving remnant of a natural phenomenon.

In biology a relict (or relic) is an organism that at an earlier time was abundant in a large area but now occurs at only one or a few small areas.

In ecology, an ecosystem which originally ranged over a large expanse, but is now narrowly confined, may be termed a relict.

In geology, a relict is a structure or mineral from a parent rock that did not undergo metamorphosis when the surrounding rock did, or a rock that survived a destructive geologic process.

In geomorphology, a relict landform is a landform formed by either erosive or constructive surficial processes that are no longer active as they were in the past.

In agronomy, a relict crop is a crop which was previously grown extensively, but is now only used in one limited region, or a small number of isolated regions.

In history (as revealed in DNA testing), a relict population is an ancient people in an area who have been largely supplanted by a later group of migrants and their descendants.

In real estate law, reliction is the gradual recession of water from its usual high-water mark so that the newly uncovered land becomes the property of the adjoining riparian property owner.

Other uses:

In addition, relict was an ancient term still used in colonial (British) America, and in England of that era, but now archaic, for a widow; it has come to be a generic or collective term for widows and widowers.

In historical linguistics, a relict is a word that is a survivor of a form or forms that are otherwise archaic.

Semantic change

Semantic change (also semantic shift, semantic progression, semantic development, or semantic drift) is a form of language change regarding the evolution of word usage—usually to the point that the modern meaning is radically different from the original usage. In diachronic (or historical) linguistics, semantic change is a change in one of the meanings of a word. Every word has a variety of senses and connotations, which can be added, removed, or altered over time, often to the extent that cognates across space and time have very different meanings. The study of semantic change can be seen as part of etymology, onomasiology, semasiology, and semantics.

