Linguistic typology

Linguistic typology is a field of linguistics that studies and classifies languages according to their structural and functional features. Its aim is to describe and explain the common properties and the structural diversity of the world's languages.[1] Its subdisciplines include, but are not limited to: qualitative typology, which deals with the issue of comparing languages and within-language variance; quantitative typology, which deals with the distribution of structural patterns in the world’s languages; theoretical typology, which explains these distributions; syntactic typology, which deals with word order, word form, word grammar and word choice; and lexical typology, which deals with language vocabulary.

Qualitative typology

Qualitative typology develops cross-linguistically viable notions or types that provide a framework for the description and comparison of individual languages. A few examples appear below.

Typological systems

Subject–verb–object positioning

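One set of types reflects the basic order of subject, verb, and direct object in sentences:

  • subject–object–verb (SOV)
  • subject–verb–object (SVO)
  • verb–subject–object (VSO)
  • verb–object–subject (VOS)
  • object–verb–subject (OVS)
  • object–subject–verb (OSV)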

These labels usually appear abbreviated as "SVO" and so forth, and may be called "typologies" of the languages to which they apply. The most commonly attested word orders are SOV and SVO. The least common are the object-initial orders, and OVS is the rarest of all, with only four attested instances.[2]

In the 1980s, linguists began to question the relevance of the geographical distribution of different values for various features of linguistic structure. They may have wanted to discover whether a particular grammatical structure found in one language is likewise found in another language in the same geographic location.[3]

Some languages split verbs into an auxiliary and an infinitive or participle, and put the subject and/or object between them. Examples include German (Ich habe einen Fuchs im Wald gesehen - *"I have a fox in-the woods seen"), Dutch (Hans vermoedde dat Jan Marie zag leren zwemmen - *"Hans suspected that Jan Marie saw to learn to swim") and Welsh (Mae'r gwirio sillafu wedi'i gwblhau - *"Is the checking spelling after its to complete"). In such cases, linguists base the typology on the non-analytic tenses (i.e. those sentences in which the verb is not split) or on the position of the auxiliary. German is thus SVO in main clauses and Welsh is VSO (with prepositional phrases placed after the infinitive).

Many typologists classify both German and Dutch as V2 languages, as the finite verb invariably occurs as the second element of a main clause.

Some languages allow varying degrees of freedom in their constituent order, posing a problem for their classification within the subject–verb–object schema. Languages with bound case markings for nouns, for example, tend to have more flexible word orders than languages where case is defined by position within a sentence or presence of a preposition. To define a basic constituent order type in this case, one generally looks at frequency of different types in declarative affirmative main clauses in pragmatically neutral contexts, preferably with only old referents. Thus, for instance, Russian is widely considered an SVO language, as this is the most frequent constituent order under such conditions—all sorts of variations are possible, though, and occur in texts. In many inflected languages, such as Russian, Latin, and Greek, departures from the default word-orders are permissible but usually imply a shift in focus, an emphasis on the final element, or some special context. In the poetry of these languages, the word order may also shift freely to meet metrical demands. Additionally, freedom of word order may vary within the same language—for example, formal, literary, or archaizing varieties may have different, stricter, or more lenient constituent-order structures than an informal spoken variety of the same language.

On the other hand, when there is no clear preference under the described conditions, the language is considered to have "flexible constituent order" (a type unto itself).
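The frequency-based procedure described above can be sketched in a few lines of Python. This is only an illustration: the function name, the 50% threshold for a "clear preference", and the toy sample are assumptions made for the example, not values taken from the typological literature.

```python
from collections import Counter

def basic_constituent_order(clause_orders, min_share=0.5):
    """Return the dominant order label, or "flexible" if none is clearly preferred.

    clause_orders: one label such as "SVO" per pragmatically neutral,
                   declarative affirmative main clause.
    min_share:     hypothetical cut-off for a "clear preference".
    """
    if not clause_orders:
        raise ValueError("no clauses to classify")
    order, count = Counter(clause_orders).most_common(1)[0]
    share = count / len(clause_orders)
    return order if share > min_share else "flexible"

# Invented sample of ten clauses, seven of them SVO.
sample = ["SVO"] * 7 + ["OVS"] * 2 + ["SOV"]
print(basic_constituent_order(sample))  # SVO
```

A sample in which no order exceeds the threshold is reported as "flexible", mirroring the separate type mentioned above.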

An additional problem is that in languages without living speech communities, such as Latin, Ancient Greek, and Old Church Slavonic, linguists have only written evidence, perhaps written in a poetic, formalizing, or archaic style that mischaracterizes the actual daily use of the language. The daily spoken language of Sophocles or Cicero might have exhibited a different or much more regular syntax than their written legacy indicates.

Morphosyntactic alignment

Another common classification distinguishes nominative–accusative alignment patterns and ergative–absolutive ones. In a language with cases, the classification depends on whether the subject (S) of an intransitive verb has the same case as the agent (A) or the patient (P) of a transitive verb. If a language has no cases, but the word order is AVP or PVA, then a classification may reflect whether the subject of an intransitive verb appears on the same side as the agent or the patient of the transitive verb. Bickel (2011) has argued that alignment should be seen as a construction-specific property rather than a language-specific property.[1]
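As a rough illustration, the case-based test can be written as a small Python function. The function name and the case labels passed in (NOM, ACC, ERG, ABS) are illustrative assumptions, and the comparison treats alignment as a property of a single set of markings, so it does not model construction-specific variation.

```python
def alignment(s_case, a_case, p_case):
    """Compare the case of S (intransitive subject) with A (agent) and P (patient)."""
    if s_case == a_case and s_case != p_case:
        return "nominative-accusative"   # S patterns with A
    if s_case == p_case and s_case != a_case:
        return "ergative-absolutive"     # S patterns with P
    return "other"                       # S matches both or neither

print(alignment("NOM", "NOM", "ACC"))  # nominative-accusative
print(alignment("ABS", "ERG", "ABS"))  # ergative-absolutive
```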

Many languages show mixed accusative and ergative behaviour (for example: ergative morphology marking the verb arguments, on top of an accusative syntax). Other languages (called "active languages") have two types of intransitive verbs—some of them ("active verbs") join the subject in the same case as the agent of a transitive verb, and the rest ("stative verbs") join the subject in the same case as the patient. Yet other languages behave ergatively only in some contexts (this "split ergativity" is often based on the grammatical person of the arguments or on the tense/aspect of the verb). For example, only some verbs in Georgian behave this way, and, as a rule, only while using the perfective (aorist).

Phonological systems

Linguistic typology also seeks to identify patterns in the structure and distribution of sound systems among the world's languages. This is accomplished by surveying and analyzing the relative frequencies of different phonological properties. These relative frequencies might, for example, be used to determine why contrastive voicing commonly occurs with plosives, as in English neat and need, but occurs much more rarely among fricatives, as in English niece and knees. According to a worldwide sample of 637 languages,[4] 62% have the voicing contrast in stops but only 35% have it in fricatives. In the vast majority of those cases, the contrast is absent because the language lacks voiced fricatives. In addition, all languages have some form of plosive, whereas some languages have no fricatives at all. The table below shows the breakdown of voicing properties among the languages in this sample.

Plosive voicing    Fricative voicing
                   Yes          No           Total
Yes                177          218          395 (62%)
No                 44           198          242 (38%)
Total              221 (35%)    416 (65%)    637
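The marginal totals and percentages quoted for this sample can be re-derived from the four inner cells. The sketch below is a simple arithmetic check, not part of the cited survey. It assumes that the cell for languages with both contrasts is 177, the only value consistent with the stated totals of 395 and 221.

```python
# Rows: voicing contrast in plosives; columns: voicing contrast in fricatives.
# Assumption: the yes/yes cell is 177, the value that agrees with the
# stated row total (395) and column total (221).
cells = {
    ("yes", "yes"): 177,
    ("yes", "no"): 218,
    ("no", "yes"): 44,
    ("no", "no"): 198,
}

total = sum(cells.values())
plosive_yes = cells[("yes", "yes")] + cells[("yes", "no")]
fricative_yes = cells[("yes", "yes")] + cells[("no", "yes")]

print(total, plosive_yes, fricative_yes)        # 637 395 221
print(round(100 * plosive_yes / total))         # 62
print(round(100 * fricative_yes / total))       # 35
```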


Languages worldwide also vary in the number of sounds they use, from very small phonemic inventories (Rotokas, with six consonants and five vowels) to very large ones (!Xóõ, with 128 consonants and 28 vowels). One phonological observation drawn from these data is that the larger a language's consonant inventory, the more likely it is to contain a sound from a defined set of complex consonants (clicks, glottalized consonants, doubly articulated labial-velar stops, lateral fricatives and affricates, uvular and pharyngeal consonants, and dental or alveolar non-sibilant fricatives). In a survey[4] of over 600 languages, only about 26% of those with small inventories (fewer than 19 consonants) contain a member of this set, compared with 51% of those with average inventories (19-25 consonants) and 69% of those with large inventories (more than 25 consonants). The likelihood of finding complex consonants thus rises with the size of the inventory.
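The size classes used in this comparison can be expressed as a small lookup. The thresholds and percentages below are the ones quoted above. The function and dictionary names are assumptions made for the sketch.

```python
def consonant_inventory_class(n_consonants):
    """Bin a consonant count into the three size classes used in the survey."""
    if n_consonants < 19:
        return "small"
    if n_consonants <= 25:
        return "average"
    return "large"

# Share of languages in each class with at least one complex consonant,
# as reported in the text above.
COMPLEX_CONSONANT_SHARE = {"small": 0.26, "average": 0.51, "large": 0.69}

for name, n in [("Rotokas", 6), ("!Xóõ", 128)]:
    size = consonant_inventory_class(n)
    print(name, size, COMPLEX_CONSONANT_SHARE[size])
```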

Vowel inventories are more modest in size: the average is 5-6 vowels, which 51% of the languages in the survey have. About a third of the languages have larger-than-average vowel inventories. Most notable, though, is the lack of relationship between consonant inventory size and vowel inventory size. The table below shows this lack of predictability between consonant and vowel inventory sizes.

Consonant inventory    Vowel quality inventory
                       Small        Average      Large        Total
Small                  47           153          65           265 (39%)
Average                34           105          98           237 (35%)
Large                  34           87           57           178 (26%)
Total                  115 (17%)    345 (51%)    220 (32%)    680
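One informal way to read this table is to convert each row into percentages, so that the spread of vowel inventory sizes can be compared across consonant inventory classes. The sketch below only restates the counts in the table. It is not a statistical test of independence, and the names used are assumptions for the example.

```python
# Outer keys: consonant inventory size; inner keys: vowel quality inventory size.
table = {
    "small":   {"small": 47, "average": 153, "large": 65},
    "average": {"small": 34, "average": 105, "large": 98},
    "large":   {"small": 34, "average": 87,  "large": 57},
}

def row_shares(row):
    """Percentage of each vowel inventory size within one consonant class."""
    total = sum(row.values())
    return {size: round(100 * count / total) for size, count in row.items()}

for consonant_size, row in table.items():
    print(consonant_size, row_shares(row))
```

In each row the average vowel inventory is the most common outcome, and its share does not rise or fall steadily as the consonant inventory grows.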


Quantitative typology

Quantitative typology deals with the distribution and co-occurrence of structural patterns in the languages of the world. Major types of non-chance distribution include:

  • preferences (for instance, absolute and implicational universals, semantic maps, and hierarchies)
  • correlations (for instance, areal patterns, such as with a Sprachbund)

Linguistic universals are patterns that can be seen cross-linguistically. Universals can be either absolute, meaning that every documented language exhibits the characteristic, or statistical, meaning that the characteristic is seen in most languages or is probable in most languages. Both absolute and statistical universals can be unrestricted, meaning that they apply to most or all languages without any additional conditions. Conversely, both can be restricted, or implicational, meaning that a characteristic holds on the condition of something else (if characteristic Y is true, then characteristic X is true).[5]
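The distinction between absolute and statistical implicational universals can be illustrated with a toy check over a language sample. Everything in this sketch is hypothetical: the four "languages", their feature values, the function name, and the 50% cut-off for "most". A result of "absolute" here means only that the sample contains no counterexample.

```python
def classify_implication(sample, condition, consequence, majority=0.5):
    """Classify 'if condition then consequence' over a sample of languages."""
    relevant = [lang for lang in sample if lang[condition]]
    if not relevant:
        return "untestable"              # no language satisfies the condition
    holds = sum(1 for lang in relevant if lang[consequence])
    if holds == len(relevant):
        return "absolute"                # no counterexample in the sample
    return "statistical" if holds / len(relevant) > majority else "unsupported"

# Invented feature values for four hypothetical languages.
sample = [
    {"name": "A", "Y": True, "X": True},
    {"name": "B", "Y": True, "X": True},
    {"name": "C", "Y": False, "X": False},
    {"name": "D", "Y": True, "X": False},   # counterexample to "if Y then X"
]

print(classify_implication(sample, "Y", "X"))      # statistical
print(classify_implication(sample[:3], "Y", "X"))  # absolute
```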



  1. Bickel, B. "What is typology? - a short note" (PDF). (in German). Retrieved 2017-03-06.
  2. Gell-Mann, Murray; Ruhlen, Merritt (2011-10-18). "The origin and evolution of word order". Proceedings of the National Academy of Sciences of the United States of America. 108 (42): 17290–17295. doi:10.1073/pnas.1113716108. ISSN 0027-8424. PMC 3198322. PMID 21987807.
  3. Comrie, Bernard, et al. "Chapter Introduction". WALS Online - Chapter Introduction, The World Atlas of Language Structures Online, 2013.
  4. Song, J.J. (ed.) (2011). The Oxford Handbook of Linguistic Typology. Oxford: Oxford University Press. ISBN 978-0-19-928125-1.
  5. Moravcsik, Edith (2013). Introducing Language Typology. Cambridge, London: Cambridge University Press. p. 9.


Analytic language

In linguistic typology, an analytic language is a language that primarily conveys relationships between words in sentences by way of helper words (particles, prepositions, etc.) and word order, as opposed to utilizing inflections (changing the form of a word to convey its role in the sentence). For example, the English-language phrase "The cat chases the ball" conveys the fact that the cat is acting on the ball analytically via word order. This contrasts with synthetic languages, which rely heavily on inflections to convey word relationships. English retains some inflection of this kind: the phrases "The cat chases the ball" and "The cat chased the ball" convey different time frames by changing the form of the word chase. Most languages are not purely analytic, but many rely primarily on analytic syntax.

Typically, analytic languages have a low morpheme-per-word ratio, especially with respect to inflectional morphemes. A grammatical construction can similarly be analytic if it uses unbound morphemes, which are separate words, and/or word order. Analytic languages rely more heavily on the use of definite and indefinite articles, which tend to be less prominently used or absent in strongly synthetic languages; stricter word order; various prepositions, postpositions, particles, and modifiers; and context.
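The morpheme-per-word ratio can be computed from text in which morpheme boundaries are marked. The sketch below assumes hyphens mark those boundaries. The segmentation of the English sentence is an assumption made for the example, while the Yupik segmentation is the one given in the section on polysynthetic languages.

```python
def morphemes_per_word(segmented_text):
    """Average number of hyphen-separated morphemes per whitespace-separated word."""
    words = segmented_text.split()
    if not words:
        raise ValueError("empty text")
    morphemes = sum(len(word.split("-")) for word in words)
    return morphemes / len(words)

# English: six morphemes in five words.
print(morphemes_per_word("the cat chase-s the ball"))  # 1.2

# Yupik: one word made of seven morphemes.
print(morphemes_per_word("tuntu-ssur-qatar-ni-ksaite-ngqiggte-uq"))  # 7.0
```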

Branching (linguistics)

In linguistics, branching refers to the shape of the parse trees that represent the structure of sentences. Assuming that the language is being written or transcribed from left to right, parse trees that grow down and to the right are right-branching, and parse trees that grow down and to the left are left-branching. The direction of branching reflects the position of heads in phrases, and in this regard, right-branching structures are head-initial, whereas left-branching structures are head-final. English has both right-branching (head-initial) and left-branching (head-final) structures, although it is more right-branching than left-branching. Some languages such as Japanese and Turkish are almost fully left-branching (head-final). Some languages are mostly right-branching (head-initial).
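The idea of branching direction can be made concrete with a toy representation in which a parse tree is a nested pair and a leaf is a string. The sketch below counts how many binary nodes continue downward on the right and how many on the left. The representation, the function name, and the abstract example trees are assumptions for illustration. Real parse trees carry category labels and are not always binary.

```python
def branching_counts(tree):
    """Count binary nodes whose only non-leaf child is on the right or on the left.

    Nodes with two leaf children, or with two complex children, are not counted.
    """
    counts = {"right": 0, "left": 0}

    def visit(node):
        if isinstance(node, str):        # leaf: nothing to count
            return
        left, right = node
        if isinstance(left, str) and isinstance(right, tuple):
            counts["right"] += 1
        elif isinstance(left, tuple) and isinstance(right, str):
            counts["left"] += 1
        visit(left)
        visit(right)

    visit(tree)
    return counts

right_branching = ("a", ("b", ("c", "d")))
left_branching = ((("a", "b"), "c"), "d")
print(branching_counts(right_branching))  # {'right': 2, 'left': 0}
print(branching_counts(left_branching))   # {'right': 0, 'left': 2}
```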

Dependent-marking language

A dependent-marking language has grammatical markers of agreement and case government between the words of phrases that tend to appear more on dependents than on heads. The distinction between head-marking and dependent-marking was first explored by Johanna Nichols in 1986, and has since become a central criterion in language typology in which languages are classified according to whether they are more head-marking or dependent-marking. Many languages employ both head and dependent-marking, but some employ double-marking, and yet others employ zero-marking. However, it is not clear that the head of a clause has anything to do with the head of a noun phrase, or even what the head of a clause is.

Head-marking language

A language is head-marking if the grammatical marks showing agreement between different words of a phrase tend to be placed on the heads (or nuclei) of phrases, rather than on the modifiers or dependents. Many languages employ both head-marking and dependent-marking, and some languages double up and are thus double-marking. The concept of head/dependent-marking was proposed by Johanna Nichols in 1986 and has come to be widely used as a basic category in linguistic typology.

Incorporation (linguistics)

Incorporation is a phenomenon by which a grammatical category, such as a verb, forms a compound with its direct object (object incorporation) or adverbial modifier, while retaining its original syntactic function. The inclusion of a noun qualifies the verb, narrowing its scope rather than making reference to a specific entity.

Incorporation is central to many polysynthetic languages such as those found in North America, Siberia and northern Australia. However, polysynthesis does not necessarily imply incorporation (Mithun 2009); neither does the presence of incorporation in a language imply that that language is polysynthetic.

Linguistic universal

A linguistic universal is a pattern that occurs systematically across natural languages, potentially true for all of them. Examples are "All languages have nouns and verbs" and "If a language is spoken, it has consonants and vowels". Research in this area of linguistics is closely tied to the study of linguistic typology, and aims to reveal generalizations across languages, likely tied to cognition, perception, or other abilities of the mind. The field was largely pioneered by the linguist Joseph Greenberg, who derived a set of forty-five basic universals, mostly dealing with syntax, from a study of some thirty languages.

OV language

In linguistics, an OV language is a language in which the object comes before the verb. OV languages make up approximately forty-seven percent of documented languages. They are primarily left-branching, or head-final, with heads often found at the end of their phrases. As a result, they tend to place adjectives before nouns, to place adpositions after the noun phrases they govern (in other words, to use postpositions), to put relative clauses before their referents, and to place auxiliary verbs after the action verb. Of the OV languages that make use of affixes, many prefer suffixation to prefixation, predominantly or even exclusively, as in the case of Turkish.

For example, English would be considered a VO language, and Japanese and Korean would be considered to be OV.

Japanese: Inu ga neko (object) o oikaketa (verb)

English: The dog chased (verb) the cat (object)

Korean: 개는 고양이를 쫓았다. Gae-neun go-yang-i-reul (object) jjo-chatt-da (verb)

English: The dog chased (verb) the cat (object)

Some languages, such as Finnish, Hungarian, Russian, and Yiddish, use both OV and VO constructions, but in other instances, such as Early Middle English, some dialects may use VO and others OV. Languages that contain both OV and VO construction may solidify into one or the other construction. A language that moves the verb or verb phrase more than the object will have surface VO word order, and a language that moves the object more than the verb or verb phrase will have surface OV word order.

Oligoisolating language

An oligoisolating language (from the Greek ὀλίγος, meaning "few" or "little") is any language using very few morphemes which tend towards an isolating structure of statements. Oligoisolation is almost entirely theoretical and would necessitate long, modifier-heavy sentences.

No natural language has oligoisolating properties; such properties are nevertheless present in some constructed languages, such as Toki Pona and aUI.

Oligosynthetic language

An oligosynthetic language (from the Greek ὀλίγος, meaning "few" or "little") is any language using very few morphemes, perhaps only a hundred, which combine synthetically to form statements. Oligosynthesis is almost entirely theoretical and would depend heavily on the creation of lengthy compound words, to an extent far exceeding that of natural polysynthetic languages.

Because no natural language has been shown to exhibit oligosynthetic properties, some linguists regard true oligosynthesis as impossible or impractical for productive use by humans; its use is limited to some constructed languages, such as Ygyde, Newspeak, Sona, Toki Pona and aUI. The Native American languages Nahuatl and Blackfoot have in the past been claimed to exhibit oligosynthetic qualities (most notably by Benjamin Whorf). However, the linguistic community has largely rejected these claims, preferring to categorize Nahuatl and Blackfoot as polysynthetic.

Polysynthetic language

In linguistic typology, polysynthetic languages are highly synthetic languages, i.e. languages in which words are composed of many morphemes (word parts that have independent meaning but may or may not be able to stand alone). They are very highly inflected languages. Polysynthetic languages typically have long "sentence-words" such as the Yupik word tuntussuqatarniksaitengqiggtuq, which means "He had not yet said again that he was going to hunt reindeer." The word consists of the morphemes tuntu-ssur-qatar-ni-ksaite-ngqiggte-uq with the meanings reindeer-hunt-future-say-negation-again-third person-singular-indicative; except for the morpheme tuntu "reindeer", none of the morphemes can appear in isolation.

Whereas isolating languages have a low morpheme-to-word ratio, polysynthetic languages have a very high ratio. There is no generally agreed upon definition of polysynthesis. Polysynthetic languages generally have polypersonal agreement, although some agglutinative languages that are not polysynthetic also have it, such as Basque, Hungarian and Georgian. Some authors apply the term polysynthetic to languages with high morpheme-to-word ratios, but others use it for languages that are highly head-marking, or those that frequently use noun incorporation.

Polysynthetic languages can be agglutinative or fusional depending on whether they encode one or multiple grammatical categories per affix.

At the same time, the question of whether to call a particular language polysynthetic is complicated by the fact that morpheme and word boundaries are not always clear cut, and languages may be highly synthetic in one area but less synthetic in other areas (e.g., verbs and nouns in Southern Athabaskan languages or Inuit languages). Many polysynthetic languages display complex evidentiality and/or mirativity systems in their verbs.

The term was invented by Peter Stephen Du Ponceau, who considered polysynthesis, as characterized by sentence words and noun incorporation, a defining feature of all Native American languages. This characterization was shown to be wrong, since many indigenous American languages are not polysynthetic. Polysynthetic languages are nonetheless unevenly distributed throughout the world: they are more frequent in the Americas, Australia, Siberia, and New Guinea, although there are examples in other areas as well. The concept became part of linguistic typology with the work of Edward Sapir, who used it as one of his basic typological categories. Recently, Mark C. Baker has suggested formally defining polysynthesis as a macro-parameter within Noam Chomsky's principles and parameters theory of grammar. Other linguists question the basic utility of the concept for typology since it covers many separate morphological types that have little else in common.

Research Centre for Linguistic Typology

The Research Centre for Linguistic Typology is a research institute founded in 1998 at the Australian National University by R. M. W. Dixon. It moved to La Trobe University, Melbourne, in 2000. It is an internationally recognized centre of research on fieldwork linguistics, language documentation and linguistic typology.

Dixon and Alexandra Aikhenvald headed the institute until 2008, when they accepted positions at James Cook University in Cairns. Since then, the institute has been directed by Randy LaPolla.

Robert M. W. Dixon

Robert Malcolm Ward Dixon (born in Gloucester, England, 25 January 1939) is a Professor of Linguistics in the College of Arts, Society, and Education and The Cairns Institute, James Cook University (JCU), Queensland. He is also Deputy Director of The Language and Culture Research Centre at JCU. He holds a Doctor of Letters (DLitt, ANU, 1991) and was awarded an Honorary Doctor of Letters honoris causa by JCU in 2018. A Fellow of the British Academy, a Fellow of the Australian Academy of the Humanities, and an Honorary Member of the Linguistic Society of America, he is one of three living linguists to be specifically mentioned in The Concise Oxford Dictionary of Linguistics by P. H. Matthews (Oxford: Oxford University Press, 2014).

Secundative language

A secundative language is a language in which the recipients of ditransitive verbs (verbs that take a subject and two objects: a theme and a recipient) are treated like the patients (targets) of monotransitive verbs (verbs that take only one object), and the themes get distinct marking. Secundative languages contrast with indirective languages, where the recipient is treated in a special way.

While English is mostly not a secundative language, there are some examples. The sentence John gave Mary the ball uses this construction, where the ball is the theme and Mary is the recipient.
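The contrast between secundative and indirective patterns comes down to which ditransitive object is marked like the monotransitive patient. A minimal Python sketch of that comparison follows. The function name and the marking labels ("bare", "oblique") are illustrative assumptions and are not an analysis of any particular language.

```python
def ditransitive_alignment(p_marking, t_marking, r_marking):
    """Compare the marking of the theme (T) and recipient (R) with the patient (P)."""
    if r_marking == p_marking and t_marking != p_marking:
        return "secundative"    # recipient treated like the patient
    if t_marking == p_marking and r_marking != p_marking:
        return "indirective"    # theme treated like the patient
    return "other"

print(ditransitive_alignment("bare", "oblique", "bare"))  # secundative
print(ditransitive_alignment("bare", "bare", "oblique"))  # indirective
```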

Topic-prominent language

A topic-prominent language is a language that organizes its syntax to emphasize the topic–comment structure of the sentence. The term is best known in American linguistics from Charles N. Li and Sandra Thompson, who distinguished topic-prominent languages, such as Korean and Japanese, from subject-prominent languages, such as English.

In Li and Thompson's (1976) view, topic-prominent languages have morphology or syntax that highlights the distinction between the topic and the comment (what is said about the topic). Topic–comment structure may be independent of the syntactic ordering of subject, verb and object.

Tripartite language

A tripartite language, also called an ergative–accusative language, is one that treats the agent of a transitive verb, the patient of a transitive verb, and the single argument of an intransitive verb each in different ways. This contrasts with nominative–accusative and ergative–absolutive languages. If the language has morphological case, the arguments are marked in this way:

  • the agent of a transitive verb takes the ergative case
  • the object of a transitive verb takes the accusative case
  • the single argument of an intransitive verb takes the intransitive case

UCLA Phonological Segment Inventory Database

The UCLA Phonological Segment Inventory Database (or UPSID) is a statistical survey of the phoneme inventories in 451 of the world's languages. The database was created by American phonetician Ian Maddieson for the University of California, Los Angeles (UCLA) in 1984 and has been updated several times.

VO language

In linguistics, a VO language is a language in which the verb typically comes before the object. VO languages make up about 53% of documented languages. For example, Japanese would be considered an OV language, and English would be considered to be VO. A basic sentence demonstrating this would be as follows.

Japanese: Inu ga neko (object) o oikaketa (verb)

English: The dog chased (verb) the cat (object)

Winfred P. Lehmann was the first to propose the reduction of the six possible permutations of word order to just two main ones, VO and OV, in what he calls the Fundamental Principle of Placement (FPP), arguing that the subject is not a primary element of a sentence. VO languages are primarily right-branching, or head-initial: heads are generally found at the beginning of their phrases.
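The reduction of six orders to two can be sketched as a function that ignores the subject and looks only at the relative position of object and verb. The function name and the input format (a three-letter label such as "SOV") are assumptions made for this example.

```python
def reduce_to_ov_vo(order):
    """Map a three-letter constituent order such as "SOV" to "OV" or "VO"."""
    order = order.upper()
    if sorted(order) != ["O", "S", "V"]:
        raise ValueError(f"not a permutation of S, V and O: {order!r}")
    # Only the relative position of object and verb matters.
    return "OV" if order.index("O") < order.index("V") else "VO"

for label in ["SOV", "OSV", "OVS", "SVO", "VSO", "VOS"]:
    print(label, reduce_to_ov_vo(label))
```

Under this mapping Japanese (SOV) falls into the OV group and English (SVO) into the VO group, matching the example sentences above.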

VO languages tend to favor prepositions over postpositions: only 42 of the 498 documented VO languages use postpositions.

Some languages, such as Finnish, Hungarian, Russian, and Yiddish, use both VO and OV constructions, but in other instances, such as Early Middle English, some dialects may use VO and others OV. Languages that contain both OV and VO construction may solidify into one or the other construction. A language that moves the verb or verb phrase more than the object will have surface VO word order, and a language that moves the object more than the verb or verb phrase will have surface OV word order.

World Atlas of Language Structures

The World Atlas of Language Structures (WALS) is a database of structural (phonological, grammatical, lexical) properties of languages gathered from descriptive materials. It was first published by Oxford University Press as a book with CD-ROM in 2005, and was released as the second edition on the Internet in April 2008. It is maintained by the Max Planck Institute for Evolutionary Anthropology and by the Max Planck Digital Library. The editors are Martin Haspelmath, Matthew S. Dryer, David Gil and Bernard Comrie.

The atlas provides information on the location, linguistic affiliation and basic typological features of a great number of the world's languages. It interacts with Google Maps. The information of the atlas is published under a Creative Commons license.

