Phonotactics (from Ancient Greek phōnḗ "voice, sound" and taktikós "having to do with arranging")[1] is a branch of phonology that deals with restrictions in a language on the permissible combinations of phonemes. Phonotactics defines permissible syllable structure, consonant clusters and vowel sequences by means of phonotactic constraints.

Phonotactic constraints are highly language specific. For example, in Japanese, consonant clusters like /st/ do not occur. Similarly, the clusters /kn/ and /ɡn/ are not permitted at the beginning of a word in Modern English but are in German and Dutch (in which the latter appears as /ɣn/) and were permitted in Old and Middle English. In contrast, in some Slavic languages /l/ and /r/ are used alongside vowels as syllable nuclei.

Syllables have an internal segmental structure consisting of an optional onset, a nucleus, and an optional coda; the nucleus and coda together form the rime.

Both onset and coda may be empty, forming a vowel-only syllable, or alternatively, the nucleus can be occupied by a syllabic consonant. Phonotactics is known to affect second language vocabulary acquisition.[2]

English phonotactics

The English syllable (and word) twelfths /twɛlfθs/ is divided into the onset /tw/, the nucleus /ɛ/ and the coda /lfθs/; thus, it can be described as CCVCCCC (C = consonant, V = vowel). On this basis it is possible to form rules for which representations of phoneme classes may fill the cluster. For instance, English allows at most three consonants in an onset, but among native words under standard accents (and excluding a few obscure learned words such as sphragistics), phonemes in a three-consonantal onset are limited to the following scheme:[3]

/s/ + stop + approximant (with the nasal /m/ marginally filling the stop slot):
  • /s/ + /m/ + /j/ (as in smew)
  • /s/ + /t/ + /ɹ/
  • /s/ + /t/ + /j/ (not in most accents of American English)
  • /s/ + /p/ + /j ɹ l/
  • /s/ + /k/ + /j ɹ l w/
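
The scheme above can be encoded as a simple lookup table; a minimal sketch, with the phoneme symbols and function name chosen purely for illustration:

```python
# The permitted three-consonant onsets of English, per the scheme above.
# Symbols are illustrative IPA strings, not a standard machine-readable set.
VALID_CCC_ONSETS = {
    ("s", "m", "j"),
    ("s", "t", "ɹ"), ("s", "t", "j"),
    ("s", "p", "j"), ("s", "p", "ɹ"), ("s", "p", "l"),
    ("s", "k", "j"), ("s", "k", "ɹ"), ("s", "k", "l"), ("s", "k", "w"),
}

def is_valid_ccc_onset(onset):
    """Return True if a three-consonant onset matches the scheme."""
    return tuple(onset) in VALID_CCC_ONSETS

print(is_valid_ccc_onset(["s", "t", "ɹ"]))  # "str" as in street -> True
print(is_valid_ccc_onset(["b", "l", "j"]))  # *"blj" -> False
```

The table-lookup approach works here because the set of licit three-consonant onsets is small and closed; a rule-based checker would be needed for two-consonant onsets, which are far more varied.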

This constraint can be observed in the pronunciation of the word blue: originally, the vowel of blue was identical to the vowel of cue, approximately [iw]. In most dialects of English, [iw] shifted to [juː]. Theoretically, this would produce *[bljuː]. The cluster [blj], however, infringes the constraint for three-consonantal onsets in English. Therefore, the pronunciation has been reduced to [bluː] by elision of the [j] in what is known as yod-dropping.

Not all languages have this constraint: compare Spanish pliegue [ˈpljeɣe] or French pluie [plɥi].

Constraints on English phonotactics include:[4]

  • All syllables have a nucleus
  • No geminate consonants
  • No onset /ŋ/
  • No /h/ in the syllable coda
  • No affricates or /h/ in complex onsets
  • The first consonant in a complex onset must be an obstruent (e.g. stop; combinations such as *ntat or *rkoop, with a sonorant, are not allowed)
  • The second consonant in a complex onset must not be a voiced obstruent (e.g. *zdop does not occur)
  • If the first consonant in a complex onset is not /s/, the second must be a liquid or a glide
  • Every subsequence contained within a sequence of consonants must obey all the relevant phonotactic rules (the substring principle)
  • No glides in syllable codas (excluding the offglides of diphthongs)
  • The second consonant in a complex coda must not be /r/, /ŋ/, /ʒ/, or /ð/ (compare "asthma", typically pronounced /ˈæzmə/ or /ˈæsmə/, but rarely /ˈæzðmə/)
  • If the second consonant in a complex coda is voiced, so is the first
  • An obstruent following /m/ or /ŋ/ in a coda must be voiced and homorganic with the nasal
  • Two obstruents in the same coda must share voicing (compare kids /kɪdz/ with kits /kɪts/)
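
A few of these constraints can be checked mechanically. The sketch below assumes a syllable given as onset, nucleus, and coda lists of phoneme symbols; the phoneme classes are simplified stand-ins and the function name is invented for this example, not a standard analysis.

```python
# Simplified English obstruent classes (illustrative, not exhaustive).
VOICED_OBSTRUENTS = {"b", "d", "g", "v", "z", "ð", "ʒ", "dʒ"}
VOICELESS_OBSTRUENTS = {"p", "t", "k", "f", "s", "θ", "ʃ", "tʃ", "h"}
OBSTRUENTS = VOICED_OBSTRUENTS | VOICELESS_OBSTRUENTS

def check_syllable(onset, nucleus, coda):
    """Return a list of violated constraints (a subset of those above)."""
    errors = []
    if not nucleus:
        errors.append("every syllable needs a nucleus")
    if "ŋ" in onset:
        errors.append("no onset /ŋ/")
    if "h" in coda:
        errors.append("no /h/ in the coda")
    if len(onset) >= 2:
        if onset[0] not in OBSTRUENTS:
            errors.append("first consonant of a complex onset must be an obstruent")
        if onset[1] in VOICED_OBSTRUENTS:
            errors.append("second consonant of a complex onset must not be a voiced obstruent")
    # Two obstruents in the same coda must share voicing (kids vs. kits).
    coda_obs = [c for c in coda if c in OBSTRUENTS]
    if len(coda_obs) >= 2 and len({c in VOICED_OBSTRUENTS for c in coda_obs}) > 1:
        errors.append("coda obstruents must agree in voicing")
    return errors

print(check_syllable(["k"], ["ɪ"], ["d", "z"]))  # kids -> []
print(check_syllable(["z", "d"], ["ɒ"], ["p"]))  # *zdop -> one violation
```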

Sonority Sequencing Principle

Segments of a syllable are universally distributed according to what is called the Sonority Sequencing Principle (SSP), which states that, in any syllable, the nucleus has maximal sonority and that sonority decreases with distance from the nucleus. Sonority is a measure of the amplitude of a speech sound. The particular ranking of speech sounds by sonority, called the sonority hierarchy, is language-specific, but in its broad lines it hardly varies from one language to another,[5] which means that all languages form their syllables in approximately the same way with regard to sonority.

To illustrate the SSP, the voiceless alveolar fricative [s] is lower on the sonority hierarchy than the alveolar lateral approximant [l], so the combination /sl/ is permitted in onsets and /ls/ in codas, while */ls/ onsets and */sl/ codas are disallowed. Hence slips /slɪps/ and pulse /pʌls/ are possible English words while *lsips and *pusl are not.
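
The SSP check can be sketched as follows, with an assumed toy sonority scale; real hierarchies are more fine-grained, and English appendix consonants such as the final /s/ of slips would need separate treatment, so the example sticks to pulse and *pusl.

```python
# A toy sonority scale (higher = more sonorous): stops < fricatives <
# nasals < liquids < glides < vowels. Values are illustrative only.
SONORITY = {"p": 1, "t": 1, "k": 1, "s": 2, "z": 2,
            "m": 3, "n": 3, "l": 4, "r": 4, "j": 5, "w": 5,
            "i": 6, "u": 6, "ʌ": 6, "e": 7, "o": 7, "a": 8}

def obeys_ssp(syllable, nucleus_index):
    """Sonority must rise strictly up to the nucleus and fall after it."""
    son = [SONORITY[seg] for seg in syllable]
    rising = all(a < b for a, b in zip(son[:nucleus_index], son[1:nucleus_index + 1]))
    falling = all(a > b for a, b in zip(son[nucleus_index:], son[nucleus_index + 1:]))
    return rising and falling

print(obeys_ssp(list("pʌls"), 1))  # pulse -> True
print(obeys_ssp(list("pusl"), 1))  # *pusl -> False
```

Because the comparison is strict, sonority plateaus (discussed below) also come out as violations, which matches the standard reading of the principle.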

The SSP expresses a very strong cross-linguistic tendency; however, it does not account for the patterns of all complex syllable margins. It may be violated in two ways. The first occurs when two segments in a margin have the same sonority, which is known as a sonority plateau. Such margins are found in a few languages, including English, as in the words sphinx and fact (though note that *phsinx and *fatc both violate English phonotactics).

The second instance of violation of the SSP is when a peripheral segment of a margin has a higher sonority than a segment closer to the nucleus. These margins are known as reversals and occur in some languages, including English (steal [stiːɫ], bets /bɛts/) and French (dextre /dɛkstʁ/, originally /dɛkstʁə/; strict /stʁikt/).[6]

Notes and references


  1. ^ φωνή, τακτικός. Liddell, Henry George; Scott, Robert; A Greek–English Lexicon at the Perseus Project
  2. ^ Laufer 1997.
  3. ^ Crystal, David (2003). The Cambridge Encyclopedia of the English Language. Cambridge University Press. p. 243. ISBN 978-0-521-53033-0.
  4. ^ Harley, Heidi (2003). English Words: A Linguistic Introduction. Wiley-Blackwell. pp. 58–69. ISBN 0631230327.
  5. ^ Jany, Carmen; Gordon, Matthew; Nash, Carlos M; Takara, Nobutaka (2007-01-01). "How Universal is the Sonority Hierarchy?: A Cross-Linguistic Acoustic Study". ResearchGate.
  6. ^ Carlisle, Robert S. (2001-06-01). "Syllable structure universals and second language acquisition". International Journal of English Studies. 1 (1). doi:10.6018/ijes.1.1.47581. ISSN 1578-7044.


  • Bailey, Todd M. & Hahn, Ulrike. 2001. Determinants of wordlikeness: Phonotactics or lexical neighborhoods? Journal of Memory and Language 44: 568–591.
  • Coleman, John S. & Pierrehumbert, Janet. 1997. Stochastic phonological grammars and acceptability. Computational Phonology 3: 49–56.
  • Frisch, S.; Large, N. R.; & Pisoni, D. B. 2000. Perception of wordlikeness: Effects of segment probability and length on processing non-words. Journal of Memory and Language 42: 481–496.
  • Gathercole, Susan E. & Martin, Amanda J. 1996. Interactive processes in phonological memory. In Cognitive models of memory, edited by Susan E. Gathercole. Hove, UK: Psychology Press.
  • Hammond, Michael. 2004. Gradience, phonotactics, and the lexicon in English phonology. International Journal of English Studies 4: 1–24.
  • Gaygen, Daniel E. 1997. Effects of probabilistic phonotactics on the segmentation of continuous speech. Doctoral dissertation, University at Buffalo, Buffalo, NY.
  • Greenberg, Joseph H. & Jenkins, James J. 1964. Studies in the psychological correlates of the sound system of American English. Word 20: 157–177.
  • Laufer, B. (1997). "What's in a word that makes it hard or easy? Some intralexical factors that affect the learning of words". Vocabulary: Description, Acquisition and Pedagogy. Cambridge: Cambridge University Press. pp. 140–155. ISBN 9780521585514.
  • Luce, Paul A. & Pisoni, Daniel B. 1998. Recognizing spoken words: The neighborhood activation model. Ear and Hearing 19: 1–36.
  • Newman, Rochelle S.; Sawusch, James R.; & Luce, Paul A. 1996. Lexical neighborhood effects in phonetic processing. Journal of Experimental Psychology: Human Perception and Performance 23: 873–889.
  • Ohala, John J. & Ohala, M. 1986. Testing hypotheses regarding the psychological manifestation of morpheme structure constraints. In Experimental phonology, edited by John J. Ohala & Jeri J. Jaeger, 239–252. Orlando, FL: Academic Press.
  • Pitt, Mark A. & McQueen, James M. 1998. Is compensation for coarticulation mediated by the lexicon? Journal of Memory and Language 39: 347–370.
  • Storkel, Holly L. 2001. Learning new words: Phonotactic probability in language development. Journal of Speech, Language, and Hearing Research 44: 1321–1337.
  • Storkel, Holly L. 2003. Learning new words II: Phonotactic probability in verb learning. Journal of Speech, Language, and Hearing Research 46: 1312–1323.
  • Vitevitch, Michael S. & Luce, Paul A. 1998. When words compete: Levels of processing in perception of spoken words. Psychological Science 9: 325–329.
  • Vitevitch, Michael S. & Luce, Paul A. 1999. Probabilistic phonotactics and neighborhood activation in spoken word recognition. Journal of Memory and Language 40: 374–408.
  • Vitevitch, Michael S.; Luce, Paul A.; Charles-Luce, Jan; & Kemmerer, David. 1997. Phonotactics and syllable stress: Implications for the processing of spoken nonsense words. Language and Speech 40: 47–62.
  • Vitevitch, Michael S.; Luce, Paul A.; Pisoni, David B.; & Auer, Edward T. 1999. Phonotactics, neighborhood activation, and lexical access for spoken words. Brain and Language 68: 306–311.

Apheresis (linguistics)

In phonetics and phonology, apheresis (British English: aphaeresis) is the loss of one or more sounds from the beginning of a word, especially the loss of an unstressed vowel, thus producing a new form called an aphetism.

Chru language

Chru (Vietnamese: Chu Ru) is a Chamic language of Vietnam spoken by the Chru people in southern Lâm Đồng Province (especially in Đơn Dương District) and in Bình Thuận Province.

Like the other Chamic languages spoken in Vietnam (Cham, Jarai, Rade and Roglai), use of Chru is declining as native speakers are generally bilingual in Vietnamese, which is used for most official or public settings, like schools.

Consonant cluster

In linguistics, a consonant cluster, consonant sequence or consonant compound is a group of consonants which have no intervening vowel. In English, for example, the groups /spl/ and /ts/ are consonant clusters in the word splits.

Some linguists argue that the term can be properly applied only to those consonant clusters that occur within one syllable. Others claim that the concept is more useful when it includes consonant sequences across syllable boundaries. According to the former definition, the longest consonant clusters in the word extra would be /ks/ and /tr/, whereas the latter allows /kstr/.
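The two definitions can be contrasted with a small sketch; the segment strings and the syllabification eks.tra for extra are simplifications chosen for illustration.

```python
# Orthographic vowels stand in for vowel phonemes in this toy example.
VOWELS = set("aeiou")

def clusters(segments):
    """Return maximal runs of two or more consecutive consonants."""
    runs, run = [], []
    for seg in segments:
        if seg in VOWELS:
            if len(run) >= 2:
                runs.append("".join(run))
            run = []
        else:
            run.append(seg)
    if len(run) >= 2:
        runs.append("".join(run))
    return runs

# Definition allowing clusters across syllable boundaries: one long cluster.
print(clusters("ekstra"))  # ['kstr']
# Definition restricted to single syllables (eks.tra): two shorter clusters.
print([c for syl in ("eks", "tra") for c in clusters(syl)])  # ['ks', 'tr']
```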

Edo language

Edo (with diacritics, Ẹ̀dó), also called Bini (Benin), is a Volta–Niger language spoken in Edo State, Nigeria. It is the primary native language of the Edo people and was the primary language of the Benin Empire and its predecessor, Igodomigodo.

English phonology

Like many other languages, English has wide variation in pronunciation, both historically and from dialect to dialect. In general, however, the regional dialects of English share a largely similar (but not identical) phonological system. Among other things, most dialects have vowel reduction in unstressed syllables and a complex set of phonological features that distinguish fortis and lenis consonants (stops, affricates, and fricatives). Most dialects of English preserve the consonant /w/ (spelled ⟨w⟩) and many preserve /θ, ð/ (spelled ⟨th⟩), while most other Germanic languages have shifted them to /v/ and /t, d/: compare English will and then with German will [vɪl] ('want') and denn [dɛn] ('because').

Phonological analysis of English often concentrates on or uses, as a reference point, one or more of the prestige or standard accents, such as Received Pronunciation for England, General American for the United States, and General Australian for Australia. Nevertheless, many other dialects of English are spoken, which have developed independently from these standardized accents, particularly regional dialects. Information about these standardized accents functions only as a limited guide to all of English phonology, which one can later expand upon once one becomes more familiar with some of the many other dialects of English that are spoken.


In phonology, epenthesis (Greek ἐπένθεσις) means the addition of one or more sounds to a word, especially to the interior of a word (addition at the beginning is called prothesis, and at the end paragoge). The word epenthesis comes from epi- "in addition to" and en "in" and thesis "putting". Epenthesis may be divided into two types: excrescence, for the addition of a consonant, and anaptyxis, for the addition of a vowel. The opposite process, in which one or more sounds are removed, is referred to as elision.

Finnish phonology

Unless otherwise noted, statements in this article refer to Standard Finnish, which is based on the dialect spoken in the former Häme Province in central south Finland. Standard Finnish is used by professional speakers, such as reporters and news presenters on television.

Linking and intrusive R

Linking R and intrusive R are sandhi or linking phenomena involving the appearance of the rhotic consonant (which normally corresponds to the letter ⟨r⟩) between two consecutive morphemes where it would not normally be pronounced. These phenomena occur in many non-rhotic varieties of English, such as those in most of England and Wales, parts of the United States, and all of the Anglophone societies of the southern hemisphere, with the exception of South Africa. These phenomena first appeared in English sometime after the year 1700.


A pseudoword or non-word is a unit of speech or text that appears to be an actual word in a certain language while in fact having no meaning in the lexicon. It is a kind of non-lexical vocable. Such a word is composed of a combination of letters that can be pronounced and that conforms to the language's spelling rules. Although a word without a meaning in a given language, or without occurrence in any text corpus or dictionary, can be the result of (the interpretation of) a truly random signal, there will usually be an underlying deterministic source, as is the case for:

  • nonsense words (e.g. jabberwocky)
  • nonce words
  • ghost words (e.g. dord)
  • typos

When nonsensical words are strung together, gibberish may arise. Word salad, in contrast, may contain legible and intelligible words but without semantic or syntactic correlation or coherence.

Qimant language

The Qimant language is a highly endangered language spoken by a small and elderly fraction of the Qemant people in northern Ethiopia, mainly in the Chilga woreda in Semien Gondar Zone between Gondar and Metemma.

Siar-Lak language

Siar, also known as Lak, Lamassa, or Likkilikki, is an Austronesian language spoken in New Ireland Province in the southern island point of Papua New Guinea. Lak is in the Patpatar-Tolai sub-group, which then falls under the New Ireland-Tolai group in the Western Oceanic language, a sub-group within the Austronesian family. The Siar people keep themselves sustained and nourished by fishing and gardening. The native people call their language ep warfare anon dat, which means "our language".

Sonority hierarchy

A sonority hierarchy or sonority scale is a ranking of speech sounds (or phones) by amplitude. For example, pronouncing the vowel [a] will produce a much louder sound than the stop [t]. Sonority hierarchies are especially important when analyzing syllable structure; rules about which segments may appear together in onsets or codas, such as the SSP, are formulated in terms of the difference in their sonority values. Some languages also have assimilation rules based on the sonority hierarchy, for example the Finnish potential mood, in which a less sonorous segment changes to copy a more sonorous adjacent segment (e.g. -tne- → -nne-).


Syllabification or syllabication is the separation of a word into syllables, whether spoken or written.

Telefol language

Telefol is a language spoken by the Telefol people in Papua New Guinea, notable for possessing a base-27 numeral system.

Tifal language

Tifal is an Ok language spoken in Papua New Guinea.

Tobati language

Tobati, or Yotafa, is an Austronesian language spoken in Jayapura Bay in Papua province, Indonesia. It was once thought to be a Papuan language. Notably, Tobati displays a very rare object-subject-verb word order.

Vitu language

Vitu (also spelled Witu or Vittu) or Muduapa is an Oceanic language spoken by about 7,000 people on the islands northwest of the coast of West New Britain in Papua New Guinea.

Wandamen language

Wamesa is an Austronesian language of Indonesian New Guinea, spoken across the neck of the Doberai Peninsula or Bird's Head. The language is often called Wandamen in the literature; however, several speakers of the Windesi dialect have stated that 'Wandamen' and 'Wondama' refer to a dialect spoken around the Wondama Bay, studied by early missionaries and linguists from SIL. They affirm that the language as a whole is called 'Wamesa', the dialects of which are Windesi, Bintuni, and Wandamen. While Wamesa is spoken in West Papua, it is not a Papuan language but rather a South Halmahera–West New Guinea (SHWNG) language.

Wamesa is one of the approximately 750 languages of Indonesia. There are currently 5,000–8,000 speakers of Wamesa. While it was historically used as a lingua franca, it is currently considered to be an under-documented, endangered language: fewer and fewer children have an active command of Wamesa. Instead, Papuan Malay has become increasingly dominant in the area.

Wogamusin language

Wogamusin is a Papuan language found in four villages in the Ambunti District of East Sepik Province, Papua New Guinea. It was spoken by about 700 people in 1998.

This page is based on Wikipedia articles written by their respective contributors.
Text is available under the CC BY-SA 3.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.