Mode (statistics)

The mode of a set of data values is the value that appears most often.[1] For a discrete random variable, it is the value x at which the probability mass function takes its maximum value. In other words, it is the value that is most likely to be sampled.

Like the statistical mean and median, the mode is a way of expressing, in a (usually) single number, important information about a random variable or a population. The numerical value of the mode is the same as that of the mean and median in a normal distribution, and it may be very different in highly skewed distributions.

The mode is not necessarily unique to a given discrete distribution, since the probability mass function may take the same maximum value at several points x1, x2, etc. The most extreme case occurs in uniform distributions, where all values occur equally frequently.

When the probability density function of a continuous distribution has multiple local maxima it is common to refer to all of the local maxima as modes of the distribution. Such a continuous distribution is called multimodal (as opposed to unimodal). A mode of a continuous probability distribution is often considered to be any value x at which its probability density function has a locally maximum value, so any peak is a mode.[2]

In symmetric unimodal distributions, such as the normal distribution, the mean (if defined), median and mode all coincide. For samples, if it is known that they are drawn from a symmetric unimodal distribution, the sample mean can be used as an estimate of the population mode.

Mode of a sample

The mode of a sample is the element that occurs most often in the collection. For example, the mode of the sample [1, 3, 6, 6, 6, 6, 7, 7, 12, 12, 17] is 6. Given the list of data [1, 1, 2, 4, 4] the mode is not unique – the dataset may be said to be bimodal, while a set with more than two modes may be described as multimodal.
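As an illustrative sketch (not part of the original article), the sample mode, including ties, can be computed with Python's standard library:

```python
from collections import Counter

def modes(sample):
    """Return every value that occurs most often in the sample."""
    counts = Counter(sample)
    top = max(counts.values())
    return [value for value, n in counts.items() if n == top]

modes([1, 3, 6, 6, 6, 6, 7, 7, 12, 12, 17])  # the unique mode, 6
modes([1, 1, 2, 4, 4])                       # bimodal: both 1 and 4
```

Returning a list rather than a single value makes the non-uniqueness of the mode explicit.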

For a sample from a continuous distribution, such as [0.935..., 1.211..., 2.430..., 3.668..., 3.874...], the concept is unusable in its raw form: no two values will be exactly the same, so each value occurs precisely once. To estimate the mode of the underlying distribution, the usual practice is to discretize the data by assigning frequency values to intervals of equal distance, as for making a histogram, effectively replacing the values by the midpoints of the intervals they are assigned to. The mode is then the value where the histogram reaches its peak. For small or middle-sized samples the outcome of this procedure is sensitive to the choice of interval width: too narrow or too wide a choice distorts the estimate. Typically one should have a sizable fraction of the data concentrated in a relatively small number of intervals (5 to 10), while the fraction of the data falling outside these intervals is also sizable. An alternative approach is kernel density estimation, which essentially blurs point samples to produce a continuous estimate of the probability density function, which can in turn provide an estimate of the mode.
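A minimal sketch of this binning procedure (illustrative only; the sample data and bin count below are made up):

```python
def histogram_mode(data, bins=5):
    """Estimate the mode of a continuous sample via an equal-width histogram."""
    lo, hi = min(data), max(data)
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in data:
        i = min(int((v - lo) / width), bins - 1)  # clamp the maximum into the last bin
        counts[i] += 1
    fullest = counts.index(max(counts))
    return lo + (fullest + 0.5) * width  # midpoint of the fullest interval

histogram_mode([0.9, 1.0, 1.05, 1.1, 2.4, 3.7, 3.9], bins=3)  # ≈ 1.4
```

The estimate lands at the midpoint of the interval containing the cluster near 1, illustrating both the idea and its sensitivity to the choice of bin width.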

The following MATLAB (or Octave) code example computes the mode of a sample:

X = sort(x(:));                           % sort the sample as a column vector
indices   = find(diff([X; realmax]) > 0); % last index of each run of repeated values
[modeL,i] = max(diff([0; indices]));      % length and position of the longest run
mode      = X(indices(i));                % the mode: last member of the longest run

The algorithm first sorts the sample in ascending order. It then computes the discrete derivative of the sorted list and finds the indices where this derivative is positive, i.e. where a run of repeated values ends. Next it takes the discrete derivative of this set of indices, which yields the run lengths, locates the maximum run length, and finally evaluates the sorted sample at the end of that longest run, which is the last member of the stretch of repeated values.
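The same run-length idea can be sketched in Python (an illustrative translation, not part of the original article):

```python
def run_length_mode(x):
    """Mode via the sorted-runs algorithm: the longest run of equal values wins."""
    X = sorted(x)
    # 1-based positions where a run of equal values ends
    ends = [i + 1 for i in range(len(X)) if i == len(X) - 1 or X[i] != X[i + 1]]
    # differences of consecutive end positions are the run lengths
    lengths = [e - s for s, e in zip([0] + ends[:-1], ends)]
    longest = lengths.index(max(lengths))
    return X[ends[longest] - 1]

run_length_mode([1, 3, 6, 6, 6, 6, 7, 7, 12, 12, 17])  # 6
```

Like the MATLAB version, ties are broken in favour of the first (smallest) of the equally long runs.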

Comparison of mean, median and mode

Geometric visualisation of the mode, median and mean of an arbitrary probability density function.[3]
Comparison of common averages of the values { 1, 2, 2, 3, 4, 7, 9 }

Type             Description                                             Example                          Result
Arithmetic mean  Sum of values of a data set divided by number of values (1 + 2 + 2 + 3 + 4 + 7 + 9) / 7  4
Median           Middle value separating the greater and lesser halves   1, 2, 2, 3, 4, 7, 9              3
Mode             Most frequent value in a data set                       1, 2, 2, 3, 4, 7, 9              2
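The table's three results can be reproduced with Python's standard statistics module (an illustrative check, not from the original article):

```python
import statistics

data = [1, 2, 2, 3, 4, 7, 9]
statistics.mean(data)    # arithmetic mean: 4
statistics.median(data)  # middle value: 3
statistics.mode(data)    # most frequent value: 2
```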

Use

Unlike mean and median, the concept of mode also makes sense for "nominal data" (i.e., not consisting of numerical values in the case of mean, or even of ordered values in the case of median). For example, taking a sample of Korean family names, one might find that "Kim" occurs more often than any other name. Then "Kim" would be the mode of the sample. In any voting system where a plurality determines victory, a single modal value determines the victor, while a multi-modal outcome would require some tie-breaking procedure to take place.

Unlike median, the concept of mean makes sense for any random variable assuming values from a vector space, including the real numbers (a one-dimensional vector space) and the integers (which can be considered embedded in the reals). For example, a distribution of points in the plane will typically have a mean and a mode, but the concept of median does not apply: the median makes sense only when there is a linear order on the possible values. Generalizations of the concept of median to higher-dimensional spaces are the geometric median and the centerpoint.

Uniqueness and definedness

For some probability distributions, the expected value may be infinite or undefined, but if defined, it is unique. The mean of a (finite) sample is always defined. The median is the value such that the fractions not exceeding it and not falling below it are each at least 1/2. It is not necessarily unique, but never infinite or totally undefined. For a data sample it is the "halfway" value when the list of values is ordered in increasing value, where usually for a list of even length the numerical average is taken of the two values closest to "halfway". Finally, as said before, the mode is not necessarily unique. Certain pathological distributions (for example, the Cantor distribution) have no defined mode at all. For a finite data sample, the mode is one (or more) of the values in the sample.

Properties

Assuming definedness, and for simplicity uniqueness, the following are some of the most interesting properties.

  • All three measures have the following property: if the random variable (or each value from the sample) is subjected to the linear or affine transformation that replaces X by aX + b, then the mean, median and mode are transformed in the same way.
  • Except for extremely small samples, the mode is insensitive to "outliers" (such as occasional, rare, false experimental readings). The median is also very robust in the presence of outliers, while the mean is rather sensitive.
  • In continuous unimodal distributions the median often lies between the mean and the mode, about one third of the way going from mean to mode. In a formula, median ≈ (2 × mean + mode)/3. This rule, due to Karl Pearson, often applies to slightly non-symmetric distributions that resemble a normal distribution, but it is not always true and in general the three statistics can appear in any order.[4][5]
  • For unimodal distributions, the mode is within √3 standard deviations of the mean, and the root mean square deviation about the mode is between the standard deviation and twice the standard deviation.[6]

Example for a skewed distribution

An example of a skewed distribution is personal wealth: few people are very rich, but among those some are extremely rich, while many people are rather poor.

Comparison of mean, median and mode of two log-normal distributions with different skewness.

A well-known class of distributions that can be arbitrarily skewed is given by the log-normal distribution. It is obtained by transforming a random variable X having a normal distribution into random variable Y = eX. Then the logarithm of random variable Y is normally distributed, hence the name.

Taking the mean μ of X to be 0, the median of Y will be 1, independent of the standard deviation σ of X. This is so because X has a symmetric distribution, so its median is also 0. The transformation from X to Y is monotonic, and so we find the median e0 = 1 for Y.

When X has standard deviation σ = 0.25, the distribution of Y is weakly skewed. Using formulas for the log-normal distribution, we find:

mean = exp(μ + σ²/2) = exp(0.03125) ≈ 1.032
median = exp(μ) = 1
mode = exp(μ − σ²) = exp(−0.0625) ≈ 0.939

Indeed, the median is about one third of the way from mean to mode.

When X has a larger standard deviation, σ = 1, the distribution of Y is strongly skewed. Now

mean = exp(0.5) ≈ 1.649
median = 1
mode = exp(−1) ≈ 0.368

Here, Pearson's rule of thumb fails: (2 × mean + mode)/3 ≈ 1.22, far from the median of 1.
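The behaviour of Pearson's rule for the two log-normal cases can be checked in a few lines of Python, using the standard log-normal formulas (an illustrative verification, not part of the original article):

```python
import math

def lognormal_stats(sigma, mu=0.0):
    """Mean, median and mode of Y = exp(X) for X ~ Normal(mu, sigma)."""
    mean = math.exp(mu + sigma ** 2 / 2)
    median = math.exp(mu)
    mode = math.exp(mu - sigma ** 2)
    return mean, median, mode

# Weak skew: Pearson's estimate (2*mean + mode)/3 is close to the true median 1
mean, median, mode = lognormal_stats(0.25)   # ≈ 1.032, 1, 0.939
pearson_weak = (2 * mean + mode) / 3         # ≈ 1.001

# Strong skew: the rule of thumb fails
mean, median, mode = lognormal_stats(1.0)    # ≈ 1.649, 1, 0.368
pearson_strong = (2 * mean + mode) / 3       # ≈ 1.222
```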

Van Zwet condition

Van Zwet derived a sufficient condition for this ordering of mode, median and mean to hold.[7] The inequality

Mode ≤ Median ≤ Mean

holds if

F( Median - x ) + F( Median + x ) ≥ 1

for all x where F() is the cumulative distribution function of the distribution.

Unimodal distributions

It can be shown for a unimodal distribution that the median ν and the mean μ lie within (3/5)^(1/2) ≈ 0.7746 standard deviations of each other.[8] In symbols,

|ν − μ| / σ ≤ (3/5)^(1/2),

where |·| is the absolute value and σ is the standard deviation.

A similar relation holds between the median and the mode θ: they lie within 3^(1/2) ≈ 1.732 standard deviations of each other:

|ν − θ| / σ ≤ 3^(1/2)
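As an illustrative check (not from the original article), the exponential distribution with rate 1 is unimodal, with mean 1, median ln 2, mode 0 and standard deviation 1, and it satisfies both bounds comfortably:

```python
import math

# Exponential distribution with rate 1: mean 1, median ln 2, mode 0, std dev 1
mean, median, mode, sd = 1.0, math.log(2), 0.0, 1.0

assert abs(median - mean) / sd <= math.sqrt(3 / 5)  # 0.307 <= 0.775
assert abs(median - mode) / sd <= math.sqrt(3)      # 0.693 <= 1.732
```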

History

The term mode originates with Karl Pearson in 1895.[9]

References

  1. ^ Gujarati, Damodar N. (2006). Essentials of Econometrics (3rd ed.). McGraw-Hill Irwin. p. 110.
  2. ^ Zhang, C; Mapes, BE; Soden, BJ (2003). "Bimodality in tropical water vapour". Q. J. R. Meteorol. Soc. 129: 2847–2866. doi:10.1256/qj.02.16.
  3. ^ "AP Statistics Review - Density Curves and the Normal Distributions". Retrieved 16 March 2015.
  4. ^ "Relationship between the mean, median, mode, and standard deviation in a unimodal distribution".
  5. ^ Hippel, Paul T. von (2005). "Mean, Median, and Skew: Correcting a Textbook Rule". Journal of Statistics Education. 13 (2). doi:10.1080/10691898.2005.11910556.
  6. ^ Bottomley, H. (2004). "Maximum distance between the mode and the mean of a unimodal distribution" (PDF). Unpublished preprint.
  7. ^ van Zwet, WR (1979). "Mean, median, mode II". Statistica Neerlandica. 33 (1): 1–5. doi:10.1111/j.1467-9574.1979.tb00657.x.
  8. ^ Basu, Sanjib; DasGupta, Anirban (1997). "The mean, median, and mode of unimodal distributions: a characterization". Theory of Probability & Its Applications. 41 (2): 210–223.
  9. ^ Pearson, Karl (1895). "Contributions to the Mathematical Theory of Evolution. II. Skew Variation in Homogeneous Material". Philosophical Transactions of the Royal Society of London A. 186: 343–414. doi:10.1098/rsta.1895.0010.

Arg max

In mathematics, the arguments of the maxima (abbreviated arg max or argmax) are the points of the domain of some function at which the function values are maximized. In contrast to global maxima, referring to the largest outputs of a function, arg max refers to the inputs, or arguments, at which the function outputs are as large as possible.
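As an illustrative sketch (not part of the original article), the arg max of a function over a finite domain can be computed by collecting every input that achieves the maximal output:

```python
def arg_max(domain, f):
    """All points of the domain at which f attains its maximum value."""
    best = max(f(x) for x in domain)
    return [x for x in domain if f(x) == best]

arg_max(range(-3, 4), lambda x: -x * x)  # [0]: the parabola peaks at x = 0
```

Returning a list reflects the fact that, like the mode, the arg max need not be unique.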

Kinetic theory of gases

The kinetic theory of gases describes a gas as a large number of submicroscopic particles (atoms or molecules), all of which are in constant, rapid, random motion. The randomness arises from the particles' many collisions with each other and with the walls of the container.

Kinetic theory of gases explains the macroscopic properties of gases, such as pressure, temperature, viscosity, thermal conductivity, and volume, by considering their molecular composition and motion. The theory posits that gas pressure results from particles' collisions with the walls of a container at different velocities.

Kinetic molecular theory defines temperature in its own way, in contrast with the thermodynamic definition.

Under an optical microscope, the molecules making up a liquid are too small to be visible. However, the jittery motion of pollen grains or dust particles in liquid is visible. Known as Brownian motion, the motion of the pollen or dust results from their collisions with the liquid's molecules.

Mean

There are several kinds of means in various branches of mathematics (especially statistics).

For a data set, the arithmetic mean, also called the mathematical expectation or average, is the central value of a discrete set of numbers: specifically, the sum of the values divided by the number of values. The arithmetic mean of a set of numbers x1, x2, ..., xn is typically denoted by x̄, pronounced "x bar". If the data set were based on a series of observations obtained by sampling from a statistical population, the arithmetic mean is the sample mean (denoted x̄) to distinguish it from the mean of the underlying distribution, the population mean (denoted μ or μx, pronounced "mew" /ˈmjuː/).

In probability and statistics, the population mean, or expected value, is a measure of the central tendency either of a probability distribution or of the random variable characterized by that distribution. In the case of a discrete probability distribution of a random variable X, the mean is equal to the sum over every possible value weighted by the probability of that value; that is, it is computed by taking the product of each possible value x of X and its probability p(x), and then adding all these products together, giving μ = ∑ x·p(x). An analogous formula applies to the case of a continuous probability distribution. Not every probability distribution has a defined mean; see the Cauchy distribution for an example. Moreover, for some distributions the mean is infinite.

For a finite population, the population mean of a property is equal to the arithmetic mean of the given property while considering every member of the population. For example, the population mean height is equal to the sum of the heights of every individual divided by the total number of individuals. The sample mean may differ from the population mean, especially for small samples. The law of large numbers dictates that the larger the size of the sample, the more likely it is that the sample mean will be close to the population mean.

Outside probability and statistics, a wide range of other notions of "mean" are often used in geometry and analysis; examples are given below.

Modus

Modus may refer to:

Modus, the Latin name for grammatical mood, in linguistics

Modus, the Latin name for mode (statistics)

Modus (company), an Alberta-based company

Modus (medieval music), a term used in several different technical meanings in medieval music theory

The Renault Modus, a small car

Modus (band), a pop music band in former Czechoslovakia

Modus (album), 1979

Modus FX, a visual effects company based in Sainte-Thérèse, Quebec, Canada

Modus (TV series), a Swedish television series, 2015

Nuclear fission

In nuclear physics and nuclear chemistry, nuclear fission is a nuclear reaction or a radioactive decay process in which the nucleus of an atom splits into smaller, lighter nuclei. The fission process often produces free neutrons and gamma photons, and releases a very large amount of energy even by the energetic standards of radioactive decay.

Nuclear fission of heavy elements was discovered on December 17, 1938 by German Otto Hahn and his assistant Fritz Strassmann, and explained theoretically in January 1939 by Lise Meitner and her nephew Otto Robert Frisch. Frisch named the process by analogy with biological fission of living cells. For heavy nuclides, it is an exothermic reaction which can release large amounts of energy both as electromagnetic radiation and as kinetic energy of the fragments (heating the bulk material where fission takes place). In order for fission to produce energy, the total binding energy of the resulting elements must be more negative (greater binding energy) than that of the starting element.

Fission is a form of nuclear transmutation because the resulting fragments are not the same element as the original atom. The two nuclei produced are most often of comparable but slightly different sizes, typically with a mass ratio of products of about 3 to 2, for common fissile isotopes. Most fissions are binary fissions (producing two charged fragments), but occasionally (2 to 4 times per 1000 events), three positively charged fragments are produced, in a ternary fission. The smallest of these fragments in ternary processes ranges in size from a proton to an argon nucleus.

Apart from fission induced by a neutron, harnessed and exploited by humans, a natural form of spontaneous radioactive decay (not requiring a neutron) is also referred to as fission, and occurs especially in very high-mass-number isotopes. Spontaneous fission was discovered in 1940 by Flyorov, Petrzhak and Kurchatov in Moscow, when they decided to confirm that, without bombardment by neutrons, the fission rate of uranium was indeed negligible, as predicted by Niels Bohr; it was not.

The unpredictable composition of the products (which vary in a broad probabilistic and somewhat chaotic manner) distinguishes fission from purely quantum-tunneling processes such as proton emission, alpha decay, and cluster decay, which give the same products each time. Nuclear fission produces energy for nuclear power and drives the explosion of nuclear weapons. Both uses are possible because certain substances called nuclear fuels undergo fission when struck by fission neutrons, and in turn emit neutrons when they break apart. This makes a self-sustaining nuclear chain reaction possible, releasing energy at a controlled rate in a nuclear reactor or at a very rapid, uncontrolled rate in a nuclear weapon.

The amount of free energy contained in nuclear fuel is millions of times the amount of free energy contained in a similar mass of chemical fuel such as gasoline, making nuclear fission a very dense source of energy. The products of nuclear fission, however, are on average far more radioactive than the heavy elements which are normally fissioned as fuel, and remain so for significant amounts of time, giving rise to a nuclear waste problem. Concerns over nuclear waste accumulation and over the destructive potential of nuclear weapons are a counterbalance to the peaceful desire to use fission as an energy source.

Random laser

A random laser (RL) is a laser in which optical feedback is provided by scattering particles. As in conventional lasers, a gain medium is required for optical amplification. However, unlike Fabry–Pérot cavities and distributed-feedback lasers, RLs use neither reflective surfaces nor distributed periodic structures; light is confined in an active region by diffusive elements that may or may not be spatially distributed inside the gain medium.

Random lasing has been reported from a large variety of materials, e.g. colloidal solutions of dye and scattering particles, semiconductor powders, optical fibers and polymers. Due to their output emission with low spatial coherence and laser-like energy conversion efficiency, RLs are attractive devices for energy-efficient illumination applications.

This page is based on Wikipedia articles written by contributors.
Text is available under the CC BY-SA 3.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.