In probability and statistics, a **multivariate random variable** or **random vector** is a list of mathematical variables each of whose values is unknown, either because the value has not yet occurred or because there is imperfect knowledge of its value. The individual variables in a random vector are grouped together because they are all part of a single mathematical system; often they represent different properties of an individual statistical unit. For example, while a given person has a specific age, height and weight, the representation of these features of *an unspecified person* from within a group would be a random vector. Normally each element of a random vector is a real number.

Random vectors are often used as the underlying implementation of various types of aggregate random variables, such as a random matrix, random tree, random sequence, or stochastic process.

More formally, a multivariate random variable is a column vector $\mathbf{X} = (X_1, \dots, X_n)^{T}$ (or its transpose, which is a row vector) whose components are scalar-valued random variables on the same probability space as each other, $(\Omega, \mathcal{F}, P)$, where $\Omega$ is the sample space, $\mathcal{F}$ is the sigma-algebra (the collection of all events), and $P$ is the probability measure (a function returning each event's probability).

Every random vector gives rise to a probability measure on $\mathbb{R}^n$ with the Borel algebra as the underlying sigma-algebra. This measure is also known as the joint probability distribution, the joint distribution, or the multivariate distribution of the random vector.

The distributions of each of the component random variables $X_i$ are called marginal distributions. The conditional probability distribution of $X_i$ given $X_j$ is the probability distribution of $X_i$ when $X_j$ is known to be a particular value.
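As a minimal sketch of marginal and conditional distributions, the following computes both from a small joint probability table. The two-component discrete vector and its probabilities are hypothetical values chosen for illustration, and the helper names `marginal` and `conditional` are not from any library.

```python
from collections import defaultdict

# Hypothetical joint pmf of a discrete random vector (X1, X2); values are illustrative.
joint = {
    (0, 0): 0.10, (0, 1): 0.30,
    (1, 0): 0.20, (1, 1): 0.40,
}

def marginal(joint_pmf, index):
    """Sum the joint pmf over every component except `index`."""
    out = defaultdict(float)
    for outcome, prob in joint_pmf.items():
        out[outcome[index]] += prob
    return dict(out)

def conditional(joint_pmf, index, given_index, given_value):
    """pmf of component `index` given that component `given_index` equals `given_value`."""
    norm = marginal(joint_pmf, given_index)[given_value]
    out = defaultdict(float)
    for outcome, prob in joint_pmf.items():
        if outcome[given_index] == given_value:
            out[outcome[index]] += prob / norm
    return dict(out)

p_x1 = marginal(joint, 0)                       # marginal distribution of X1
p_x1_given_x2 = conditional(joint, 0, 1, 1)     # distribution of X1 given X2 = 1
```

Here the marginal probability of X1 = 0 is 0.10 + 0.30, while conditioning on X2 = 1 rescales the second column of the table by its total probability 0.70.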

Random vectors can be subjected to the same kinds of algebraic operations as can non-random vectors: addition, subtraction, multiplication by a scalar, and the taking of inner products.

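Similarly, a new random vector $\mathbf{Y}$ can be defined by applying an affine transformation to a random vector $\mathbf{X}$:

- $\mathbf{Y} = \mathbf{A}\mathbf{X} + \mathbf{b}$, where $\mathbf{A}$ is an $n \times n$ matrix and $\mathbf{b}$ is an $n \times 1$ column vector.

If $\mathbf{A}$ is an invertible matrix and $\mathbf{X}$ has a probability density function $f_{\mathbf{X}}$, then the probability density of $\mathbf{Y}$ is

- $f_{\mathbf{Y}}(y) = \dfrac{f_{\mathbf{X}}\left(\mathbf{A}^{-1}(y - \mathbf{b})\right)}{|\det \mathbf{A}|}$.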
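The density formula for an invertible affine transformation can be sketched in a few lines. This example assumes a two-dimensional vector whose components are independent standard normal variables; the matrix, the shift and the function names are illustrative choices, not part of the source.

```python
import math

def std_normal_pdf_2d(x):
    """Density of two independent standard normal components (the assumed f_X)."""
    return math.exp(-0.5 * (x[0] ** 2 + x[1] ** 2)) / (2.0 * math.pi)

def affine_pdf(f_x, A, b, y):
    """Density of Y = A X + b at the point y, for an invertible 2x2 matrix A."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    if det == 0:
        raise ValueError("A must be invertible")
    # Solve A x = y - b using the explicit 2x2 inverse.
    d0, d1 = y[0] - b[0], y[1] - b[1]
    x = ((A[1][1] * d0 - A[0][1] * d1) / det,
         (-A[1][0] * d0 + A[0][0] * d1) / det)
    return f_x(x) / abs(det)

A = [[2.0, 1.0], [0.0, 3.0]]   # illustrative invertible matrix, det = 6
b = [1.0, -1.0]                # illustrative shift
density_at_b = affine_pdf(std_normal_pdf_2d, A, b, b)
density_elsewhere = affine_pdf(std_normal_pdf_2d, A, b, [3.0, 2.0])
```

At the point y = b the preimage is the origin, so the result is the peak of the standard normal density divided by |det A|.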

More generally we can study invertible mappings of random vectors.^{[1]}^{:p.290–291}

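Let $g$ be a one-to-one mapping from an open subset $\mathcal{D}$ of $\mathbb{R}^n$ onto a subset $\mathcal{R}$ of $\mathbb{R}^n$, let $g$ have continuous partial derivatives in $\mathcal{D}$ and let the Jacobian determinant $\det\left(\frac{\partial \mathbf{y}}{\partial \mathbf{x}}\right)$ of $g$ be zero at no point of $\mathcal{D}$. Assume that the real random vector $\mathbf{X}$ has a probability density function $f_{\mathbf{X}}(\mathbf{x})$ and satisfies $P(\mathbf{X} \in \mathcal{D}) = 1$. Then the random vector $\mathbf{Y} = g(\mathbf{X})$ is of probability density

$$f_{\mathbf{Y}}(\mathbf{y}) = \left. \frac{f_{\mathbf{X}}(\mathbf{x})}{\left|\det\left(\frac{\partial \mathbf{y}}{\partial \mathbf{x}}\right)\right|} \right|_{\mathbf{x} = g^{-1}(\mathbf{y})} \mathbf{1}(\mathbf{y} \in R_{\mathbf{Y}}),$$

where $\mathbf{1}$ denotes the indicator function and the set $R_{\mathbf{Y}} = \{\mathbf{y} = g(\mathbf{x}) : f_{\mathbf{X}}(\mathbf{x}) > 0\} \subseteq \mathcal{R}$ denotes the support of $\mathbf{Y}$.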
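A small worked case may help make the change-of-variables formula concrete. The sketch below assumes a standard normal vector in two dimensions and the componentwise exponential map g(x) = (exp(x1), exp(x2)), which is one-to-one from the plane onto the open positive quadrant with Jacobian determinant exp(x1)·exp(x2). Both the map and the function names are illustrative assumptions.

```python
import math

def std_normal_pdf_2d(x1, x2):
    """Density of two independent standard normal components (the assumed f_X)."""
    return math.exp(-0.5 * (x1 * x1 + x2 * x2)) / (2.0 * math.pi)

def pdf_of_exp_transform(y1, y2):
    """Density of Y = g(X) = (exp(X1), exp(X2)) for standard normal X."""
    # Indicator term: the support of Y is the open positive quadrant.
    if y1 <= 0.0 or y2 <= 0.0:
        return 0.0
    x1, x2 = math.log(y1), math.log(y2)           # x = g^{-1}(y)
    jacobian_det = math.exp(x1) * math.exp(x2)    # det of dy/dx at x, equal to y1 * y2
    return std_normal_pdf_2d(x1, x2) / abs(jacobian_det)

inside = pdf_of_exp_transform(2.0, 3.0)
outside = pdf_of_exp_transform(-1.0, 2.0)   # outside the support, so the density is 0
```

Because the components are independent, the result factors into a product of two one-dimensional log-normal densities, which gives an easy cross-check.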

The expected value or mean of a random vector is a fixed vector whose elements are the expected values of the respective random variables.

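The covariance matrix (also called the variance-covariance matrix) of an $n \times 1$ random vector $\mathbf{X}$ is an $n \times n$ matrix whose (*i,j*)^{th} element is the covariance between the *i*^{th} and the *j*^{th} random variables. The covariance matrix is the expected value, element by element, of the $n \times n$ matrix computed as $[\mathbf{X} - \operatorname{E}[\mathbf{X}]][\mathbf{X} - \operatorname{E}[\mathbf{X}]]^{T}$, where the superscript T refers to the transpose of the indicated vector:

$$\operatorname{Var}[\mathbf{X}] = \operatorname{E}\left[(\mathbf{X} - \operatorname{E}[\mathbf{X}])(\mathbf{X} - \operatorname{E}[\mathbf{X}])^{T}\right].$$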

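By extension, the cross-covariance matrix between two random vectors $\mathbf{X}$ and $\mathbf{Y}$ ($\mathbf{X}$ having $n$ elements and $\mathbf{Y}$ having $p$ elements) is the $n \times p$ matrix

$$\operatorname{Cov}[\mathbf{X}, \mathbf{Y}] = \operatorname{E}\left[(\mathbf{X} - \operatorname{E}[\mathbf{X}])(\mathbf{Y} - \operatorname{E}[\mathbf{Y}])^{T}\right],$$

where again the matrix expectation is taken element-by-element in the matrix. Here the (*i,j*)^{th} element is the covariance between the *i*^{th} element of $\mathbf{X}$ and the *j*^{th} element of $\mathbf{Y}$. The cross-covariance matrix $\operatorname{Cov}[\mathbf{Y}, \mathbf{X}]$ is simply the transpose of the matrix $\operatorname{Cov}[\mathbf{X}, \mathbf{Y}]$.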
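The mean vector, covariance matrix and cross-covariance matrix can be computed exactly for a small discrete distribution. The following is a sketch under assumed data: a vector X with two components and a vector Y with one component, with four equally likely joint outcomes. The helper names `mean` and `cross_cov` are hypothetical.

```python
# Hypothetical discrete distribution: (outcome of X, outcome of Y, probability).
# X has 2 components and Y has 1, so the cross-covariance matrix of X and Y is 2 x 1.
outcomes = [
    ((0.0, 0.0), (1.0,), 0.25),
    ((1.0, 0.0), (2.0,), 0.25),
    ((0.0, 2.0), (3.0,), 0.25),
    ((1.0, 2.0), (4.0,), 0.25),
]

def mean(index):
    """Expected value of X (index 0) or Y (index 1), taken element by element."""
    dim = len(outcomes[0][index])
    return [sum(o[2] * o[index][i] for o in outcomes) for i in range(dim)]

def cross_cov(a, b):
    """E[(U - E[U])(V - E[V])^T], where U and V are X or Y, selected by index a and b."""
    mu_a, mu_b = mean(a), mean(b)
    return [[sum(o[2] * (o[a][i] - mu_a[i]) * (o[b][j] - mu_b[j]) for o in outcomes)
             for j in range(len(mu_b))]
            for i in range(len(mu_a))]

mean_x = mean(0)
cov_xx = cross_cov(0, 0)   # covariance matrix of X (2 x 2)
cov_xy = cross_cov(0, 1)   # cross-covariance of X and Y (2 x 1)
cov_yx = cross_cov(1, 0)   # cross-covariance of Y and X (1 x 2)
```

In this example the two components of X are independent, so the off-diagonal entries of `cov_xx` vanish, and `cov_yx` holds the same numbers as `cov_xy` laid out as a row instead of a column.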

One can take the expectation of a quadratic form $X^{T}AX$ in the random vector *X*, with *A* a non-random matrix, as follows:^{[2]}^{:p.170–171}

$$\operatorname{E}[X^{T}AX] = \operatorname{E}[X]^{T} A \operatorname{E}[X] + \operatorname{tr}(AC),$$

where *C* is the covariance matrix of *X* and tr refers to the trace of a matrix, that is, the sum of the elements on its main diagonal (from upper left to lower right). Since the quadratic form is a scalar, so is its expectation.
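The identity E[XᵀAX] = E[X]ᵀ A E[X] + tr(AC) can be checked exactly on a small discrete distribution. This is a sketch with assumed inputs: the four outcomes, their probabilities and the matrix `A` are illustrative values.

```python
# Hypothetical discrete distribution of a 2-component random vector X: (outcome, probability).
outcomes = [((0.0, 0.0), 0.25), ((1.0, 0.0), 0.25), ((0.0, 2.0), 0.25), ((1.0, 2.0), 0.25)]
A = [[2.0, 1.0], [0.0, 3.0]]   # illustrative non-random matrix
n = 2

def quad(x):
    """The scalar x^T A x."""
    return sum(x[i] * A[i][j] * x[j] for i in range(n) for j in range(n))

# Mean vector and covariance matrix C of X, computed from their definitions.
mu = [sum(p * x[i] for x, p in outcomes) for i in range(n)]
C = [[sum(p * (x[i] - mu[i]) * (x[j] - mu[j]) for x, p in outcomes) for j in range(n)]
     for i in range(n)]

direct = sum(p * quad(x) for x, p in outcomes)      # E[X^T A X] from the definition
trace_AC = sum(A[i][j] * C[j][i] for i in range(n) for j in range(n))
via_formula = quad(mu) + trace_AC                   # E[X]^T A E[X] + tr(AC)
```

Both routes give the same number here; note that `A` need not be symmetric for this identity to hold.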

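**Proof**: Let $\mathbf{z}$ be an $m \times 1$ random vector with $\operatorname{E}[\mathbf{z}] = \mu$ and $\operatorname{Cov}[\mathbf{z}] = V$, and let $A$ be an $m \times m$ non-stochastic matrix.

Then based on the formula for the covariance, if we denote $\mathbf{z}' = \mathbf{X}$ and $\mathbf{z}'A' = \mathbf{Y}$ (where henceforth a prime sign denotes a transpose), we see that:

$$\operatorname{Cov}[\mathbf{X}, \mathbf{Y}] = \operatorname{E}[\mathbf{X}\mathbf{Y}'] - \operatorname{E}[\mathbf{X}]\operatorname{E}[\mathbf{Y}]'.$$

Hence

$$\operatorname{E}[\mathbf{z}'A\mathbf{z}] = \operatorname{Cov}[\mathbf{z}', \mathbf{z}'A'] + \operatorname{E}[\mathbf{z}']\left(\operatorname{E}[\mathbf{z}'A']\right)' = \operatorname{Cov}[\mathbf{z}', \mathbf{z}'A'] + \mu' A \mu,$$

which leaves us to show that

$$\operatorname{Cov}[\mathbf{z}', \mathbf{z}'A'] = \operatorname{tr}(AV).$$

This is true based on the fact that one can cyclically permute matrices when taking a trace without changing the end result (e.g.: tr(*AB*) = tr(*BA*)).

We see that

$$\operatorname{Cov}[\mathbf{z}', \mathbf{z}'A'] = \operatorname{E}\left[(\mathbf{z}' - \operatorname{E}[\mathbf{z}'])(\mathbf{z}'A' - \operatorname{E}[\mathbf{z}'A'])'\right] = \operatorname{E}\left[(\mathbf{z} - \mu)'(A\mathbf{z} - A\mu)\right].$$

And since

$$(\mathbf{z} - \mu)'(A\mathbf{z} - A\mu)$$

is a scalar, then

$$(\mathbf{z} - \mu)'(A\mathbf{z} - A\mu) = \operatorname{tr}\left((\mathbf{z} - \mu)'(A\mathbf{z} - A\mu)\right) = \operatorname{tr}\left((\mathbf{z} - \mu)'A(\mathbf{z} - \mu)\right)$$

trivially. Using the permutation we get:

$$\operatorname{tr}\left((\mathbf{z} - \mu)'A(\mathbf{z} - \mu)\right) = \operatorname{tr}\left(A(\mathbf{z} - \mu)(\mathbf{z} - \mu)'\right),$$

and by plugging this into the original formula we get:

$$\operatorname{Cov}[\mathbf{z}', \mathbf{z}'A'] = \operatorname{E}\left[\operatorname{tr}\left(A(\mathbf{z} - \mu)(\mathbf{z} - \mu)'\right)\right] = \operatorname{tr}\left(A \operatorname{E}\left[(\mathbf{z} - \mu)(\mathbf{z} - \mu)'\right]\right) = \operatorname{tr}(AV).$$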

One can take the expectation of the product of two different quadratic forms in a zero-mean Gaussian random vector *X* as follows, for symmetric matrices *A* and *B*:^{[2]}^{:pp. 162–176}

$$\operatorname{E}\left[(X^{T}AX)(X^{T}BX)\right] = 2\operatorname{tr}(ACBC) + \operatorname{tr}(AC)\operatorname{tr}(BC),$$

where again *C* is the covariance matrix of *X*. Again, since both quadratic forms are scalars and hence their product is a scalar, the expectation of their product is also a scalar.
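One way to sanity-check the trace formula 2 tr(ACBC) + tr(AC) tr(BC) without simulation is to expand the product of the two quadratic forms into fourth moments and evaluate those with Isserlis' theorem for zero-mean Gaussian vectors. The sketch below does this for assumed 2×2 inputs; the matrices `A`, `B` and `C` are illustrative, with `A` and `B` chosen symmetric.

```python
from itertools import product

C = [[2.0, 0.5], [0.5, 1.0]]   # illustrative covariance matrix (symmetric, positive definite)
A = [[1.0, 2.0], [2.0, 0.0]]   # illustrative symmetric matrices
B = [[0.0, 1.0], [1.0, 3.0]]
n = 2

def matmul(P, Q):
    """Product of two n x n matrices stored as nested lists."""
    return [[sum(P[i][k] * Q[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def trace(P):
    return sum(P[i][i] for i in range(n))

def fourth_moment(i, j, k, l):
    """E[X_i X_j X_k X_l] for a zero-mean Gaussian vector (Isserlis' theorem)."""
    return C[i][j] * C[k][l] + C[i][k] * C[j][l] + C[i][l] * C[j][k]

# E[(X^T A X)(X^T B X)] expanded as a sum over all index combinations.
direct = sum(A[i][j] * B[k][l] * fourth_moment(i, j, k, l)
             for i, j, k, l in product(range(n), repeat=4))

AC, BC = matmul(A, C), matmul(B, C)
via_traces = 2.0 * trace(matmul(AC, BC)) + trace(AC) * trace(BC)
```

The two values agree for these inputs. This is a consistency check between two closed-form expressions, not an independent simulation of the Gaussian vector.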

In portfolio theory in finance, an objective often is to choose a portfolio of risky assets such that the distribution of the random portfolio return has desirable properties. For example, one might want to choose the portfolio return having the lowest variance for a given expected value. Here the random vector is the vector $\mathbf{r}$ of random returns on the individual assets, and the portfolio return *p* (a random scalar) is the inner product of the vector of random returns with a vector *w* of portfolio weights (the fractions of the portfolio placed in the respective assets). Since $p = w^{T}\mathbf{r}$, the expected value of the portfolio return is $w^{T}\operatorname{E}(\mathbf{r})$ and the variance of the portfolio return can be shown to be $w^{T}Cw$, where *C* is the covariance matrix of $\mathbf{r}$.
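As a minimal sketch of the two portfolio formulas, the expected return wᵀE(r) and the variance wᵀCw can be computed directly from the weights. All numbers below (three assets, their expected returns and covariance matrix) are made-up illustrative inputs.

```python
# Illustrative inputs, not taken from the text: three assets.
w = [0.5, 0.3, 0.2]                 # portfolio weights, summing to 1
expected_r = [0.08, 0.05, 0.03]     # E(r), expected return of each asset
C = [                               # covariance matrix of the asset returns
    [0.040, 0.006, 0.000],
    [0.006, 0.010, 0.002],
    [0.000, 0.002, 0.005],
]

n = len(w)
expected_p = sum(w[i] * expected_r[i] for i in range(n))                     # w^T E(r)
variance_p = sum(w[i] * C[i][j] * w[j] for i in range(n) for j in range(n))  # w^T C w
```

The off-diagonal covariances enter the variance twice (once for each ordering of the pair), which is why diversification across weakly correlated assets lowers the result.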

In linear regression theory, we have data on *n* observations on a dependent variable *y* and *n* observations on each of *k* independent variables $x_j$. The observations on the dependent variable are stacked into a column vector *y*, and the observations on the independent variables are collected into an *n* × *k* matrix *X* (a matrix of data, not a random vector in this context). The regression equation is then postulated as

$$y = X\beta + e,$$

where β is a postulated fixed but unknown vector of *k* response coefficients, and *e* is an unknown random vector reflecting random influences on the dependent variable. By some chosen technique such as ordinary least squares, a vector $\hat{\beta}$ is chosen as an estimate of β, and the estimate of the vector *e*, denoted $\hat{e}$, is computed as

$$\hat{e} = y - X\hat{\beta}.$$

Then the statistician must analyze the properties of $\hat{\beta}$ and $\hat{e}$, which are viewed as random vectors since a randomly different selection of *n* cases to observe would have resulted in different values for them.
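To make the estimate and the residual vector concrete, here is a sketch of ordinary least squares for the linear model y = Xβ + e with k = 2 regressors (a constant and one variable), solved through the normal equations. The data values are invented for illustration, and the explicit 2×2 inverse is a simplification that only works for this small case.

```python
# Illustrative data: n = 4 observations, k = 2 regressors (a constant and one variable).
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
y = [1.0, 3.0, 4.0, 8.0]
n, k = len(X), 2

# Normal equations (X^T X) beta_hat = X^T y, solved with the explicit 2x2 inverse.
xtx = [[sum(X[t][i] * X[t][j] for t in range(n)) for j in range(k)] for i in range(k)]
xty = [sum(X[t][i] * y[t] for t in range(n)) for i in range(k)]
det = xtx[0][0] * xtx[1][1] - xtx[0][1] * xtx[1][0]
beta_hat = [(xtx[1][1] * xty[0] - xtx[0][1] * xty[1]) / det,
            (-xtx[1][0] * xty[0] + xtx[0][0] * xty[1]) / det]

# Residual vector e_hat = y - X beta_hat.
e_hat = [y[t] - sum(X[t][i] * beta_hat[i] for i in range(k)) for t in range(n)]
```

A different sample of four cases would give different values of `beta_hat` and `e_hat`, which is the sense in which both are random vectors. Because the model includes a constant, the residuals sum to zero up to rounding.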

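The evolution of a *k*×1 random vector $\mathbf{X}$ through time can be modelled as a vector autoregression (VAR) as follows:

$$\mathbf{X}_t = c + A_1\mathbf{X}_{t-1} + A_2\mathbf{X}_{t-2} + \cdots + A_p\mathbf{X}_{t-p} + \mathbf{e}_t,$$

where the *i*-periods-back vector observation $\mathbf{X}_{t-i}$ is called the *i*-th lag of $\mathbf{X}$, *c* is a *k* × 1 vector of constants (intercepts), $A_i$ is a time-invariant *k* × *k* matrix and $\mathbf{e}_t$ is a *k* × 1 random vector of error terms.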
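A vector autoregression with a single lag, x_t = c + A x_{t-1} + e_t, can be simulated in a few lines. The sketch below assumes independent Gaussian errors with a common standard deviation; the function name `simulate_var1`, the intercepts and the coefficient matrix are illustrative choices rather than anything specified in the text.

```python
import random

def simulate_var1(c, A, x0, steps, noise_sd=0.0, seed=0):
    """Simulate x_t = c + A x_{t-1} + e_t with independent Gaussian errors (assumed)."""
    rng = random.Random(seed)
    k = len(c)
    path = [list(x0)]
    for _ in range(steps):
        prev = path[-1]
        e = [rng.gauss(0.0, noise_sd) for _ in range(k)]
        path.append([c[i] + sum(A[i][j] * prev[j] for j in range(k)) + e[i]
                     for i in range(k)])
    return path

c = [1.0, 0.5]                      # illustrative intercepts
A = [[0.5, 0.1], [0.0, 0.4]]        # illustrative coefficient matrix, eigenvalues 0.5 and 0.4
noisy_path = simulate_var1(c, A, x0=[0.0, 0.0], steps=50, noise_sd=0.1)
noiseless_path = simulate_var1(c, A, x0=[0.0, 0.0], steps=200)
```

With the noise switched off and all eigenvalues of `A` inside the unit circle, the path settles at the fixed point that solves x = c + A x, which is a convenient check on the recursion.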

This page is based on a Wikipedia article.

Text is available under the CC BY-SA 3.0 license; additional terms may apply.
