Computational statistics, or statistical computing, is the interface between statistics and computer science. It is the area of computational science (or scientific computing) specific to the mathematical science of statistics. This area is also developing rapidly, leading to calls that a broader concept of computing should be taught as part of general statistical education.
As in traditional statistics the goal is to transform raw data into knowledge, but the focus lies on computer intensive statistical methods, such as cases with very large sample size and non-homogeneous data sets.
The terms 'computational statistics' and 'statistical computing' are often used interchangeably, although Carlo Lauro (a former president of the International Association for Statistical Computing) proposed making a distinction, defining 'statistical computing' as "the application of computer science to statistics", and 'computational statistics' as "aiming at the design of algorithm for implementing statistical methods on computers, including the ones unthinkable before the computer age (e.g. bootstrap, simulation), as well as to cope with analytically intractable problems" [sic].
The term 'Computational statistics' may also be used to refer to computationally intensive statistical methods including resampling methods, Markov chain Monte Carlo methods, local regression, kernel density estimation, artificial neural networks and generalized additive models.
Bootstrap aggregating, also called bagging, is a machine learning ensemble meta-algorithm designed to improve the stability and accuracy of machine learning algorithms used in statistical classification and regression. It also reduces variance and helps to avoid overfitting. Although it is usually applied to decision tree methods, it can be used with any type of method. Bagging is a special case of the model averaging approach.Jackknife resampling
In statistics, the jackknife is a resampling technique especially useful for variance and bias estimation. The jackknife pre-dates other common resampling methods such as the bootstrap. The jackknife estimator of a parameter is found by systematically leaving out each observation from a dataset and calculating the estimate and then finding the average of these calculations. Given a sample of size , the jackknife estimate is found by aggregating the estimates of each -sized sub-sample.
The jackknife technique was developed by Maurice Quenouille (1924-1973) from 1949, and refined in 1956. John Tukey expanded on the technique in 1958 and proposed the name "jackknife" since, like a physical jack-knife (a compact folding knife), it is a rough-and-ready tool that can improvise a solution for a variety of problems even though specific problems may be more efficiently solved with a purpose-designed tool.
The jackknife is a linear approximation of the bootstrap.Journal of Statistical Computation and Simulation
Journal of Statistical Computation and Simulation is a peer-reviewed scientific journal that publishes papers related to computational statistics. It is published by Taylor & Francis in English.
The journal started publishing in 1972. It publishes 12 issues each year.List of statistics journals
This is a list of scientific journals published in the field of statistics.Markov chain Monte Carlo
In statistics, Markov chain Monte Carlo (MCMC) methods comprise a class of algorithms for sampling from a probability distribution. By constructing a Markov chain that has the desired distribution as its equilibrium distribution, one can obtain a sample of the desired distribution by observing the chain after a number of steps. The more steps there are, the more closely the distribution of the sample matches the actual desired distribution.Outline of statistics
Statistics is a field of inquiry that studies the collection, analysis, interpretation, and presentation of data. It is applicable to a wide variety of academic disciplines, from the physical and social sciences to the humanities; it is also used and misused for making informed decisions in all areas of business and government.Plug-in principle
In statistics, the plug-in principle is the method of estimation of functionals of a population distribution by evaluating the same functionals at the empirical distribution based on a sample.
For example, when estimating the population mean, this method uses the sample mean; to estimate the population median, it uses the sample median; to estimate the population regression line, it uses the sample regression line.
It is called a principle because it is too simple to be otherwise, it is just a guideline, not a theorem.Stan (software)
Stan is a probabilistic programming language for statistical inference written in C++. The Stan language is used to specify a (Bayesian) statistical model with an imperative program calculating the log probability density function.Stan is licensed under the New BSD License. Stan is named in honour of Stanislaw Ulam, pioneer of the Monte Carlo method.Stan was created by Andrew Gelman and Bob Carpenter, with a development team consisting of 34 members.Technometrics
Technometrics is a journal of statistics for the physical, chemical, and engineering sciences, published quarterly since 1959 by the American Society for Quality and the American Statistical Association.