Geostatistics is a branch of statistics focusing on spatial or spatiotemporal datasets. Developed originally to predict probability distributions of ore grades for mining operations, it is currently applied in diverse disciplines including petroleum geology, hydrogeology, hydrology, meteorology, oceanography, geochemistry, geometallurgy, geography, forestry, environmental control, landscape ecology, soil science, and agriculture (esp. in precision farming). Geostatistics is applied in varied branches of geography, particularly those involving the spread of diseases (epidemiology), the practice of commerce and military planning (logistics), and the development of efficient spatial networks. Geostatistical algorithms are incorporated in many places, including geographic information systems (GIS) and the R statistical environment.
Geostatistics is intimately related to interpolation methods, but extends far beyond simple interpolation problems. Geostatistical techniques rely on statistical models that are based on random function (or random variable) theory to model the uncertainty associated with spatial estimation and simulation.
A number of simpler interpolation methods/algorithms, such as inverse distance weighting, bilinear interpolation and nearest-neighbor interpolation, were already well known before geostatistics. Geostatistics goes beyond the interpolation problem by considering the studied phenomenon at unknown locations as a set of correlated random variables.
Let Z(x) be the value of the variable of interest at a certain location x (e.g. temperature, rainfall, piezometric level, geological facies). Although a value exists at location x and could be measured, geostatistics treats this value as random because it has not been measured yet. The randomness of Z(x) is not complete, however: it is defined by a cumulative distribution function (CDF) that depends on the information available about the value Z(x), F(z, x) = Prob{Z(x) ≤ z | information available}.
Typically, if the value of Z is known at locations close to x (or in the neighborhood of x) one can constrain the CDF of Z(x) by this neighborhood: if a high spatial continuity is assumed, Z(x) can only have values similar to the ones found in the neighborhood. Conversely, in the absence of spatial continuity Z(x) can take any value. The spatial continuity of the random variables is described by a model of spatial continuity that can be either a parametric function in the case of variogram-based geostatistics, or have a non-parametric form when using other methods such as multiple-point simulation or pseudo-genetic techniques.
By applying a single spatial model on an entire domain, one makes the assumption that Z is a stationary process, meaning that the same statistical properties apply over the entire domain. Several geostatistical methods provide ways of relaxing this stationarity assumption.
In this framework, one can distinguish two modeling goals: estimation, which produces a single best estimate of Z(x) at unsampled locations (typically by kriging), and simulation, which produces multiple equally probable realizations of Z(x) in order to characterize spatial uncertainty.
A number of methods exist for both geostatistical estimation and multiple realizations approaches. Several reference books provide a comprehensive overview of the discipline.
Kriging is a group of geostatistical techniques to interpolate the value of a random field (e.g., the elevation, z, of the landscape as a function of the geographic location) at an unobserved location from observations of its value at nearby locations.
Multiple-indicator kriging (MIK) is a recent advance on other techniques for mineral deposit modeling and resource block model estimation, such as ordinary kriging. Initially, MIK showed considerable promise as a new method that could more accurately estimate overall global mineral deposit concentrations or grades.
In probability theory and statistics, covariance is a measure of how much two variables change together, and the covariance function, or kernel, describes the spatial or temporal covariance of a random variable process or field. For a random field or stochastic process Z(x) on a domain D, a covariance function C(x, y) gives the covariance of the values of the random field at the two locations x and y: C(x, y) = Cov(Z(x), Z(y)).
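A widely used stationary covariance model is the exponential covariance, C(x, y) = σ² exp(−‖x − y‖ / a), which depends only on the separation distance. A minimal sketch (the sill σ² and range a values are illustrative):

```python
import numpy as np

def exponential_covariance(x, y, sill=1.0, corr_range=10.0):
    """Stationary exponential covariance model:
    C(x, y) = sill * exp(-|x - y| / corr_range)."""
    h = np.linalg.norm(np.asarray(x, float) - np.asarray(y, float))
    return sill * np.exp(-h / corr_range)

# Covariance decays with separation distance:
print(exponential_covariance([0, 0], [0, 0]))   # variance at zero lag: 1.0
print(exponential_covariance([0, 0], [10, 0]))  # exp(-1), roughly 0.368
```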
The same C(x, y) is called the autocovariance function in two instances: in time series (to denote exactly the same concept, except that x and y refer to locations in time rather than in space), and in multivariate random fields (to refer to the covariance of a variable with itself, as opposed to the cross covariance between two different variables at different locations, Cov(Z(x1), Y(x2))).

Geography
Geography (from Greek: γεωγραφία, geographia, literally "earth description") is a field of science devoted to the study of the lands, features, inhabitants, and phenomena of the Earth and planets. The first person to use the word γεωγραφία was Eratosthenes (276–194 BC). Geography is an all-encompassing discipline that seeks an understanding of Earth and its human and natural complexities—not merely where objects are, but also how they have changed and come to be.
Geography is often defined in terms of two branches: human geography and physical geography. Human geography deals with the study of people and their communities, cultures, economies, and interactions with the environment by studying their relations with and across space and place. Physical geography deals with the study of processes and patterns in the natural environment like the atmosphere, hydrosphere, biosphere, and geosphere.
The four historical traditions in geographical research are: spatial analyses of natural and human phenomena, area studies of places and regions, studies of human-land relationships, and the Earth sciences. Geography has been called "the world discipline" and "the bridge between the human and the physical sciences".

Geologic modelling
Geologic modelling, geological modelling or geomodelling is the applied science of creating computerized representations of portions of the Earth's crust based on geophysical and geological observations made on and below the Earth surface. A geomodel is the numerical equivalent of a three-dimensional geological map complemented by a description of physical quantities in the domain of interest.
Geomodelling is related to the concept of a Shared Earth Model, which is a multidisciplinary, interoperable and updatable knowledge base about the subsurface.
Geomodelling is commonly used for managing natural resources, identifying natural hazards, and quantifying geological processes, with main applications to oil and gas fields, groundwater aquifers and ore deposits. For example, in the oil and gas industry, realistic geologic models are required as input to reservoir simulator programs, which predict the behavior of the rocks under various hydrocarbon recovery scenarios. A reservoir can only be developed and produced once; therefore, making a mistake by selecting a site with poor conditions for development is tragic and wasteful. Using geological models and reservoir simulation allows reservoir engineers to identify which recovery options offer the safest and most economic, efficient, and effective development plan for a particular reservoir.
Geologic modelling is a relatively recent subdiscipline of geology which integrates structural geology, sedimentology, stratigraphy, paleoclimatology, and diagenesis.
In two dimensions (2D), a geologic formation or unit is represented by a polygon, which can be bounded by faults, unconformities, or by its lateral extent (outcrop). In geological models a geological unit is bounded by three-dimensional (3D) triangulated or gridded surfaces; the equivalent of the mapped polygon is the fully enclosed geological unit, delimited by a triangulated mesh. For the purpose of property or fluid modelling these volumes can be subdivided further into an array of cells, often referred to as voxels (volumetric elements). These 3D grids are the equivalent of the 2D grids used to express properties of single surfaces.
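The voxel representation described above can be sketched as a 3D property array; the grid dimensions, the layer extent and the porosity value below are purely illustrative:

```python
import numpy as np

# Hypothetical 3D model: nx * ny * nz voxels, each carrying a property value.
nx, ny, nz = 50, 40, 20
porosity = np.full((nx, ny, nz), np.nan)        # one value per voxel, NaN = unassigned

# Mark the voxels belonging to one geological unit (here: a flat layer
# between two horizons, purely illustrative) and assign it a porosity.
unit_mask = np.zeros((nx, ny, nz), dtype=bool)
unit_mask[:, :, 5:12] = True                    # layers 5..11 belong to the unit
porosity[unit_mask] = 0.18

print(unit_mask.sum(), "voxels in the unit")    # 50 * 40 * 7 = 14000
```

In practice each voxel would carry several attributes (porosity, permeability, facies), and the grid would be distorted to honor the structural model rather than being a regular box.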
Geomodelling generally involves the following steps:
Preliminary analysis of geological context of the domain of study.
Interpretation of available data and observations as point sets or polygonal lines (e.g. "fault sticks" corresponding to faults on a vertical seismic section).
Construction of a structural model describing the main rock boundaries (horizons, unconformities, intrusions, faults).
Definition of a three-dimensional mesh honoring the structural model, to support volumetric representation of heterogeneity (see Geostatistics) and the solution of the partial differential equations which govern physical processes in the subsurface (e.g. seismic wave propagation, fluid transport in porous media).

Geometallurgy
Geometallurgy relates to the practice of combining geology or geostatistics with metallurgy, or, more specifically, extractive metallurgy, to create a spatially or geologically based predictive model for mineral processing plants. It is used in the hard rock mining industry for risk management and mitigation during mineral processing plant design. It is also used, to a lesser extent, for production planning in more variable ore deposits.
There are four important components or steps to developing a geometallurgical program:
the geologically informed selection of a number of ore samples
laboratory-scale test work to determine the ore's response to mineral processing unit operations
the distribution of these parameters throughout the orebody using an accepted geostatistical technique
the application of a mining sequence plan and mineral processing models to generate a prediction of the process plant behavior

Georges Matheron
Georges François Paul Marie Matheron (December 2, 1930 – August 7, 2000) was a French mathematician and geologist, known as the founder of geostatistics and a co-founder (together with Jean Serra) of mathematical morphology. In 1968, he created the Centre de Géostatistique et de Morphologie Mathématique at the Paris School of Mines in Fontainebleau. He is known for his contributions on Kriging and mathematical morphology. His seminal work is posted for study and review to the Online Library of the Centre de Géostatistique, Fontainebleau, France.

Geotargeting
Geotargeting in geomarketing and internet marketing is the method of determining the geolocation of a website visitor and delivering different content to that visitor based on their location, such as country, region/state, city, metro code/zip code, organization, IP address, ISP or other criteria. A common usage of geotargeting is found in online advertising, as well as internet television with sites such as iPlayer and Hulu. In these circumstances, content is often restricted to users geolocated in specific countries; this approach serves as a means of implementing digital rights management. Use of proxy servers and virtual private networks may give a false location.

Index of geography articles
This page is a list of geography topics.
Geography is the study of the world and of the distribution of life on the earth, including human life and the effects of human activity. Geography research addresses the questions of both where and why geographical phenomena occur. Geography is a diverse field that seeks to understand the world and all of its human and natural complexities—not merely where objects are, but how they came to be, and how they have changed since then.

Inverse distance weighting
Inverse distance weighting (IDW) is a type of deterministic method for multivariate interpolation with a known scattered set of points. The values assigned to unknown points are calculated as a weighted average of the values available at the known points.
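A minimal sketch of IDW with the common power-2 weights (the sample coordinates and values are hypothetical):

```python
import numpy as np

def idw_interpolate(known_xy, known_z, query_xy, power=2.0):
    """Inverse distance weighting: estimate values at query points as a
    weighted average of known values, with weights 1 / d**power."""
    known_xy = np.asarray(known_xy, dtype=float)
    known_z = np.asarray(known_z, dtype=float)
    estimates = []
    for q in np.atleast_2d(np.asarray(query_xy, dtype=float)):
        d = np.linalg.norm(known_xy - q, axis=1)
        if np.any(d == 0):                       # query coincides with a sample
            estimates.append(known_z[d == 0][0])
            continue
        w = 1.0 / d**power
        estimates.append(np.sum(w * known_z) / np.sum(w))
    return np.array(estimates)

# Hypothetical rainfall samples: (x, y) coordinates and measured values.
samples = [(0, 0), (1, 0), (0, 1)]
values = [10.0, 20.0, 30.0]
# (0.5, 0.5) is equidistant from all three samples, so the estimate is their mean.
print(idw_interpolate(samples, values, [(0.5, 0.5)]))
```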
The name reflects the weighting applied: the weight given to each known point is proportional to the inverse of its distance ("amount of proximity") from the interpolation point.

Kernel method
In machine learning, kernel methods are a class of algorithms for pattern analysis, whose best known member is the support vector machine (SVM). The general task of pattern analysis is to find and study general types of relations (for example clusters, rankings, principal components, correlations, classifications) in datasets. In its simplest form, the kernel trick means implicitly mapping data into a higher-dimensional space in which a clear dividing margin between classes of data exists. For many algorithms that solve these tasks, the data in raw representation have to be explicitly transformed into feature vector representations via a user-specified feature map; in contrast, kernel methods require only a user-specified kernel, i.e., a similarity function over pairs of data points in raw representation.
Kernel methods owe their name to the use of kernel functions, which enable them to operate in a high-dimensional, implicit feature space without ever computing the coordinates of the data in that space, but rather by simply computing the inner products between the images of all pairs of data in the feature space. This operation is often computationally cheaper than the explicit computation of the coordinates. This approach is called the "kernel trick". Kernel functions have been introduced for sequence data, graphs, text, images, as well as vectors.
Algorithms capable of operating with kernels include the kernel perceptron, support vector machines (SVM), Gaussian processes, principal components analysis (PCA), canonical correlation analysis, ridge regression, spectral clustering, linear adaptive filters and many others. Any linear model can be turned into a non-linear model by applying the kernel trick to the model: replacing its features (predictors) by a kernel function.
Most kernel algorithms are based on convex optimization or eigenproblems and are statistically well-founded. Typically, their statistical properties are analyzed using statistical learning theory (for example, using Rademacher complexity).

Kriging
In statistics, originally in geostatistics, kriging or Gaussian process regression is a method of interpolation for which the interpolated values are modeled by a Gaussian process governed by prior covariances. Under suitable assumptions on the priors, kriging gives the best linear unbiased prediction of the intermediate values. Interpolating methods based on other criteria such as smoothness (e.g., smoothing spline) need not yield the most likely intermediate values. The method is widely used in the domain of spatial analysis and computer experiments. The technique is also known as Wiener–Kolmogorov prediction, after Norbert Wiener and Andrey Kolmogorov.
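A minimal sketch of simple kriging in one dimension, assuming a known zero mean and an exponential covariance model (the data locations, values and model parameters are hypothetical):

```python
import numpy as np

def simple_kriging(xk, zk, x0, sill=1.0, corr_range=10.0):
    """Simple kriging with known mean 0 and exponential covariance
    C(h) = sill * exp(-h / corr_range): solve K w = k0 for the kriging
    weights w, then predict z0 = w . zk."""
    xk = np.asarray(xk, float)
    cov = lambda h: sill * np.exp(-h / corr_range)
    K = cov(np.abs(xk[:, None] - xk[None, :]))   # covariances among data
    k0 = cov(np.abs(xk - x0))                    # data-to-target covariances
    w = np.linalg.solve(K, k0)                   # kriging weights
    return w @ np.asarray(zk, float)

# Hypothetical 1D example: three observations along a transect.
xk = [0.0, 5.0, 12.0]
zk = [1.2, 0.4, -0.8]
print(simple_kriging(xk, zk, 6.0))  # estimate between the 2nd and 3rd sample
```

A useful sanity check is that kriging is an exact interpolator: predicting at a data location returns the observed value there.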
The theoretical basis for the method was developed by the French mathematician Georges Matheron in 1960, based on the Master's thesis of Danie G. Krige, the pioneering plotter of distance-weighted average gold grades at the Witwatersrand reef complex in South Africa. Krige sought to estimate the most likely distribution of gold based on samples from a few boreholes. The English verb is to krige and the most common noun is kriging; both are often pronounced with a hard "g", following the pronunciation of the name "Krige". The word is sometimes capitalized as Kriging in the literature.

List of fields of application of statistics
Statistics is the mathematical science involving the collection, analysis and interpretation of data. A number of specialties have evolved to apply statistical theory and methods to various disciplines. Certain topics have "statistical" in their name but relate to manipulations of probability distributions rather than to statistical analysis.
Actuarial science is the discipline that applies mathematical and statistical methods to assess risk in the insurance and finance industries.
Astrostatistics is the discipline that applies statistical analysis to the understanding of astronomical data.
Biostatistics is a branch of biology that studies biological phenomena and observations by means of statistical analysis, and includes medical statistics.
Business analytics is a rapidly developing business process that applies statistical methods to data sets (often very large) to develop new insights and understanding of business performance and opportunities.
Chemometrics is the science of relating measurements made on a chemical system or process to the state of the system via application of mathematical or statistical methods.
Demography is the statistical study of all populations. It can be a very general science that can be applied to any kind of dynamic population, that is, one that changes over time or space.
Econometrics is a branch of economics that applies statistical methods to the empirical study of economic theories and relationships.
Environmental statistics is the application of statistical methods to environmental science. Weather, climate, air and water quality are included, as are studies of plant and animal populations.
Epidemiology is the study of factors affecting the health and illness of populations, and serves as the foundation and logic of interventions made in the interest of public health and preventive medicine.
Geostatistics is a branch of statistics that deals with the analysis of spatial data arising in disciplines such as petroleum geology, hydrogeology, hydrology, meteorology, oceanography, geochemistry, and geography.
Machine learning is the subfield of computer science that formulates algorithms in order to make predictions from data.
Operations research (or operational research) is an interdisciplinary branch of applied mathematics and formal science that uses methods such as mathematical modeling, statistics, and algorithms to arrive at optimal or near optimal solutions to complex problems.
Population ecology is a sub-field of ecology that deals with the dynamics of species populations and how these populations interact with the environment.
Psychometrics is the theory and technique of educational and psychological measurement of knowledge, abilities, attitudes, and personality traits.
Quality control reviews the factors involved in manufacturing and production; it can make use of statistical sampling of product items to aid decisions in process control or in accepting deliveries.
Quantitative psychology is the science of statistically explaining and changing mental processes and behaviors in humans.
Reliability engineering is the study of the ability of a system or component to perform its required functions under stated conditions for a specified period of time.
Statistical finance, an area of econophysics, is an empirical attempt to shift finance from its normative roots to a positivist framework using exemplars from statistical physics with an emphasis on emergent or collective properties of financial markets.
Statistical mechanics is the application of probability theory, which includes mathematical tools for dealing with large populations, to the field of mechanics, which is concerned with the motion of particles or objects when subjected to a force.
Statistical physics is one of the fundamental theories of physics, and uses methods of probability theory in solving physical problems.
Statistical signal processing utilizes the statistical properties of signals to perform signal processing tasks.
Statistical thermodynamics is the study of the microscopic behaviors of thermodynamic systems using probability theory and provides a molecular level interpretation of thermodynamic quantities such as work, heat, free energy, and entropy.

Markov chain geostatistics
Markov chain geostatistics uses Markov chain spatial models, simulation algorithms and associated spatial correlation measures (e.g., the transiogram) based on Markov chain random field theory, which extends a single Markov chain into a multi-dimensional random field for geostatistical modeling. A Markov chain random field is still a single spatial Markov chain. The spatial Markov chain moves or jumps in a space and decides its state at any unobserved location through interactions with its nearest known neighbors in different directions. The data interaction process can be well explained as a local sequential Bayesian updating process within a neighborhood. Because single-step transition probability matrices are difficult to estimate from sparse sample data and are impractical in representing the complex spatial heterogeneity of states, the transiogram, which is defined as a transition probability function over the distance lag, is proposed as the accompanying spatial measure of Markov chain random fields.

Reservoir modeling
In the oil and gas industry, reservoir modeling involves the construction of a computer model of a petroleum reservoir, for the purposes of improving estimation of reserves and making decisions regarding the development of the field, predicting future production, placing additional wells, and evaluating alternative reservoir management scenarios.
A reservoir model represents the physical space of the reservoir by an array of discrete cells, delineated by a grid which may be regular or irregular. The array of cells is usually three-dimensional, although 1D and 2D models are sometimes used. Values for attributes such as porosity, permeability and water saturation are associated with each cell. The value of each attribute is implicitly deemed to apply uniformly throughout the volume of the reservoir represented by the cell.

Seismic inversion
Seismic inversion, in geophysics (primarily in oil-and-gas exploration/development), is the process of transforming seismic reflection data into a quantitative rock-property description of a reservoir. Seismic inversion may be pre- or post-stack, deterministic, random or geostatistical; it typically includes other reservoir measurements such as well logs and cores.

Semivariance
In spatial statistics, the empirical semivariance is described by semivariance=1/2(z(x)-z(x))^2 where z is the attribute value
where z is a datum at a particular location, h is the distance between ordered data, and n(h) is the number of paired data at a distance of h. The semivariance is half the variance of the increments z(x + h) − z(x), but the whole variance of z-values at given separation distance h (Bachmaier and Backes, 2008).
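For regularly spaced 1D data, the empirical semivariance at a lag h can be computed directly from the definition; the sample values below are hypothetical:

```python
import numpy as np

def empirical_semivariance(z, h):
    """Empirical semivariance at integer lag h for regularly spaced 1D data:
    gamma(h) = 1/(2 n(h)) * sum over the n(h) pairs of (z(x_i) - z(x_i + h))**2."""
    z = np.asarray(z, float)
    diffs = z[h:] - z[:-h]          # the n(h) increments at lag h
    return 0.5 * np.mean(diffs**2)

# Hypothetical regularly spaced measurements along a transect.
z = np.array([3.1, 3.0, 3.4, 3.7, 4.1, 4.0, 4.6, 4.9])
for h in (1, 2, 3):
    print(h, empirical_semivariance(z, h))
```

For this smoothly trending series the semivariance grows with the lag, which is the behavior a semivariogram plot makes visible.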
A plot of semivariances versus distances between ordered data is known as a semivariogram rather than a variogram. Many authors call γ(h) a semivariogram, while others use the terms variogram and semivariogram synonymously. However, Bachmaier and Backes (2008), who discussed this confusion, have shown that γ(h) should be called a variogram, and that terms like semivariogram or semivariance should be avoided.

Spatial analysis
Spatial analysis or spatial statistics includes any of the formal techniques which study entities using their topological, geometric, or geographic properties. Spatial analysis includes a variety of techniques, many still in their early development, using different analytic approaches and applied in fields as diverse as astronomy, with its studies of the placement of galaxies in the cosmos, to chip fabrication engineering, with its use of "place and route" algorithms to build complex wiring structures. In a more restricted sense, spatial analysis is the technique applied to structures at the human scale, most notably in the analysis of geographic data.
Complex issues arise in spatial analysis, many of which are neither clearly defined nor completely resolved, but form the basis for current research. The most fundamental of these is the problem of defining the spatial location of the entities being studied.
Classification of the techniques of spatial analysis is difficult because of the large number of different fields of research involved, the different fundamental approaches which can be chosen, and the many forms the data can take.

Spatial variability
Spatial variability occurs when a quantity that is measured at different spatial locations exhibits values that differ across the locations. Spatial variability can be assessed using spatial descriptive statistics such as the range.
Let us suppose that the regionalized variable z(x) is perfectly known at any point x within the field under study. Then the uncertainty about z(x) is reduced to zero, whereas its spatial variability still exists. Uncertainty is closely related to the amount of spatial variability, but it is also strongly dependent upon sampling [4]. Geostatistical analyses have been performed to study the spatial variability of pesticide sorption [5-7] and degradation [8] in the field. Webster and Oliver [9] provided a description of geostatistical techniques. Describing uncertainty using geostatistics is not an activity exempt from uncertainty itself, as variogram uncertainty may be large [10] and spatial interpolation may be undertaken using different techniques [11].

Transiogram
In soil science, a transiogram is the accompanying spatial correlation measure of simplified Markov chain random field (MCRF) models based on the conditional independence assumption, and an important part of Markov chain geostatistics. It is defined as a transition probability function over the distance lag.
Simply, a transiogram refers to a transition probability-lag diagram. Transiograms include auto-transiograms and cross-transiograms. The former represent the spatial auto-correlation of a single category, and the latter represent the spatial interclass relationships among different categories.
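An experimental transiogram can be estimated by counting transitions at each lag; the sketch below does this for a hypothetical, regularly spaced 1D facies sequence (field data would be sparser and multi-dimensional):

```python
import numpy as np

def experimental_transiogram(seq, cat_i, cat_j, max_lag):
    """Experimental transiogram p_ij(h): the probability of finding category j
    at lag h away from a location of category i, estimated by counting
    transitions in a regularly spaced 1D categorical sequence."""
    seq = np.asarray(seq)
    probs = []
    for h in range(1, max_lag + 1):
        heads, tails = seq[:-h], seq[h:]
        from_i = heads == cat_i                 # locations of category i
        probs.append(np.mean(tails[from_i] == cat_j))
    return np.array(probs)

# Hypothetical facies sequence along a drill hole (A = sand, B = clay).
seq = list("AAABBBAAABBAAABBB")
auto_AA = experimental_transiogram(seq, "A", "A", 5)   # auto-transiogram
cross_AB = experimental_transiogram(seq, "A", "B", 5)  # cross-transiogram
print(auto_AA + cross_AB)  # at each lag the probabilities over all categories sum to 1
```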
Experimental transiograms can be directly estimated from sparse sample data. Transiogram models, which provide transition probabilities at any lags for Markov chain modeling, can be further acquired through model fitting of experimental transiograms. In general, the transiogram is a spatial correlation measure following the style of a variogram, and includes a set of concepts and a set of methods for obtaining transition probability values from sample data and providing transition probabilities values for simplified MCRF models.

Variogram
In spatial statistics the theoretical variogram is a function describing the degree of spatial dependence of a spatial random field or stochastic process.
As a concrete example from the field of gold mining, a variogram gives a measure of how much two samples taken from the mining area will vary in gold percentage depending on the distance between those samples. Samples taken far apart will vary more than samples taken close to each other.