Visual perception is the ability to interpret the surrounding environment using light in the visible spectrum reflected by the objects in the environment. This is different from visual acuity, which refers to how clearly a person sees (for example "20/20 vision"). A person can have problems with visual perceptual processing even with 20/20 vision.
The resulting perception is also known as visual perception, eyesight, sight, or vision (adjectival form: visual, optical, or ocular). The various physiological components involved in vision are referred to collectively as the visual system, and are the focus of much research in linguistics, psychology, cognitive science, neuroscience, and molecular biology, collectively referred to as vision science.
The visual system in animals allows individuals to assimilate information from their surroundings. The act of seeing starts when the cornea and then the lens of the eye focus light from the surroundings onto a light-sensitive membrane in the back of the eye, called the retina. The retina is actually part of the brain that is isolated to serve as a transducer for the conversion of light into neuronal signals. Based on feedback from the visual system, the lens of the eye adjusts its thickness to focus light on the photoreceptive cells of the retina, also known as the rods and cones, which detect the photons of light and respond by producing neural impulses. These signals are processed via complex feedforward and feedback processes by different parts of the brain, from the retina upstream to central ganglia in the brain.
Much of the above could also apply to octopuses and other mollusks, worms, insects, and more primitive animals: anything with a more concentrated nervous system and better eyes than, say, a jellyfish. The following, however, applies to mammals generally and, in modified form, to birds. The retina in these more complex animals sends fibers (the optic nerve) to the lateral geniculate nucleus and from there to the primary and secondary visual cortex of the brain. Signals can also travel directly from the retina to the superior colliculus.
The perception of objects and the totality of the visual scene is accomplished by the visual association cortex. The visual association cortex combines all sensory information perceived by the striate cortex, which contains thousands of modules that are part of modular neural networks. The neurons in the striate cortex send axons to the extrastriate cortex, a region in the visual association cortex that surrounds the striate cortex.
The human visual system is generally believed to perceive visible light in the range of wavelengths between 370 and 730 nanometers (0.00000037 to 0.00000073 meters) of the electromagnetic spectrum. However, some research suggests that humans, especially young people, can perceive light at wavelengths down to 340 nanometers (UV-A).
The major problem in visual perception is that what people see is not simply a translation of retinal stimuli (i.e., the image on the retina). Thus people interested in perception have long struggled to explain what visual processing does to create what is actually seen.
Two major ancient Greek schools offered early explanations of how vision is carried out in the body.
The first was the "emission theory" which maintained that vision occurs when rays emanate from the eyes and are intercepted by visual objects. If an object was seen directly it was by 'means of rays' coming out of the eyes and again falling on the object. A refracted image was, however, seen by 'means of rays' as well, which came out of the eyes, traversed through the air, and after refraction, fell on the visible object which was sighted as the result of the movement of the rays from the eye. This theory was championed by scholars like Euclid and Ptolemy and their followers.
The second school advocated the so-called "intromission" approach, which sees vision as coming from something representative of the object entering the eyes. With its main propagators Aristotle, Galen and their followers, this theory seems to have some contact with modern theories of what vision really is, but it remained only a speculation lacking any experimental foundation. (In eighteenth-century England, Isaac Newton, John Locke, and others carried the intromission/intromittist theory forward by insisting that vision involved a process in which rays, composed of actual corporeal matter, emanated from seen objects and entered the seer's mind/sensorium through the eye's aperture.)
Both schools of thought relied upon the principle that "like is only known by like", and thus upon the notion that the eye was composed of some "internal fire" which interacted with the "external fire" of visible light and made vision possible. Plato makes this assertion in his dialogue Timaeus, as does Aristotle, in his De Sensu.
Alhazen (965 – c. 1040) carried out many investigations and experiments on visual perception, extended the work of Ptolemy on binocular vision, and commented on the anatomical works of Galen. He was the first person to explain that vision occurs when light bounces off an object and is then directed to one's eyes.
Leonardo da Vinci (1452–1519) is believed to be the first to recognize the special optical qualities of the eye. He wrote "The function of the human eye ... was described by a large number of authors in a certain way. But I found it to be completely different." His main experimental finding was that vision is distinct and clear only at the line of sight, the optical line that ends at the fovea. Although he did not use these terms, he can be regarded as the originator of the modern distinction between foveal and peripheral vision.
Isaac Newton (1642–1726/27) was the first to discover through experimentation, by isolating individual colors of the spectrum of light passing through a prism, that the visually perceived color of objects is due to the character of the light the objects reflect, and that these divided colors could not be changed into any other color, which was contrary to the scientific expectation of the day.
Hermann von Helmholtz is often credited with the first study of visual perception in modern times. Helmholtz examined the human eye and concluded that it was, optically, rather poor. The poor-quality information gathered via the eye seemed to him to make vision impossible. He therefore concluded that vision could only be the result of some form of unconscious inferences: a matter of making assumptions and conclusions from incomplete data, based on previous experiences.
Inference requires prior experience of the world.
Examples of well-known assumptions, based on visual experience, are:
The study of visual illusions (cases when the inference process goes wrong) has yielded much insight into what sort of assumptions the visual system makes.
Another type of the unconscious inference hypothesis (based on probabilities) has recently been revived in so-called Bayesian studies of visual perception. Proponents of this approach consider that the visual system performs some form of Bayesian inference to derive a perception from sensory data. However, it is not clear how proponents of this view derive, in principle, the relevant probabilities required by the Bayesian equation. Models based on this idea have been used to describe various visual perceptual functions, such as the perception of motion, the perception of depth, and figure-ground perception. The "wholly empirical theory of perception" is a related and newer approach that rationalizes visual perception without explicitly invoking Bayesian formalisms.
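The Bayesian account can be illustrated with a minimal sketch. It assumes a made-up two-hypothesis problem (is a shaded patch convex or concave?) and invented probabilities; the function name `posterior` and all numbers are illustrative assumptions, not values from any perceptual study. It only shows how a prior and a likelihood combine under Bayes' rule, and it does not address where the visual system would obtain those probabilities.

```python
def posterior(prior, likelihood):
    """Combine a prior P(h) and a likelihood P(data | h) with Bayes' rule.

    Both arguments are dicts keyed by hypothesis name.
    Returns the normalized posterior P(h | data) as a dict.
    """
    unnormalized = {h: prior[h] * likelihood[h] for h in prior}
    evidence = sum(unnormalized.values())  # P(data)
    if evidence == 0:
        raise ValueError("data has zero probability under every hypothesis")
    return {h: p / evidence for h, p in unnormalized.items()}


# Hypothetical prior: convex shapes are assumed to be more common.
PRIOR = {"convex": 0.7, "concave": 0.3}

# Hypothetical likelihoods of the observed shading under each hypothesis.
AMBIGUOUS_SHADING = {"convex": 0.5, "concave": 0.5}
SHADING_FAVORS_CONCAVE = {"convex": 0.2, "concave": 0.8}

# With uninformative data the percept follows the prior;
# with informative data the likelihood can override it.
print(posterior(PRIOR, AMBIGUOUS_SHADING))
print(posterior(PRIOR, SHADING_FAVORS_CONCAVE))
```

When the sensory data are ambiguous, the posterior simply reproduces the prior, which is one way to read the claim that assumptions drawn from experience shape what is seen.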
The Gestalt Laws of Organization have guided the study of how people perceive visual components as organized patterns or wholes, instead of many different parts. "Gestalt" is a German word that partially translates to "configuration or pattern" along with "whole or emergent structure". According to this theory, there are eight main factors that determine how the visual system automatically groups elements into patterns: Proximity, Similarity, Closure, Symmetry, Common Fate (i.e. common motion), Continuity as well as Good Gestalt (pattern that is regular, simple, and orderly) and Past Experience.
During the 1960s, technical development permitted the continuous registration of eye movement during reading, in picture viewing, and later, in visual problem solving, and when headset-cameras became available, also during driving.
The picture to the right shows what may happen during the first two seconds of visual inspection. While the background is out of focus, representing the peripheral vision, the first eye movement goes to the boots of the man (just because they are very near the starting fixation and have a reasonable contrast).
The following fixations jump from face to face. They might even permit comparisons between faces.
It may be concluded that a face is a very attractive search target within the peripheral field of vision. Foveal vision adds detailed information to the peripheral first impression.
There are different types of eye movements: fixational eye movements (microsaccades, ocular drift, and tremor), vergence movements, saccadic movements and pursuit movements. Fixations are comparatively static points where the eye rests. However, the eye is never completely still, and gaze position drifts. These drifts are in turn corrected by microsaccades, very small fixational eye movements. Vergence movements involve the cooperation of both eyes to allow an image to fall on the same area of both retinas. This results in a single focused image. Saccadic movements are jumps from one position to another and are used to rapidly scan a particular scene or image. Lastly, pursuit movements are smooth eye movements used to follow objects in motion.
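One common way to separate fixations from saccades in recorded gaze data is a velocity threshold. The sketch below is an illustration of that idea only: the function name `classify_intervals`, the 30 degrees-per-second default, and the sample data are assumptions chosen for the example, and a real analysis would also need to handle pursuit, vergence, noise, and blinks.

```python
import math


def classify_intervals(gaze_deg, sample_rate_hz, saccade_threshold_deg_s=30.0):
    """Label each interval between consecutive gaze samples.

    gaze_deg: list of (x, y) gaze positions in degrees of visual angle.
    sample_rate_hz: samples per second of the (hypothetical) eye tracker.
    Returns a list of 'saccade' / 'fixation' labels, one per interval.
    """
    labels = []
    for (x0, y0), (x1, y1) in zip(gaze_deg, gaze_deg[1:]):
        # Angular speed = distance moved between samples * samples per second.
        speed = math.hypot(x1 - x0, y1 - y0) * sample_rate_hz
        labels.append("saccade" if speed > saccade_threshold_deg_s else "fixation")
    return labels


# Made-up recording at 100 Hz: slow drift, one rapid jump, then drift again.
SAMPLES = [(0.0, 0.0), (0.01, 0.0), (0.02, 0.0), (5.0, 0.0), (5.01, 0.0)]
print(classify_intervals(SAMPLES, sample_rate_hz=100))
```

Because the rule looks only at speed, slow pursuit movements would be labeled as fixations here; distinguishing them needs additional criteria.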
There is considerable evidence that face and object recognition are accomplished by distinct systems. For example, prosopagnosic patients show deficits in face, but not object processing, while object agnosic patients (most notably, patient C.K.) show deficits in object processing with spared face processing. Behaviorally, it has been shown that faces, but not objects, are subject to inversion effects, leading to the claim that faces are "special". Further, face and object processing recruit distinct neural systems. Notably, some have argued that the apparent specialization of the human brain for face processing does not reflect true domain specificity, but rather a more general process of expert-level discrimination within a given class of stimulus, though this latter claim is the subject of substantial debate. Using fMRI and electrophysiology Doris Tsao and colleagues described brain regions and a mechanism for face recognition in macaque monkeys.
In the 1970s, David Marr developed a multi-level theory of vision, which analyzed the process of vision at different levels of abstraction. In order to focus on the understanding of specific problems in vision, he identified three levels of analysis: the computational, algorithmic and implementational levels. Many vision scientists, including Tomaso Poggio, have embraced these levels of analysis and employed them to further characterize vision from a computational perspective.
The computational level addresses, at a high level of abstraction, the problems that the visual system must overcome. The algorithmic level attempts to identify the strategy that may be used to solve these problems. Finally, the implementational level attempts to explain how solutions to these problems are realized in neural circuitry.
Marr suggested that it is possible to investigate vision at any of these levels independently. Marr described vision as proceeding from a two-dimensional visual array (on the retina) to a three-dimensional description of the world as output. His stages of vision include:
Marr's 2.5D sketch assumes that a depth map is constructed, and that this map is the basis of 3D shape perception. However, both stereoscopic and pictorial perception, as well as monocular viewing, make clear that the perception of 3D shape precedes, and does not rely on, the perception of the depth of points. It is not clear how a preliminary depth map could, in principle, be constructed, nor how this would address the question of figure-ground organization, or grouping. The role of perceptual organizing constraints, overlooked by Marr, in the production of 3D shape percepts from binocularly viewed 3D objects has been demonstrated empirically for the case of 3D wire objects. For a more detailed discussion, see Pizlo (2008).
Transduction is the process through which energy from environmental stimuli is converted to neural activity for the brain to understand and process. The back of the eye contains three different cell layers: the photoreceptor layer, the bipolar cell layer and the ganglion cell layer. The photoreceptor layer is at the very back and contains rod photoreceptors and cone photoreceptors. Cones are responsible for color perception. There are three different cones: red, green and blue. Rods are responsible for the perception of objects in low light. Photoreceptors contain a special chemical called a photopigment, whose molecules are embedded in the membrane of the lamellae; a single human rod contains approximately 10 million of them. The photopigment molecules consist of two parts: an opsin (a protein) and retinal (a lipid). There are three specific photopigments (each with its own color) that respond to specific wavelengths of light. When the appropriate wavelength of light hits the photoreceptor, its photopigment splits into two, which sends a message to the bipolar cell layer, which in turn sends a message to the ganglion cells, which then send the information through the optic nerve to the brain. If the appropriate photopigment is not in the proper photoreceptor (for example, a green photopigment inside a red cone), a condition called color vision deficiency will occur.
Transduction involves chemical messages sent from the photoreceptors to the bipolar cells to the ganglion cells. Several photoreceptors may send their information to one ganglion cell. There are two types of ganglion cells: red/green and yellow/blue. These neurons fire constantly, even when not stimulated. The brain interprets different colors (and with a lot of information, an image) when the rate of firing of these neurons changes. Red light stimulates the red cone, which in turn stimulates the red/green ganglion cell. Likewise, green light stimulates the green cone, which stimulates the red/green ganglion cell, and blue light stimulates the blue cone, which stimulates the yellow/blue ganglion cell. The firing rate of a ganglion cell increases when it is signaled by one cone and decreases (is inhibited) when it is signaled by the other cone. The first color in the name of the ganglion cell is the color that excites it and the second is the color that inhibits it. For example, a red cone would excite the red/green ganglion cell and a green cone would inhibit it. This is an opponent process. If the rate of firing of a red/green ganglion cell increases, the brain would know that the light was red; if the rate decreases, the brain would know that the light was green.
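The opponent process described above can be written as a toy model. This is a sketch under stated assumptions: the baseline rate, the gain, the linear form of the response, and the names `ganglion_rate` and `interpret_red_green` are all invented for illustration and are not measured physiological values.

```python
BASELINE_RATE = 10.0  # hypothetical spontaneous firing rate (spikes per second)
GAIN = 5.0            # hypothetical change in rate per unit of cone signal


def ganglion_rate(excitatory_cone, inhibitory_cone):
    """Firing rate of an opponent ganglion cell.

    The cell fires at BASELINE_RATE when unstimulated. Input from the
    excitatory cone raises the rate, input from the inhibitory cone lowers
    it. Cone signals are assumed to lie in the range 0.0 to 1.0.
    """
    rate = BASELINE_RATE + GAIN * (excitatory_cone - inhibitory_cone)
    return max(rate, 0.0)  # a firing rate cannot be negative


def interpret_red_green(rate):
    """Read out a red/green cell: above baseline means red, below means green."""
    if rate > BASELINE_RATE:
        return "red"
    if rate < BASELINE_RATE:
        return "green"
    return "no red/green signal"


# Red light drives the red cone only; green light drives the green cone only.
print(interpret_red_green(ganglion_rate(excitatory_cone=1.0, inhibitory_cone=0.0)))
print(interpret_red_green(ganglion_rate(excitatory_cone=0.0, inhibitory_cone=1.0)))
```

The nonzero baseline is what lets a single cell signal two colors: it has room to move both up and down from its resting rate.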
Theories and observations of visual perception have been the main source of inspiration for computer vision (also called machine vision, or computational vision). Special hardware structures and software algorithms provide machines with the capability to interpret the images coming from a camera or a sensor. Artificial visual perception has long been used in industry and is now entering the automotive and robotics domains.
Aerial perspective or atmospheric perspective refers to the effect the atmosphere has on the appearance of an object as it is viewed from a distance. As the distance between an object and a viewer increases, the contrast between the object and its background decreases, and the contrast of any markings or details within the object also decreases. The colours of the object also become less saturated and shift towards the background colour, which is usually blue, but under some conditions may be some other colour (for example, at sunrise or sunset distant colours may shift towards red).
Bezold effect
The Bezold effect is an optical illusion, named after a German professor of meteorology, Wilhelm von Bezold (1837–1907), who discovered that a color may appear different depending on its relation to adjacent colors.
It happens when small areas of color are interspersed. An assimilation effect called the von Bezold spreading effect, similar to spatial color mixing, is achieved.
The opposite effect is observed when large areas of color are placed adjacent to each other, resulting in color contrast.
Binocular rivalry
Binocular rivalry is a phenomenon of visual perception in which perception alternates between different images presented to each eye.
When one image is presented to one eye and a very different image is presented to the other (also known as dichoptic presentation), instead of the two images being seen superimposed, one image is seen for a few moments, then the other, then the first, and so on, randomly for as long as one cares to look. For example, if a set of vertical lines is presented to one eye, and a set of horizontal lines to the same region of the retina of the other, sometimes the vertical lines are seen with no trace of the horizontal lines, and sometimes the horizontal lines are seen with no trace of the vertical lines.
At transitions, brief, unstable composites of the two images may be seen. For example, the vertical lines may appear one at a time to obscure the horizontal lines from the left or from the right, like a traveling wave, switching slowly one image for the other. Binocular rivalry occurs between any stimuli that differ sufficiently, including simple stimuli like lines of different orientation and complex stimuli like different alphabetic letters or different pictures such as of a face and of a house.
Very small differences between images, however, might yield singleness of vision and stereopsis. Binocular rivalry has been extensively studied in the last century. In recent years neuroscientists have used neuroimaging techniques and single-cell recording techniques to identify neural events responsible for the perceptual dominance of a given image and for the perceptual alternations.
Emission theory (vision)
Emission theory or extramission theory (variants: extromission, extromittism) is the proposal that visual perception is accomplished by eye beams emitted by the eyes. This theory has been replaced by intromission theory, which states that visual perception comes from something representative of the object (later established to be rays of light reflected from it) entering the eyes. Modern physics has confirmed that light is physically transmitted by photons from a light source, such as the sun, to visible objects, and finally to a detector, such as a human eye or camera.
Image
An image (from Latin: imago) is an artifact that depicts visual perception, such as a photograph or other two-dimensional picture, that resembles a subject, usually a physical object, and thus provides a depiction of it. In the context of signal processing, an image is a distributed amplitude of color(s).
Image compression
Image compression is a type of data compression applied to digital images, to reduce their cost for storage or transmission. Algorithms may take advantage of visual perception and the statistical properties of image data to provide superior results compared with generic data compression methods which are used for other digital data.
James J. Gibson
James Jerome Gibson (January 27, 1904 – December 11, 1979) was an American psychologist and one of the most important contributors to the field of visual perception. Gibson challenged the idea that the nervous system actively constructs conscious visual perception, and instead promoted ecological psychology, in which the mind directly perceives environmental stimuli without additional cognitive construction or processing. A Review of General Psychology survey, published in 2002, ranked him as the 88th most cited psychologist of the 20th century, tied with John Garcia, David Rumelhart, Louis Leon Thurstone, Margaret Floy Washburn, and Robert S. Woodworth.
Kinetic depth effect
In visual perception, the kinetic depth effect refers to the phenomenon whereby the three-dimensional structural form of an object can be perceived when the object is moving. In the absence of other visual depth cues, this might be the only perception mechanism available to infer the object's shape. Wallach and O'Connell showed experimentally in the 1950s that the human visual system can identify a structure from a motion stimulus. For example, if a shadow is cast onto a screen by a rotating wire shape, a viewer can readily perceive the shape of the structure behind the screen from the motion and deformation of the shadow.
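The rotating-wire shadow can be simulated to show why motion carries shape information. The sketch below assumes an orthographic (parallel-light) shadow and rotation about the vertical axis; the function names `rotate_y` and `shadow` and the sample points are illustrative assumptions. It models only the stimulus on the screen, not how the visual system recovers the shape from it.

```python
import math


def rotate_y(point, angle):
    """Rotate a 3D point (x, y, z) about the vertical (y) axis by angle radians."""
    x, y, z = point
    c, s = math.cos(angle), math.sin(angle)
    return (c * x + s * z, y, -s * x + c * z)


def shadow(points, angle):
    """Orthographic shadow on the screen: rotate, then discard depth (z)."""
    return [rotate_y(p, angle)[:2] for p in points]


# Two hypothetical wire vertices that differ only in depth.
NEAR_AND_FAR = [(0.0, 1.0, 1.0), (0.0, 1.0, -1.0)]

# In a single static frame the two shadows coincide: depth is invisible.
print(shadow(NEAR_AND_FAR, angle=0.0))

# After a small rotation they separate, in opposite directions,
# so the change in the shadow over time reflects the hidden depth.
print(shadow(NEAR_AND_FAR, angle=0.5))
```

Each frame on its own is a flat pattern; only the way the pattern deforms from frame to frame distinguishes points at different depths.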
There are two propositions as to how three-dimensional images are perceived. The experience of three-dimensional images can be caused by differences in the pattern of stimulation on the retina, in comparison to two-dimensional images. Gestalt psychologists hold the view that rules of organization must exist by which certain retinal projections of three-dimensional forms give rise to three-dimensional percepts. Most retinal images of two-dimensional forms lead to two-dimensional forms in experience as well. The other proposition relates to previous experience. However, this assumption does not explain how past experience influences perception of images.
In order to model the calculation of depth values from relative movement, many efforts have been made to infer these values using other information such as the geometry and measurements of objects and their positions. This is related to the extraction of structure from motion in computer vision. In addition, an individual's ability to experience the kinetic depth effect shows that the visual system can solve the structure-from-motion problem independently.
As with other depth cues, the kinetic depth effect is almost always produced in combination with other effects, most notably the motion parallax effect. For instance, the rotating circles illusion and the rotating dots visualization (which is similar in principle to the projected wireframe demonstration mentioned above) rely strongly on the previous knowledge that objects (or parts thereof) further from the observer appear to move more slowly than those that are closer.
The kinetic depth effect can manifest independently, however, even when motion parallax is not present. An example of such a situation is the art installation "The Analysis of Beauty", by the Disinformation project, created as a tribute to William Hogarth's concept of the Serpentine Line (which was presented in his book of the same name).
Parafovea
Parafovea or the parafoveal belt is a region in the retina that circumscribes the fovea and is part of the macula lutea. It is circumscribed by the perifovea.
Pareidolia
Pareidolia (parr-i-DOH-lee-ə) is the tendency to interpret a vague stimulus as something known to the observer, such as seeing shapes in clouds, seeing faces in inanimate objects or abstract patterns, or hearing hidden messages in music.
Common examples are perceived images of animals, faces, or objects in cloud formations, the Man in the Moon, the Moon rabbit, hidden messages in recorded music played in reverse or at higher- or lower-than-normal speeds, and hearing indistinct voices in random noise such as that produced by air conditioners or fans.
A notable example of pareidolia occurred in 1877, when telescope observations of the surface of Mars turned up what looked faintly like straight lines, which were then interpreted by some as canals. It was theorized that the canals were possibly created by sentient beings. This created a sensation. Later, better photographic techniques and stronger telescopes were developed and applied, which resulted in new images in which the faint lines disappeared, and the canal theory was debunked as an example of pareidolia.
Perifovea
Perifovea is a region in the retina that circumscribes the parafovea and fovea and is a part of the macula lutea. The perifovea is a belt that covers a 10° radius around the fovea and is 1.5 mm wide. The perifovea ends where Henle's fiber layer disappears and the ganglion cells are one-layered.
Peripheral vision
Peripheral vision, or indirect vision, is vision as it occurs outside the point of fixation, i.e. away from the center of gaze. The vast majority of the area in the visual field is included in the notion of peripheral vision. "Far peripheral" vision refers to the area at the edges of the visual field, "mid-peripheral" vision refers to medium eccentricities, and "near-peripheral", sometimes referred to as "para-central" vision, exists adjacent to the center of gaze.
Persistence of vision
Persistence of vision traditionally refers to the optical illusion that occurs when visual perception of an object does not cease for some time after the rays of light proceeding from it have ceased to enter the eye.
The illusion has also been described as "retinal persistence", "persistence of impressions", simply "persistence" and other variations. According to this definition, the illusion would be the same as, or very similar to, positive afterimages.
"Persistence of vision" can also be understood to mean the same as "flicker fusion", the effect that vision seems to persist continuously when a stream of light is repeatedly interrupted for very brief instances and thus enters the eyes intermittently.
Since its introduction, the term "persistence of vision" has been believed to be the explanation for motion perception in optical toys like the phenakistiscope and the zoetrope, and later in cinema. However, this theory was disputed even before cinema was introduced in 1895.
If "persistence of vision" is explained as "flicker fusion", it can be seen as an important factor in the illusion of moving pictures in cinema and related optical toys, but not as its sole principle.
Early descriptions of the illusion often attributed the effect purely to imperfections of the eye, particularly of the retina. Nerves and parts of the brain later became part of explanations.
Sensory memory has been cited as a cause.
Retinal migraine
Retinal migraine (also known as ophthalmic migraine, and ocular migraine) is a retinal disease often accompanied by migraine headache and typically affects only one eye. It is caused by ischaemia or vascular spasm in or behind the affected eye.
The terms "retinal migraine" and "ocular migraine" are often confused with "visual migraine", which is a far more common symptom of vision loss, resulting from the aura phase of the common migraine. The aura phase of migraine can occur with or without a headache. Ocular or retinal migraines happen in the eye, so they affect vision only in that eye, while visual migraines occur in the brain, so they affect vision in both eyes together. Visual migraines result from cortical spreading depression and are also commonly termed scintillating scotoma.
Scintillating scotoma
Scintillating scotoma, also called visual migraine, is a common visual aura preceding migraine and was first described by 19th-century physician Hubert Airy (1838–1903). It may precede a migraine headache, but can also occur acephalgically (without headache). It is often confused with retinal migraine, which originates in the eyeball or socket.
See
See or SEE may refer to:
See, to engage in visual perception
Vision science
Vision science is the scientific study of vision. Vision science encompasses all studies of vision, such as how human and non-human organisms process visual information, how conscious visual perception works in humans, how to exploit visual perception for effective communication, and how artificial systems can do the same tasks. Vision science overlaps with or encompasses disciplines such as ophthalmology and optometry, neuroscience, psychology (particularly sensation and perception psychology, cognitive psychology, linguistics, biopsychology, psychophysics, and neuropsychology), physics (particularly optics), ethology, and computer science (particularly computer vision, artificial intelligence, and computer graphics), as well as other engineering-related areas such as data visualization, user interface design, and human factors and ergonomics.
Visual technology
Visual technology is the engineering discipline dealing with visual representation.
Voyeurism
Voyeurism is the sexual interest in or practice of spying on people engaged in intimate behaviors, such as undressing, sexual activity, or other actions usually considered to be of a private nature. The term comes from the French voir, which means "to see". A male voyeur is commonly labelled as "Peeping Tom" or a "Jags", a term which originates from the Lady Godiva legend. However, that term is usually applied to a male who observes somebody secretly and, generally, not in a public space.