Center for Neurobiology of Vision
858.453.4100 ext 1014 | office
Sergei is a vision scientist interested in foundations of perceptual psychology and sensory neuroscience. His current work is concentrated on the interface between two levels of visual perception: the entry process called early vision and the next stage called perceptual organization.
Sergei studies computational principles and biological mechanisms underlying these processes, in particular how visual information is organized for perception of motion and change.
Sergei also studies sensorimotor integration: the connection of perception and action. He is currently trying to understand how vision helps us plan actions prospectively, several steps ahead, taking into account the dynamic nature of the environment, its risks and uncertainties.
As a staff scientist and a principal investigator at the Salk Institute for Biological Studies, Sergei uses experimental and computational methods to characterize neuronal mechanisms of sensation, perception, and action.
Gepshtein S, Lesmes LA & Albright TD (2013). Sensory adaptation as optimal resource allocation. Proceedings of the National Academy of Sciences, USA 110 (11), 4368-4373.
Visual adaptation is expected to improve visual performance in the new environment. The expectation has been contradicted by evidence that adaptation sometimes decreases sensitivity for the adapting stimuli, and sometimes it changes sensitivity for stimuli very different from the adapting ones. We hypothesize that this pattern of results can be explained by a process that optimizes sensitivity for many stimuli, rather than changing sensitivity only for those stimuli whose statistics have changed. To test this hypothesis, we measured visual sensitivity across a broad range of spatiotemporal modulations of luminance, while varying the distribution of stimulus speeds. The manipulation of stimulus statistics caused a large-scale reorganization of visual sensitivity, forming the orderly pattern of sensitivity gains and losses. This pattern is predicted by a theory of distribution of receptive field characteristics in the visual system.
Jurica P, Gepshtein S, Tyukin I & van Leeuwen C (2013). Sensory optimization by stochastic tuning. Psychological Review 120 (4), 798-816. doi: 10.1037/a0034192.
Individually, visual neurons are each selective for several aspects of stimulation, such as stimulus location, frequency content, and speed. Collectively, the neurons implement the visual system's preferential sensitivity to some stimuli over others, manifested in behavioral sensitivity functions. We ask how the individual neurons are coordinated to optimize visual sensitivity. We model synaptic plasticity in a generic neural circuit, and find that stochastic changes in strengths of synaptic connections entail fluctuations in parameters of neural receptive fields. The fluctuations correlate with uncertainty of sensory measurement in individual neurons: the higher the uncertainty the larger the amplitude of fluctuation. We show that this simple relationship is sufficient for the stochastic fluctuations to steer sensitivities of neurons toward a characteristic distribution, from which follows a sensitivity function observed in human psychophysics, and which is predicted by a theory of optimal allocation of receptive fields. The optimal allocation arises in our simulations without supervision or feedback about system performance and independently of coupling between neurons, making the system highly adaptive and sensitive to prevailing stimulation.
Gepshtein S, Li X, Snider J, Plank M, Lee D & Poizner H (in press). Dopamine function and the efficiency of human movement. Journal of Cognitive Neuroscience.
To sustain successful behavior in dynamic environments, active organisms must be able to learn from the consequences of their actions and predict action outcomes. One of the most important discoveries in systems neuroscience over the last 15 years has been about the key role of the neurotransmitter dopamine in mediating such active behavior. Dopamine cell firing was found to encode differences between the expected and obtained outcomes of actions. Although activity of dopamine cells does not specify movements themselves, a recent study in humans has suggested that tonic levels of dopamine in the dorsal striatum may in part enable normal movement by encoding sensitivity to the energy cost of a movement, providing an implicit "motor motivational" signal for movement. We investigated the motivational hypothesis of dopamine by studying motor performance of Parkinson's disease (PD) patients who have marked dopamine depletion in the dorsal striatum, and compared their performance with that of elderly healthy adults. All subjects performed rapid sequential movements to visual targets associated with different risk and different energy costs, countered or assisted by gravity. In conditions of low energy cost, patients performed surprisingly well, similar to prescriptions of an ideal planner and healthy subjects. As energy costs increased, however, performance of PD patients dropped markedly below the prescriptions for action by an ideal planner, and below performance of healthy elderly subjects. The results indicate that the ability for efficient planning depends on the energy cost of action and that the effect of energy cost on action is mediated by dopamine.
Alexander DM, Jurica P, Trengove C, Nikolaev AR, Gepshtein S, [...] van Leeuwen C (2013). Traveling waves and trial averaging: the nature of single-trial and averaged brain responses in large-scale cortical signals. NeuroImage 73, 95-112.
Analyzing single trial brain activity remains a challenging problem in the neurosciences. We gain purchase on this problem by focusing on globally synchronous fields in within-trial evoked brain activity, rather than on localized peaks in the trial-averaged evoked response (ER). We analyzed data from three measurement modalities, each with different spatial resolution: magnetoencephalogram (MEG), electroencephalogram (EEG) and electrocorticogram (ECoG). We first characterized the ER in terms of summation of phase and amplitude components over trials. Both contributed to the ER, as expected, but the ER topography was dominated by the phase component. This means the ER topography is akin to an interference pattern in phase across trials. Hence the observed topography of cross-trial phase will not accurately reflect the phase topography within trials. To assess the organization of within-trial phase, traveling wave (TW) components were quantified by computing the phase gradient. TWs were intermittent but ubiquitous in the within-trial evoked brain activity. At most task-relevant times and frequencies, the within-trial phase topography was described better by a TW than by the trial-average of phase. The trial-average of the TW components also reproduced the topography of the ER; we suggest that the ER topography arises, in large part, as an average over TW behaviors. These findings were consistent across the three measurement modalities. We conclude that, while phase is critical to understanding the topography of event-related activity, the preliminary step of collating cortical signals across trials can obscure the TW components in brain activity and lead to an underestimation of the coherent motion of cortical fields.
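The idea of quantifying a traveling wave by its phase gradient can be illustrated with a minimal sketch. It assumes a one-dimensional row of evenly spaced sensors and a single snapshot of phase at one frequency; the function names, the sensor spacing, and the wavenumber are hypothetical and are not taken from the paper, whose actual procedure for MEG, EEG and ECoG arrays is not described here.

```python
import cmath

def wrapped_diff(a, b):
    """Signed angular difference b - a, wrapped into [-pi, pi]."""
    return cmath.phase(cmath.exp(1j * (b - a)))

def mean_phase_gradient(phases, spacing):
    """Average spatial phase gradient (radians per unit distance).

    phases  : phase at each sensor, in radians (may be wrapped)
    spacing : distance between neighboring sensors
    """
    diffs = [wrapped_diff(phases[i], phases[i + 1])
             for i in range(len(phases) - 1)]
    return sum(diffs) / (len(diffs) * spacing)

# Hypothetical plane wave: 0.8 rad/cm across 10 sensors spaced 1 cm apart.
k_true = 0.8
positions = [float(i) for i in range(10)]
# Phases are stored wrapped, as they would be after a Fourier or Hilbert analysis.
phases = [cmath.phase(cmath.exp(1j * k_true * x)) for x in positions]

gradient = mean_phase_gradient(phases, spacing=1.0)
```

A consistent nonzero gradient across sensors indicates a wave traveling along the array, whereas phases that are identical across sensors give a gradient near zero. Differences are wrapped before averaging so that phase jumps at the boundary of the wrapped range are not mistaken for steep gradients.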
Kubovy M, Epstein W & Gepshtein S (2013). Foundations of visual perception. In Healy AF & Proctor RW (Eds.) Experimental Psychology. Volume 4 in Weiner IB (Editor-in-Chief) Handbook of Psychology, 2d ed. John Wiley & Sons, New York, USA, 85-119.
This chapter contains three tutorial overviews of theoretical and methodological ideas that are important to students of visual perception. From the vast scope of the material we could have covered, we have chosen a small set of topics that form the foundations of vision research. To help fill the inevitable gaps, we have provided pointers to the literature, giving preference to works written at a level accessible to a beginning graduate student. First, we provide a sketch of the theoretical foundations of our field. We lay out four major research programs (in the past they might have been called "schools"), and then discuss how they address eight foundational questions that promise to occupy our discipline for many years to come. Second, we discuss psychophysics, which offers indispensable tools for the researcher. Here we lead the reader from the idea of threshold to the tools of signal detection theory. To illustrate our presentation of methodology, we have not focused on the classics that appear in much of the secondary literature. Rather, we have chosen recent research that showcases the current practice in the field, and the applicability of these methods to a wide range of problems. The contemporary view of perception maintains that perceptual theory requires an understanding of our environment as well as the perceiver. That is why, in the third section, we ask what are the regularities of the environment, how may they be discovered, and to what extent do perceivers use them. Here, too, we use recent research to exemplify this approach.
Nikolaev AR, Gepshtein S & van Leeuwen C (2013). Spontaneous EEG activity and biases in perception of supra-threshold stimuli. In Yamaguchi Y (Ed.) Advances in Cognitive Neurodynamics III, Springer Science & Business Media, Dordrecht, pp 289-295. doi: 10.1007/978-94-007-4792-0_39.
Human perception of [ambiguous] visual stimuli is biased: some orientations are seen more often than others. We studied how the orientation bias is represented in the electrical brain activity that preceded presentation of ambiguous supra-threshold visual stimuli. We examined scalp EEG over the parieto-occipital regions during 1 sec before stimulus presentation. The alpha activity of pre-stimulus EEG was associated with the orientation bias: the preference for vertical orientation in most observers corresponded to low pre-stimulus alpha power. The results indicate that the orientation bias is encoded in intrinsic properties of ongoing cortical dynamics, forming spontaneous orientation-specific patterns of activity.
Vidal-Naquet M & Gepshtein S (2012). Spatially invariant computations in stereoscopic vision. Frontiers in Computational Neuroscience 6 (47), doi: 10.3389/fncom.2012.00047.
Perception of stereoscopic depth requires that visual systems solve a correspondence problem: find parts of the left-eye view of the visual scene that correspond to parts of the right-eye view. The standard model of binocular matching implies that similarity of left and right images is computed by inter-ocular correlation. But the left and right images of the same object are normally distorted relative to one another by the binocular projection, in particular when slanted surfaces are viewed from close distance. Correlation often fails to detect correct correspondences between such image parts. We investigate a measure of inter-ocular similarity that takes advantage of spatially invariant computations similar to the computations performed by complex cells in biological visual systems. This measure tolerates distortions of corresponding image parts and yields excellent performance over a much larger range of surface slants than the standard model. The results suggest that, rather than serving as disparity detectors, multiple binocular complex cells take part in the computation of inter-ocular similarity, and that visual systems are likely to postpone commitment to particular binocular disparities until later stages in the visual process.
Wagemans J, Feldman J, Gepshtein S, Kimchi R, Pomerantz JR, van der Helm PA & van Leeuwen C (2012). A century of Gestalt psychology in visual perception: II. Conceptual and theoretical foundations. Psychological Bulletin 138 (6), 1218-1252.
[Further progress in studies of perceptual grouping and figure-ground organization requires a reconsideration of the conceptual and theoretical foundations of the Gestalt approach. Here, we] review contemporary formulations of [perceptual organization] within an information-processing framework, allowing for operational definitions [...] and a refined understanding of its psychological implications [...]. We also review four lines of theoretical progress regarding the law of Praegnanz: the brain's tendency of being attracted towards states corresponding to the simplest possible organization, given the available stimulation. The first considers the brain as a complex adaptive system and explains how self-organization solves the conundrum of trading between robustness and flexibility of perceptual states. The second specifies the economy principle in terms of optimization of neural resources, showing that elementary sensors working independently to minimize uncertainty can respond optimally at the system level. The third considers how Gestalt percepts (e.g., groups, objects) are optimal given the available stimulation, with optimality specified in Bayesian terms. Fourth, Structural Information Theory explains how a Gestaltist visual system that focuses on internal coding efficiency yields external veridicality as a side-effect. To answer the fundamental question of why things look as they do, a further synthesis of these complementary perspectives is required.
Plomp G, van Leeuwen C & Gepshtein S (2012). Perception of time in articulated visual events. Frontiers in Psychology 3 (564), doi: 10.3389/fpsyg.2012.00564.
Perceived duration of a sensory event often exceeds its actual duration. This phenomenon is called time dilation. The distortion may occur because sensory systems are optimized for perception within their respective modalities and not for perception of time. We investigated how the dilation of visual events depends on the duration and content of events. Observers compared the durations of two successive visual stimuli while the luminance of one of the stimuli was modulated at different temporal frequencies. Time dilation correlated with the frequency of modulation and the duration of the stimulus: the faster the modulation and the longer the stimulus duration, the larger the dilation. Notably, time dilation was also accompanied by a decreased sensitivity to stimulus duration. We show that these results are consistent with the notion that stimulus duration is estimated using measurement intervals of the lengths that depend on stimulus frequency content. Estimation of temporal frequency content is more precise using longer measurement intervals, whereas estimation of temporal location is more precise using shorter ones. As a result, visual perception will benefit from using longer intervals when the stimulus is modulated so that its frequency content is measured more precisely. A side effect of using longer temporal intervals is a larger uncertainty about the timing of stimulus offset (temporal location), ensuing time dilation and the reduction of sensitivity to duration. Our findings support the view that time dilation follows from basic principles of measurement and from the notion that visual systems are optimized for visual perception rather than for perception of time.
Nikolaev AR, Gepshtein S, Gong P & van Leeuwen C (2010). Duration of coherence intervals in electrical brain activity in perceptual organization. Cerebral Cortex 20 (2), 365-382.
Gepshtein S (2010). Two psychologies of perception and the prospect of their synthesis. Philosophical Psychology 23 (2), 217-281.
Two traditions have had a great impact on the theoretical and experimental research of perception. One tradition is statistical, stretching from Fechner's enunciation of psychophysics in 1860 to the modern view of perception as statistical decision making. The other tradition is phenomenological, from Brentano's "empirical standpoint" of 1874 to the Gestalt movement and the modern work on perceptual organization. Each tradition has at its core a distinctive assumption about the indivisible constituents of perception: the just-noticeable differences of sensation in the tradition of Fechner vs. the phenomenological Gestalts in the tradition of Brentano. But some key results from the two traditions can be explained and connected using an approach that is neither statistical nor phenomenological. This approach rests on a basic property of any information exchange: a principle of measurement formulated in 1946 by Gabor as a part of his quantal theory of information. Here the indivisible components are units (quanta) of information that remain invariant under changes of precision of measurement. This approach helped to understand how sensory measurements are implemented by single neural cells. But recent analyses suggest that this approach has the power to explain larger-scale characteristics of sensory systems.
Nikolaev AR, Gepshtein S, Kubovy M & van Leeuwen C (2008). Dissociation of early evoked cortical activity in perceptual grouping. Experimental Brain Research 186 (1), 107-122.
Perceptual grouping is a multi-stage process, irreducible to a single mechanism localized anatomically or chronometrically. To understand how various grouping mechanisms interact, we combined a phenomenological report paradigm with high-density event-related potential (ERP) measurements, using a 256-channel electrode array. We varied the relative salience of competing perceptual organizations in multi-stable dot lattices and asked observers to report perceived groupings. The ability to discriminate groupings (the grouping sensitivity) was positively correlated with the amplitude of the earliest ERP peak C1 (about 60 ms after stimulus onset) over the middle occipital area. This early activity is believed to reflect spontaneous feed-forward processes preceding perceptual awareness. Grouping sensitivity was negatively correlated with the amplitude of the next peak P1 (about 110 ms), which is believed to reflect lateral and feedback interactions associated with perceptual awareness and attention. This dissociation between C1 and P1 activity implies that the recruitment of fast, spontaneous mechanisms for grouping leads to high grouping sensitivity. Observers who fail to recruit these mechanisms are trying to compensate by using later mechanisms, which depend less on stimulus properties such as proximity.
Gepshtein S (2008). Closing the gap between ideal and real behavior: Scientific vs. engineering approaches to normativity. Philosophical Psychology 22 (1), 61-75.
Early normative studies of human behavior revealed a gap between the norms of practical rationality (what humans ought to do) and the actual human behavior (what they do). It has been suggested that, to close the gap between the descriptive and the normative, one has to revise norms of practical rationality according to the Quinean, engineering view of normativity. On this view, the norms must be designed such that they effectively account for behavior. I review recent studies of human perception which pursued normative modeling and which found good agreement between the normative prescriptions and the actual behavior. I make the case that the goals and methods of this work have been incompatible with those of the engineering approach. I argue that norms of perception and action are observer-independent properties of biological agents; the norms are discovered using methods of natural science rather than designed to fit the observed behavior.
Gepshtein S & Kubovy M (2007). The lawful perception of apparent motion. Journal of Vision 7 (8):9, 1-15.
Gepshtein S, Tyukin I & Kubovy M (2007). The economics of motion perception and invariants of visual sensitivity. Journal of Vision 7 (8):8, 1-18.
Gepshtein S, Seydell A & Trommershäuser J (2007). Optimality of human movement under natural variations of visual-motor uncertainty. Journal of Vision 7 (5):13, 1-18. [Supplementary Materials]
Trommershäuser J, Gepshtein S, Maloney LT, Landy MS & Banks MS (2005). Optimal compensation for changes in task relevant movement variability. Journal of Neuroscience 25 (31), 7169-7178.
Gepshtein S, Burge J, Ernst M & Banks MS (2005). The combination of vision and touch depends on spatial proximity. Journal of Vision 5 (11):7, 1013-1023.
Gepshtein S & Kubovy M (2005). Stability and change in perception: Spatial organization in temporal context. Experimental Brain Research 160 (4), 487-495.
Reviewed in Bruno N (2005). Unifying sequential effects in perceptual grouping. Trends in Cognitive Sciences 9 (1), 1-3.
Banks MS, Gepshtein S & Landy MS (2004). Why is spatial stereoresolution so low? Journal of Neuroscience 24 (9), 2077-2089.
Vision and haptics have different limitations and advantages because they obtain information by different methods. If the brain combined information from the two senses optimally, it would rely more on the one providing more precise information for the current task. In this study, human observers judged the distance between two parallel surfaces in two within-modality experiments (vision-alone and haptics-alone) and in an intermodality experiment (vision and haptics together). We find that the combined size estimates are finer than is possible with either vision or haptics alone. Indeed, the combined estimates approach statistical optimality. [Gepshtein S & Banks MS (2003). Viewing geometry determines how vision and touch combine in size perception. Current Biology 13 (6), 483-488.]
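The notion of statistically optimal combination can be made concrete with a small sketch. It assumes the standard model in which the two estimates are independent, unbiased and Gaussian, so that the optimal combination weights each estimate by its inverse variance. This model, the function name, and the numbers below are illustrative assumptions and are not taken from the study itself.

```python
def combine_cues(est_v, var_v, est_h, var_h):
    """Inverse-variance weighted combination of two independent estimates.

    est_v, var_v : visual estimate and its variance
    est_h, var_h : haptic estimate and its variance
    Returns (combined estimate, combined variance, weight given to vision).
    """
    w_v = (1.0 / var_v) / (1.0 / var_v + 1.0 / var_h)
    w_h = 1.0 - w_v
    combined = w_v * est_v + w_h * est_h
    # The combined variance is never larger than the smaller input variance.
    combined_var = (var_v * var_h) / (var_v + var_h)
    return combined, combined_var, w_v

# Hypothetical values (arbitrary units): vision is the more precise sense here.
size, variance, w_vision = combine_cues(est_v=50.0, var_v=1.0,
                                        est_h=54.0, var_h=4.0)
```

With these made-up numbers the more precise sense (vision) receives the larger weight, and the variance of the combined estimate falls below the variance of either sense alone, which is the sense in which combined estimates can be "finer" than single-modality estimates.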
Kubovy M & Gepshtein S (2003). Grouping in Space and in Space-Time: An Exercise in Phenomenological Psychophysics. In: Behrmann M, Kimchi R & Olson C (Eds.) Perceptual Organization in Vision: Behavioral and Neural Perspectives. Lawrence Erlbaum Associates, Mahwah, NJ, 45-85.
We show that grouping by proximity can be modeled with a simple model that has few of the characteristics that one might expect of a Gestalt phenomenon.
We do phenomenological psychophysics. Because the observers' responses are based on phenomenal experiences, which are still in bad repute among psychologists, we conclude with an explication of the roots of such skeptical views, and show that they have limited validity.
It is natural to think that in perceiving dynamic scenes, vision takes a series of snapshots. Motion perception can ensue when the snapshots are different. The snapshot metaphor suggests two questions: (i) How does the visual system put together elements within each snapshot to form objects? This is the spatial grouping problem. (ii) When the snapshots are different, how does the visual system know which element in one snapshot corresponds to which element in the next? This is the temporal grouping problem. The snapshot metaphor is a caricature of the dominant model in the field (the sequential model) according to which spatial and temporal grouping are independent. The model we propose here is an interactive model, according to which the two grouping mechanisms are not separable. [Gepshtein S & Kubovy M (2000). The emergence of visual objects in space-time. Proceedings of the National Academy of Sciences, USA 97 (14), 8186-8191.]
Prospective optimization under risk
We studied how humans optimize action over multiple future steps in dynamic risky environments. We measured how rapidly healthy adult subjects could recompute the future course of action as new information gradually entered the scope of foreseeable action.
We found that the scope of the future over which our subjects computed future actions was flexible. The scope of computation increased as the task difficulty decreased. But this flexibility had a cost: the larger the scope of computation, the lower the ability to use immediate information. This cost arose even though the subjects used all the available information: our analyses showed that the subjects did not use such heuristics as only seeking the large-gain steps or only avoiding the small-gain steps. Instead, our findings revealed a sophisticated strategy of prospective optimization that allocates the limited computational resources so as to take advantage of all the information at hand and to balance the immediate and delayed rewards.
Lee D, Snider J, Poizner H and Gepshtein S | early report: SfN 2012
Visual perception is adaptive: it depends on the previous visual stimulation. The adaptive change is mediated by synaptic plasticity of individual neural cells whose behavior is stochastic. Here we show how the stochastic activity of individual cells leads to stochastic updating of their tuning, and how these changes are sufficient to explain some previously puzzling results from behavioral and physiological studies of visual perception.
Gepshtein S, Jurica P, Tyukin I, van Leeuwen C and Albright TD | early report: SfN 2012
The spatiotemporal contrast sensitivity function (the 'Kelly function') provides a broad summary of human visual sensitivity used in basic and clinical studies of vision. We ask which features of the Kelly function are invariant across tasks and measurement procedures. Knowing which aspects of sensitivity are invariant facilitates the comprehensive assessment of contrast sensitivity and the changes of sensitivity caused by adaptation or disease.
We isolate those aspects of the Kelly function that do not vary across observers, tasks, and experimental procedures. This allows us to advance specific prescriptions for efficient estimation of sensitivity. In particular, we propose that the method of estimation ought to incorporate measurement of the width (or the slope) of the underlying psychometric functions, or no assumption should be made about the width.
Laddis PA, Lesmes LA, Gepshtein S and Albright TD
The spatiotemporal contrast sensitivity function describes visual sensitivity to moving or flickering gratings across the entire range of visible spatial and temporal frequencies of luminance modulation. In spite of its value for assessment of spatiotemporal vision, the long testing time required for assaying the entire function has often forced researchers to confine measurements to representative sections of spatiotemporal sensitivity: spatial, sampled at a fixed temporal frequency, or temporal, sampled at a fixed spatial frequency. Here we present a novel adaptive method that accelerates the measurement by using Bayesian adaptive inference from the information gained from multiple sections of the spatiotemporal function. The new procedure evaluates the expected gain of information about parameters of sensitivity within every section and selects the stimulus that maximizes the expected gain across several sections. We validated the new procedure in computational and psychophysical experiments. In a direction discrimination task, we used drifting grating stimuli that spanned a broad range of spatial (0.5-8 cycles/deg) and temporal frequencies (0.25-24 Hz) of luminance modulation. Within 300-500 trials (15-25 minutes of run time) the new procedure provided estimates of sensitivity at the accuracy of 10% and the precision of 0.2-0.3 decimal log units.
Lesmes LA, Gepshtein S, Lu Z-L and Albright TD
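The selection of a stimulus by expected information gain, described above, can be sketched in a much reduced form. The sketch assumes a single unknown parameter (a log contrast threshold on a grid), a logistic psychometric function with fixed slope, guess rate and lapse rate, and entropy as the measure of uncertainty. All of these choices and all names are hypothetical; the published procedure pools information across several sections of the spatiotemporal function, and that step is not reproduced here.

```python
import math

def p_correct(log_contrast, log_threshold, slope=4.0, guess=0.5, lapse=0.02):
    """Hypothetical psychometric function for a two-alternative task."""
    core = 1.0 / (1.0 + math.exp(-slope * (log_contrast - log_threshold)))
    return guess + (1.0 - guess - lapse) * core

def entropy(probs):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def update(prior, thresholds, stimulus, correct):
    """Posterior over the threshold grid after one trial (Bayes rule)."""
    like = [p_correct(stimulus, t) if correct else 1.0 - p_correct(stimulus, t)
            for t in thresholds]
    post = [l * p for l, p in zip(like, prior)]
    total = sum(post)
    return [p / total for p in post]

def expected_gain(prior, thresholds, stimulus):
    """Expected reduction of entropy if this stimulus were shown next."""
    p_yes = sum(p * p_correct(stimulus, t) for p, t in zip(prior, thresholds))
    h_yes = entropy(update(prior, thresholds, stimulus, True))
    h_no = entropy(update(prior, thresholds, stimulus, False))
    return entropy(prior) - (p_yes * h_yes + (1.0 - p_yes) * h_no)

def select_stimulus(prior, thresholds, candidates):
    """Pick the candidate stimulus with the largest expected gain."""
    return max(candidates, key=lambda s: expected_gain(prior, thresholds, s))

# Hypothetical grids: log10 threshold from -2 to 0, candidate log10 contrasts.
thresholds = [-2.0 + 0.1 * i for i in range(21)]
prior = [1.0 / len(thresholds)] * len(thresholds)
candidates = [-2.5 + 0.25 * i for i in range(13)]

best = select_stimulus(prior, thresholds, candidates)
posterior = update(prior, thresholds, best, correct=True)
```

In a full experiment the two steps (select, then update with the observer's response) would repeat on every trial, so that each stimulus is placed where it is expected to be most informative given everything observed so far.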
We study the role of spontaneous cortical activity in perceptual learning. We find that the pre-stimulus cortical activity in the alpha band reflects a process that helps to disambiguate perception.
We measured the electrical brain activity preceding ambiguous visual stimuli: dot lattices, in which the dots are seen to group along one or several orientations depending on dot proximity. Perceptual reports on every trial depended on two factors: participants' sensitivity to dot proximity and their intrinsic bias for the orientation of perceptual grouping. The effect of intrinsic bias changed during the experiment. As participants learned the task, the initially prominent role of intrinsic bias decreased and sensitivity to dot proximity increased, giving way to the well-known association between pre-stimulus alpha phase and visual sensitivity.
For as long as the role of intrinsic bias was prominent, we observed an intermittent regime of alpha activity, in which a mode of low amplitude and low temporal variability alternated with a mode of high amplitude and high temporal variability. The latter mode was associated with the unbiased responses whereas the former mode was associated with the intrinsic orientation bias. We propose that the intermittent alpha activity is a mechanism that helps to resolve perceptual ambiguity, mediating flexible application of internal representations, thus compensating for a lack of stimulus information.
Nikolaev AR, Gepshtein S and van Leeuwen C | early report: ECVP 2012
The symposium organized by Sergei Gepshtein (Salk Institute) and Alex McDowell (USC) celebrated the rapidly growing interaction between two communities: researchers engaged in the scientific study of human perception and action and the practitioners of interactive and immersive narrative media technologies. Leading scientists and artists discussed human behavior and conscious experience in face of physical, social, and imagined realities represented in purely virtual worlds, as well as in the 'mixed' worlds that interlace the physical and virtual realities. The symposium consisted of a series of sessions each featuring two speakers: a scientist and an artist or immersive-reality practitioner. The speakers first presented their approaches and then reviewed the existing and prospective links between their domains of expertise. Each session was followed by an extensive discussion. An on-line publication featuring footage from the event is forthcoming.
Gepshtein S and McDowell A | preview at the 5D Institute
Until very recently, research on perceptual organization has been primarily descriptive. The result was a taxonomy of phenomena with little attempt to identify underlying mechanisms or develop predictive models. The situation has changed in recent years. New experimental methods have been introduced to measure the organizational processes in vision and other sensory modalities, and new predictive computational theories have been developed. This Handbook is an organized survey of the many new approaches to the study of perceptual (mainly visual) organization with an emphasis on computational and mathematical approaches. With chapters written by leading authorities, the Handbook describes modern experimental and computational methods that not only contribute to deciphering the mechanisms of the classical phenomena of perceptual organization but also open new perspectives in what is sometimes called the neo-Gestalt approach to perception. The intended audience includes researchers in psychology, neural science, computer science, and philosophy as well as graduate and advanced undergraduate students in these fields.
Gepshtein S, Singh M, and Maloney LT
The visual system as economist: neural resource allocation in visual adaptation
Medical Xpress | April 1, 2013
Despite what you may think, your brain is a mathematical genius
ScienceNewsline | April 10, 2013
Brain waves challenge area-specific view of brain activity
KU Leuven | March 20, 2013
October 25, 2013