Jack Gallant, Helen Wills Neuroscience Institute, UC Berkeley
Form vision is mediated by several hierarchically organized cortical areas spanning striate to inferior temporal cortices. We have focused on one intermediate visual area, V4, that plays an important role in form vision. We designed an experiment that permitted us to investigate simultaneously how V4 neurons represent form information, how this representation is modulated by attention, and what role V4 plays in decision and memory processes. (1) Objective characterization of nonlinear spatial receptive fields confirms earlier suggestions that V4 neurons represent curvature and other complex features present in natural scenes. (2) Both spatial and feature attention alter the spatial tuning of V4 neurons, presumably to increase perceptual discrimination. (3) V4 neurons appear to play a role in working memory and decision processes that were formerly ascribed to inferior temporal and frontal cortex. These findings challenge the common view that V4 is simply a passive information relay between striate and inferior temporal cortex.
Eero Simoncelli, Center for Neural Science, NYU
I'll describe some recent results on the characterization of neural responses using stochastic stimuli. The ingredients of the problem are a distribution over input stimuli, a model (or class of models) describing the input-output relationship, and a method of estimating the model parameters from the responses (spike trains) of a neuron. I'll start by describing the classical reverse-correlation solution, and then generalize to two new methods that can characterize a much broader and potentially more realistic set of models.
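As a concrete illustration of the classical reverse-correlation solution, here is a minimal sketch (not the speaker's implementation), assuming a simulated linear-nonlinear neuron probed with Gaussian white noise; the filter, nonlinearity, and rates are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
T, D = 50_000, 20                         # time bins, stimulus dimensions
stim = rng.standard_normal((T, D))        # Gaussian white-noise stimulus

k_true = np.exp(-0.5 * (np.arange(D) - 10.0) ** 2 / 4.0)   # hidden filter
rate = np.maximum(stim @ k_true, 0.0)                      # rectifying output
spikes = rng.poisson(0.1 * rate)                           # Poisson spiking

# Reverse correlation: average the stimuli, weighted by spike count.
sta = spikes @ stim / spikes.sum()

print(np.corrcoef(sta, k_true)[0, 1])     # close to 1: filter recovered
```

For white-noise inputs the spike-triggered average recovers the linear filter up to a scale factor; the broader model classes the talk describes require richer estimators.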
Janneke Jehee, University of Amsterdam
Recent theories propose that feedforward processing enables rapid and automatic object categorization, but incorporates only a limited amount of detail. Feedback processing, by contrast, highlights the detailed and complete representation in early visual areas, and thereby provides spatial detail. To test this hypothesis, we separated the contributions of feedforward and feedback signals to the selectivity of cortical neurons in a simulation modeled after the hierarchical feedforward-feedback organization of cortical areas. In my talk, I will consider the following issues:
- In recurrent models, the higher areas 'reach back' to lower areas to find high-resolution information that was not provided by the feedforward sweep. Given the absence of direct connections between low-level areas such as V1 and motor areas, this implies that the only way to report these details would be via the higher areas. I will show that, as a result of feedback interactions with lower-level areas, these high-level cells ultimately do express high-resolution information in their firing patterns.
- A shape can be more difficult to identify when other shapes are near it, a phenomenon known as crowding. If top-down feedback interactions recover spatial detail, this suggests that they fail to do so for individual items under such conditions. Paradoxically, however, spatial detail that is lost in crowding can nevertheless evoke specific adaptation effects. Our simulation captures both the loss of spatial detail under crowding, and the paradoxical adaptation to these details.
Fred Fitzke, Institute of Ophthalmology, University College London
Quantitative imaging of the retina in the living eye has recently seen significant advances. Autofluorescence imaging, arising from the "age pigment" lipofuscin in the retinal pigment epithelium (RPE), provides an index of the intimate metabolic support between the photoreceptors and the RPE. In patients with known genetic mutations that lead to abnormalities in fundus autofluorescence, and in age-related macular degeneration, the spatial distribution of autofluorescence abnormalities and their change over time can be characterised in the living human eye. One of the major factors limiting cellular imaging of the retina in patients is the optics of the eye, and this is expected to be overcome by developments in adaptive optics. A major recent advance has been the in vivo imaging of apoptosis. For the first time, individual retinal ganglion cells undergoing apoptosis have been seen using annexin labelling in the living rat and primate eye. We have observed apoptosis in the same eye over periods of weeks and months and have been able to demonstrate the effects of modulators of apoptosis. These new developments in optical imaging open up new possibilities for understanding the basis of blinding conditions by imaging the living eye.
Bruno Averbeck, Center for Visual Science, University of Rochester
I will begin by discussing recent work in which we have developed techniques for assessing the effect of correlated patterns of activity across populations of neurons, which we call noise correlations, on the encoding and decoding of behavioral variables. Much past work in this area was based upon phenomenological examples, which suggested that correlations changed depending upon the configuration of visual stimuli. We will show that the presence of correlations does not necessarily imply that they carry information about stimuli. Furthermore, the impact of correlations differs depending upon whether one is measuring encoding or decoding.
In the second part of the talk I will present results from our recent neurophysiology experiment, in which monkeys were required to learn sequences of binary decisions. We recorded neural activity in small ensembles of prefrontal cortex neurons while the animals carried out this task, and applied decoding methods to the neural activity recorded while the animals learned the sequences. We found that as the animals learned the sequence for a particular block, evidenced by a decrease in incorrect decisions, their confidence in their decisions increased, evidenced by an increase in the posterior probability of the decision as predicted by the decoding analysis. We were also able to show that monkeys learned more in correct trials than in error trials. This suggests that neural activity in prefrontal cortex reflects not only actions, but also the belief that the chosen action is correct.
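To make the decoding logic concrete, here is a toy sketch (not the authors' analysis) of how a Gaussian decoder assigns a posterior probability to a binary decision, which can be read as confidence; all response statistics below are invented.

```python
import numpy as np

rng = np.random.default_rng(1)
n_cells = 8
mu_a = rng.normal(5.0, 1.0, n_cells)          # template for decision A
mu_b = mu_a + rng.normal(0.0, 0.5, n_cells)   # overlapping template for B

def posterior_a(r, sigma=1.0, prior_a=0.5):
    """P(decision A | population response r), equal-variance Gaussian model."""
    log_la = -0.5 * np.sum((r - mu_a) ** 2) / sigma**2
    log_lb = -0.5 * np.sum((r - mu_b) ** 2) / sigma**2
    la = np.exp(log_la) * prior_a
    lb = np.exp(log_lb) * (1.0 - prior_a)
    return la / (la + lb)

# As activity drifts from ambiguous (alpha = 0.5) toward the A template
# (alpha = 1), the decoded posterior (read as confidence) rises.
for alpha in (0.5, 0.7, 0.9, 1.0):
    r = mu_b + alpha * (mu_a - mu_b)
    print(alpha, round(posterior_a(r), 3))
```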
David Ferster, Neurobiology and Physiology, Northwestern University
One of the central principles of sensory processing is lateral inhibition. Inhibition between sensory neurons with slightly different preferred stimuli is thought to sharpen selectivity. In the retina, for example, lateral inhibition may sharpen the spatial selectivity of adjacent ganglion cells.
In the visual cortex, a widely proposed form of lateral inhibition is cross-orientation inhibition: neurons with different preferred orientations are thought to inhibit one another and thereby sharpen orientation tuning. Indeed, stimuli of the null orientation suppress the responses of cortical neurons to stimuli of the preferred orientation. We have found, however, that this suppression does not arise from intracortical inhibition, but instead originates in the neurons of the LGN. Response rectification and contrast saturation in the LGN--amplified by spike threshold in cortical cells--can completely account for cross-orientation suppression.
From this and other experiments, it seems that lateral inhibition in the orientation or direction domain is not present in the visual cortex. Although the excitatory input to cortical cells is rather weakly tuned for many visual features, the tuning of the cells' output is sharpened not by lateral inhibition, but by the nonspecific effects of threshold--that is, by the iceberg effect.
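The threshold ("iceberg") account lends itself to a one-screen illustration. The following toy sketch, with made-up tuning width and threshold, shows broadly tuned input yielding sharply tuned spiking output with no inhibition anywhere in the model.

```python
import numpy as np

theta = np.linspace(-90, 90, 181)               # stimulus orientation (deg)
vm = np.exp(-0.5 * (theta / 40.0) ** 2)         # broad synaptic input tuning
rate = np.maximum(vm - 0.7, 0.0)                # threshold-linear spiking

def halfwidth(y):
    """Half-width at half-height, in degrees."""
    half = y.max() / 2
    return theta[y >= half].ptp() / 2

print(f"input half-width:  {halfwidth(vm):.0f} deg")    # ~47 deg
print(f"output half-width: {halfwidth(rate):.0f} deg")  # ~23 deg
```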
Peter Sterling, Laboratory of Retinal Microcircuitry, University of Pennsylvania
Synapses release transmitter stochastically, so an input signal relayed across many levels might gradually be degraded by synaptic noise. The talk will show, to the contrary, that the retina's sensitivity to a small, brief contrast is preserved all the way through the brain to the level of psychophysical detection--except at the ganglion cell, where the conversion from graded potential to spikes causes a twofold loss.
Partha Mitra, Cold Spring Harbor Laboratory
(co-sponsored by Computer Science)
The dynamic state of the brain is of basic significance: falling asleep or waking up, for example, corresponds to greatly different patterns of electrical activity in the same anatomical substrate. These states are easily distinguished in EEG or MEG spectra, which indicates that understanding the (correlated) dynamics of neurons on the scale of 10-100 milliseconds (corresponding to 10-100 Hz) is probably central to understanding how the brain works. We have, however, been severely limited in the past in probing these dynamics. EEG and MEG have rather poor spatial resolution; PET and fMRI have little temporal resolution at these fast time scales; and single-electrode recordings have focused on receptive-field mapping in one form or another, revealing only static topography.
Only recently have we been able to obtain the kind of data that will help us understand how the correlated dynamics of many neurons give rise to our thoughts and feelings. The enabling technological advances include multichannel recordings from microelectrodes implanted in behaving animals, as well as cheap storage and processing power. There is now somewhat of a glut of data and we are becoming limited by our ability to quantify and analyse the multivariate time series that are being fairly routinely gathered in multiple laboratories.
In this talk, we will discuss questions related to the analysis of neural time series data. This will include conceptual issues (what are the right questions to ask?), and technical issues related to quantification of neural dynamics.
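As one concrete example of such quantification, a multitaper spectral estimate is a standard starting point for neural time series. The sketch below assumes NumPy/SciPy and a toy 40 Hz oscillation; the sampling rate, time-bandwidth product, and taper count are illustrative choices.

```python
import numpy as np
from scipy.signal.windows import dpss

fs = 1000.0                                # sampling rate (Hz)
t = np.arange(0, 10, 1 / fs)
rng = np.random.default_rng(2)
lfp = np.sin(2 * np.pi * 40 * t) + rng.standard_normal(t.size)  # toy LFP

NW, K = 4, 7                               # time-bandwidth product, # tapers
tapers = dpss(t.size, NW, Kmax=K)          # (K, T) Slepian tapers
spectra = np.abs(np.fft.rfft(tapers * lfp, axis=-1)) ** 2
psd = spectra.mean(axis=0) / fs            # average over tapers

freqs = np.fft.rfftfreq(t.size, 1 / fs)
print(f"peak near {freqs[np.argmax(psd)]:.1f} Hz")   # ~40 Hz
```

Averaging over orthogonal tapers trades a little frequency resolution for much lower variance than a single-taper periodogram, which matters for short, noisy neural recordings.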
Maggie Shiffrar, Psychology, Rutgers University
(co-sponsored by Brain & Cognitive Sciences)
As inherently social animals, humans must accurately perceive and interpret the movements of other people. What processes underlie this perceptual ability? Psychophysical evidence suggests that the visual analysis of human motion can occur over greater spatial and temporal extents than other visual motion analyses. Thus, under some conditions, the visual analysis of human movement differs from other visual motion analyses. Why might this be so? Human action diverges from other categories of visual motion in at least three ways. First, human movement is the only motion that human observers can both produce and perceive. Second, human motion is usually the most frequently occurring category of visual motion. Third, human motion carries more social information than other motion stimuli. In a series of behavioral and brain imaging studies, the potential contributions of these three factors--motor system input, perceptual learning, and social processes--to the visual analysis of human movement are considered.
Jelena Jovancevic, Center for Visual Science, University of Rochester
Mary Peterson, Psychology, University of Arizona
An assumption that long served as a foundation for research in visual perception and cognition is that figure-ground segregation precedes access to shape memories, including traces of configured portions of shapes. Contrary to this assumption, experiments measuring perception both implicitly and explicitly show that past experience with particular shapes is among the cues that determine figure assignment, and that these effects can be evident following a single exposure to a novel shape. These results are best understood within a competitive model in which past experience is expressed for portions of configured edges.
Peter Latham, Visiting Scholar, Gatsby Computational Neuroscience Unit, London UK
One of the main goals of neural coding is to understand how spike trains encode information. For sensory processing, which we will concentrate on here, "understand" is synonymous with "able to translate spike trains into the stimuli that produced them". One of the biggest potential obstacles to this is the existence of correlations--the fact that, for a given stimulus, spikes both within and across neurons are not independent. (The most common example of correlations is synchrony, which produces sharp peaks in cross-correlograms.) Correlations are an obstacle because they make the transformation from stimulus to response high-dimensional, and thus essentially impossible to estimate from data.
There is, however, mounting evidence in mammalian retina, somatosensory cortex, supplementary motor area, and visual cortex, that correlations among pairs of neurons can be ignored without much loss of information. If this result were to extend beyond pairs, to populations, it would greatly simplify the problem of decoding spike trains. Here we discuss two general approaches for assessing the role of correlations, one based on decoding and the other on information theory. We then show that a recently proposed information-theoretic cost function provides an upper bound on the information lost when correlations are ignored. This last result provides us with a code-independent measure that can be used to evaluate the importance of correlations for transmitting information.
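A toy version of the decoding-based approach (not the speaker's analyses or data): decode two stimuli from a correlated pair of Gaussian responses, once with the true covariance and once with correlations ignored. The gap between the two accuracies is one operational measure of what correlations contribute; all means and covariances below are invented.

```python
import numpy as np

rng = np.random.default_rng(3)
mu = {0: np.array([5.0, 5.0]), 1: np.array([6.0, 5.2])}   # mean responses
cov = np.array([[1.0, 0.6], [0.6, 1.0]])                  # correlated noise
icov_full = np.linalg.inv(cov)
icov_diag = np.linalg.inv(np.diag(np.diag(cov)))          # correlation-blind

n = 20_000
s = rng.integers(2, size=n)
r = np.where(s[:, None] == 0, mu[0], mu[1]) \
    + rng.multivariate_normal(np.zeros(2), cov, size=n)

def accuracy(icov):
    """Fraction correct for a Gaussian decoder with covariance model icov."""
    d0 = np.einsum('ni,ij,nj->n', r - mu[0], icov, r - mu[0])
    d1 = np.einsum('ni,ij,nj->n', r - mu[1], icov, r - mu[1])
    return np.mean((d1 < d0).astype(int) == s)

print(accuracy(icov_full), accuracy(icov_diag))  # gap = cost of ignoring
                                                 # the correlations
```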
Jacob Feldman, Psychology, Rutgers University
In this talk I'll review a series of recent ideas and findings in the mental representation of shape. First, I'll present a framework that puts onto a mathematical footing Attneave's famous observation that information about shape concentrates in regions of high contour curvature, and integrates it with the more recent recognition of the special status of negatively curved (concave) contour regions. Concave regions literally contain more information, in Shannon's sense, than do equally curved convex regions. This difference is manifest in a series of change-detection studies in which subjects were much better able to detect shape changes when they occurred in concave regions---even when those changes didn't affect part structure. But part structure is critical in shape representation. In other studies we show an objective psychophysical correlate of the subjective distinction between shape parts: subjects are slower to compare flashed probes when they appear on opposite sides of a concavity (i.e. an apparent part boundary) than when they appear on the same part. But this effect disappears when the concavity is not perceived as a part boundary, as happens with globally bending "snake"-like objects. This points to the importance of the subjective "shape skeleton" as a map to part structure. But traditional methods for computing the medial axis skeleton notoriously correlate poorly with apparent part structure, for example often containing spurious "noise branches." I'll present a new Bayesian framework for estimating the shape skeleton: we adopt a generative (likelihood) model in which contours are generated from skeletons with orthogonal "ribs" exhibiting some random error, together with a prior penalizing skeletal complexity, and then estimate the skeleton that maximizes the posterior probability of the observed contour. I'll show examples suggesting that this procedure gives more intuitive results than traditional medial axis methods, making it more suitable as a basis for psychological shape representation and part decomposition.
Elements of this work are joint with Manish Singh, Elan Barenholtz, and Elias Cohen.
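A drastically simplified sketch of the MAP idea in this framework (the shape, error model, and penalty are all invented for illustration): each candidate skeleton explains the contour through rib error, a complexity prior penalizes extra branches, and a spurious "noise branch" lowers the posterior on both counts.

```python
import numpy as np

rng = np.random.default_rng(4)

# a noisy elongated contour: the two sides of a ribbon of half-width 1
x = np.linspace(0.0, 10.0, 200)
contour = np.concatenate([np.c_[x, 1.0 + 0.1 * rng.standard_normal(200)],
                          np.c_[x, -1.0 - 0.1 * rng.standard_normal(200)]])

def log_posterior(skeleton_pts, n_branches, sigma=0.15, lam=5.0):
    """log P(skeleton | contour), up to a constant: Gaussian rib-error
    likelihood minus a complexity penalty on the branch count."""
    d = np.linalg.norm(contour[:, None, :] - skeleton_pts[None, :, :], axis=-1)
    rib_error = d.min(axis=1) - 1.0     # deviation from expected half-width
    return -0.5 * np.sum(rib_error ** 2) / sigma**2 - lam * n_branches

straight = np.c_[x, np.zeros_like(x)]                    # single axis
spur = np.c_[np.full(50, 5.0), np.linspace(0.0, 0.9, 50)]
branched = np.vstack([straight, spur])                   # plus a noise branch

print(log_posterior(straight, n_branches=1))   # wins: simpler, fits well
print(log_posterior(branched, n_branches=2))   # spurious branch penalized
```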
Peter Latham, Visiting Scholar, Gatsby Computational Neuroscience Unit, London UK
The big open question in neural coding is: "what's the neural code?" At the single neuron level, at least in cortex, evidence is mounting that the code is primarily firing rate. For populations, however, the question is wide open. Rate is certainly one possibility, but another is that information is carried in the precise patterns of action potentials. While ultimately the answer will come from experimental data, in the meantime we can approach the problem from a purely theoretical point of view. In particular, if patterns are to carry information, they must be repeatable, and we can ask: does the massively recurrent connectivity in realistic networks place intrinsic limits on repeatability, and thus on the extent to which patterns can carry information?
Using a simple model of randomly connected networks, we find that, at the microscopic level, network dynamics is chaotic. This implies that patterns of spikes are not repeatable (except in the somewhat uninteresting regime in which the input to a network dominates over the recurrent connections). We show in particular that this microscopic chaos has a strong, deleterious effect on the ability of networks to use spike pattern codes to process time-varying stimuli. Moreover, we argue that chaos is a general feature of networks -- it applies even to those that are not random, but have structured connectivity. The upshot of this analysis is that networks are likely to communicate by firing rate, not by detailed patterns of action potentials.
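The chaos argument can be illustrated in a few lines with a standard randomly connected rate network (a stand-in for the spiking networks discussed in the talk, with invented parameters): for gain g > 1, two trajectories that initially differ by 1e-8 diverge to order-one distance.

```python
import numpy as np

rng = np.random.default_rng(5)
N, g, dt = 200, 1.5, 0.1
J = g * rng.standard_normal((N, N)) / np.sqrt(N)   # random connectivity

def step(x):
    """Euler step of the rate dynamics dx/dt = -x + J tanh(x)."""
    return x + dt * (-x + J @ np.tanh(x))

x1 = rng.standard_normal(N)
x2 = x1 + 1e-8 * rng.standard_normal(N)            # near-identical start

for t in range(2001):
    if t % 500 == 0:
        print(t, np.linalg.norm(x1 - x2))          # separation grows until
    x1, x2 = step(x1), step(x2)                    # it saturates at O(1)
```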
Philip Sabes, Physiology & the Keck Center for Integrative Neuroscience, University of California, San Francisco
Accurate sensorimotor control requires the integration of visual and proprioceptive feedback. The problem of multisensory integration has been well studied by perceptual psychophysicists, who have generally found that each sensory modality is weighted according to its statistical reliability. However, these conclusions have been based on perceptual tasks. We hypothesized that in sensorimotor circuits the integration of sensory signals should depend on how the integrated signals will be used to guide movement. To address this issue, we have studied visual and proprioceptive feedback from the arm in the planning of reaching movements. These studies have led us to a new view of sensory integration as a set of local, independently controlled processes optimized to improve trial-by-trial performance.
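For reference, the reliability-weighted rule from the perceptual literature that the abstract contrasts with: each cue is weighted by its inverse variance. A minimal sketch with hypothetical hand-position estimates:

```python
def combine(x_vis, var_vis, x_prop, var_prop):
    """Minimum-variance linear combination of two cues."""
    w_vis = (1 / var_vis) / (1 / var_vis + 1 / var_prop)
    x_hat = w_vis * x_vis + (1 - w_vis) * x_prop
    var_hat = 1 / (1 / var_vis + 1 / var_prop)
    return x_hat, var_hat

# Hypothetical hand-position estimates (cm): vision is more reliable here,
# so the combined estimate sits closer to the visual one.
print(combine(x_vis=10.0, var_vis=0.5, x_prop=12.0, var_prop=2.0))
```

With these numbers the visual weight is 0.8, giving an estimate of 10.4 cm with variance 0.4, more reliable than either cue alone.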
Tim Martin, Postdoctoral Candidate, University of New Mexico
The concept of attention as selective filtering and amplification of information implies a change through time, and many dynamic phenomena related to attention have been observed. The determinants of these dynamics are well modeled by two major classes of theory: stochastic interval timing theory and coupled oscillator theory. Although conceptually quite distinct, these two classes of theory make qualitatively similar predictions about performance in most situations. Efforts to distinguish them behaviorally and electrophysiologically will be discussed. The weight of behavioral evidence appears to favor models derived from coupled oscillator theory, but magnetoencephalographic recordings of brain activity have yielded mixed results.
Soyoun Kim, Postdoctoral Candidate
CVS Undergraduate Fellowship Poster Session, Meliora Hall
CVS Picnic, Roundhouse, Genesee Valley Park
Jason Droll, Brain and Cognitive Sciences, University of Rochester (Advisor: Mary Hayhoe)
Attention and working memory set strict limits on visual representations, yet we have little appreciation of how these limits constrain the acquisition of information in ongoing visually-guided behavior. As visually acquired information may represent only a very small subset of the information in a scene, choreographing gaze, attention and working memory is critical for successful visually guided behavior. This dissertation examines how the use of eye movements, visual attention, and working memory are guided by task context and prior knowledge of scene statistics.
In the first series of experiments, subjects performed a brick-sorting task in a virtual environment. Hand movements and fixation sequences were used to infer internal operations used throughout the task. To test visual memory more explicitly, on about 10% of trials a change was made to one of the features of the brick being held. Rates of detection for feature changes were generally low, and depended on the pick-up and put-down relevance of the feature to the sorting task. The likelihood of missing a change was controlled by manipulating the certainty with which subjects could predict the relevance of the changed feature, suggesting a dynamic use of working memory. Coordination of hand and eye behavior throughout the task was also explored.
In a second series of experiments, subjects detected changes in orientation for abstract shapes across successive frames. As subjects were exposed to the statistics of object changes and object orientation, fixations were directed towards objects whose features were predictive of a visual change. Results suggest that subjects learn the probabilities of change and object features and may combine them using Bayes' rule to form posterior estimates, enabling strategic deployment of gaze when viewing dynamic scenes. Such sophisticated exploitation of environmental probabilities suggests that complex internal models shape decisions about gaze allocation.
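A toy version of the Bayes' rule combination suggested here (all numbers invented): the prior probability that an object will change is combined with the likelihood of its observed features to yield a posterior that could prioritize gaze.

```python
def posterior_change(prior_change, p_feat_given_change, p_feat_given_stable):
    """P(change | feature) via Bayes' rule; all inputs are probabilities."""
    num = p_feat_given_change * prior_change
    den = num + p_feat_given_stable * (1 - prior_change)
    return num / den

# An object class that changes on 30% of frames, displaying a feature that
# is three times more likely when a change is about to occur:
print(posterior_change(0.30, 0.60, 0.20))   # -> 0.5625
```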
David Knill, Center for Visual Science, University of Rochester
Aude Oliva, Massachusetts Institute of Technology
Humans can recognize the gist of a scene (e.g. its semantic category) in a single glance, independent of the complexity of the visual image. How is this remarkable feat accomplished? Research over the last decade has made substantial progress toward understanding the mechanisms underlying human object recognition. However, evidence from behavioral, computational and neuroscience investigations has shown that the perception of real-world scenes may engage cognitive and neural mechanisms distinct from those used in object recognition. In this talk, I will present a series of behavioral experiments designed to test the representation and the mechanisms that human observers may use to interpret, within a glance, the visual gist of a scene, as well as the implications of scene gist for object detection and the visual exploration of a scene.
How does the visual system decide what percept to construct from measured signals? After Pavlov, it seemed clear that the visual system should be trainable by means of paired association (Fieandt, 1936; Hebb, 1949; Brunswik, 1953, 1956; Smedslund, 1955). But this prediction was not confirmed, and today perceptual learning is often defined as an improvement in the ability to discriminate that comes with practice. Yet instances of associative learning have been documented for perceptual appearance, and the modern view of perception as near-optimal inference would seem to require this form of learning. We developed the "cue recruitment" experiment, an adaptation of Pavlov's (1927) classical conditioning paradigm, to quantify such learning. Trainees in our experiments viewed perceptually bistable Necker cube stimuli. On training trials the percept was forced by the addition of trusted cues (stereo and occlusion). Critically, arbitrary signals were also added, contingent on the trusted cues. On test trials stimuli contained the new signals but not the trusted cues. In some cases the new signals acquired the ability to disambiguate the percept on their own. As with other forms of learning from classical conditioning, this learning grew incrementally, lasted, and interfered with subsequent learning when the contingency was reversed. The results were consistent across trainees, and we have ruled out alternative explanations other than change of appearance (such as change of cognitive strategy or response bias). Possible reasons for previous failures to see cue recruitment include slow rates of learning and masking by negative contingent sensory adaptation aftereffects. Cue recruitment experiments provide a new method for studying how things come to look the way they do.
Frederic Theunissen, University of California at Berkeley
Natural sounds, including animal vocalizations and human speech, are characterized by the temporal and spectral fluctuations found in the envelope of the sound pressure waveform. We found that the ensemble response of auditory neurons is tuned to the spectral and temporal features that enhance the acoustic differences between classes of natural sounds. Tuning specifically avoids the temporal modulations that are redundant across natural sounds and therefore provide little information, and instead overlaps with the temporal modulations that differ most across sounds. In addition, information-theoretic analyses show that single neurons transmit more information about sounds that have natural spectral-temporal structure than about other complex synthetic sounds. Finally, at higher levels of the auditory system, we found that neurons expect certain natural statistics and that their response is proportional to deviations from these expectations--surprises! These results suggest that the auditory system has evolved or develops to optimally process behaviorally relevant sounds.
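The modulation analysis the abstract refers to can be sketched as follows: the 2-D Fourier transform of a (log) spectrogram yields the joint spectral-temporal modulation spectrum. This toy example, with an invented noise carrier and 4 Hz envelope, assumes SciPy and is illustrative only.

```python
import numpy as np
from scipy.signal import spectrogram

fs = 16000
t = np.arange(0, 2.0, 1 / fs)
rng = np.random.default_rng(6)
# toy "vocalization": noise carrier with a slow (4 Hz) amplitude envelope
sound = rng.standard_normal(t.size) * (1 + np.sin(2 * np.pi * 4 * t))

f, tt, S = spectrogram(sound, fs=fs, nperseg=256, noverlap=192)
mps = np.abs(np.fft.fftshift(np.fft.fft2(np.log(S + 1e-12)))) ** 2
# rows of `mps` index spectral modulations (cycles/Hz), columns index
# temporal modulations (Hz); energy near +-4 Hz reflects the envelope
print(mps.shape)
```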
Benjamin Kuipers, University of Texas at Austin
A map is a description of an environment allowing an agent, human or robot, to plan effective actions within the environment, particularly motion from place to place. In a complex, large-scale environment, the structure as a whole cannot be observed by the robot's sensors from a single vantage point, so the robot must create a map from observations gathered over time and travel.
Purely metrical approaches to environmental mapping typically define a single global frame of reference for the map to be created, and then do probabilistic inference from observations to determine cell occupancy or feature poses. Global consistency problems rapidly arise, particularly when closing large loops of travel in the environment. The fundamental problem is representational: loop-closing hypotheses should be represented as alternative topological structures for the map, not as alternatives in the vastly larger space of metrical maps.
Our Hybrid Spatial Semantic Hierarchy approach combines topological and metrical representations, and applies them to both large-scale and small-scale space. The Hybrid SSH factors the mapping problem into three natural sub-problems. The first, building metrical maps of local, small-scale environments, is solved effectively by existing SLAM methods. The second, building a global topological map of large-scale space, is solved as a search over the space of topological maps consistent with exploration experience. The third, building the global metrical map, is solved efficiently and accurately once we have the topological skeleton to build on. The Hybrid SSH specifies the abstractions that link these different representations, the synergies they support, and the (weak) assumptions they make about the robot's sensory and motor capabilities.
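Schematically (this is not the authors' code), the factoring might look like the following: local metrical patches hang off a global topological graph, and loop closing is a choice among discrete topological hypotheses rather than among metrical maps.

```python
from dataclasses import dataclass, field

@dataclass
class Place:
    name: str
    local_map: object = None       # stand-in for a local metrical SLAM map

@dataclass
class TopoMap:
    places: dict = field(default_factory=dict)
    edges: set = field(default_factory=set)    # (src, action, dst) triples

    def travel(self, src, action, dst):
        self.places.setdefault(src, Place(src))
        self.places.setdefault(dst, Place(dst))
        self.edges.add((src, action, dst))

# Two rival hypotheses after a turn at a familiar-looking junction:
h_new_place = TopoMap()
h_new_place.travel("A", "turn-left", "D")   # we arrived somewhere new
h_loop = TopoMap()
h_loop.travel("A", "turn-left", "A")        # we closed a loop back to A
# The search over topological maps keeps only hypotheses consistent with
# the exploration history; the global metrical map is then built on the
# surviving topological skeleton.
```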
Artur Cideciyan, Scheie Eye Institute, University of Pennsylvania
(co-sponsored by Ophthalmology)
The human retina is lined with 100 million rod photoreceptor cells that mediate night vision and 5-6 million cone photoreceptor cells that are used for day and color vision. Mutations in genes that are preferentially expressed in photoreceptors or adjoining RPE cells result in hereditary forms of progressive retinal degeneration. Perhaps unexpectedly, the macro-topography of these diseases rarely follows simple cell-density gradients. Furthermore, the micro-topography of cell loss suggests underlying molecular heterogeneity of cells that otherwise appear homogeneous. This talk will present results from different retinopathies that show topographical distribution of disease severity, and discuss possible underlying hypotheses that may be consistent with the observed variation.
Jennifer Hunter, University of Waterloo
Amazingly, an eye will grow at a rate which balances the refractive power of the optics in order to maintain clear vision. The process by which this occurs is known as emmetropization. The underlying mechanism that identifies the direction of defocus and directs eye growth during emmetropization is currently unknown. Monochromatic aberrations (optical imperfections) increase the blur on the retina above that due to defocus alone and introduce a potential signal in the blur with respect to the direction of defocus. It is possible that monochromatic aberrations are responsible for signalling the normal eye growth associated with emmetropization. The first step towards testing this possibility is to measure the monochromatic aberrations in a common model of myopia, the chick.
On the first day post-hatching, 16 chicks were unilaterally fitted with -15D goggles. On several occasions during the first 14 days, goggles were removed for brief periods for Hartmann-Shack wavefront measurements. For the largest common pupil size in each eye on each day, point spread functions (PSFs) were calculated for higher-order aberrations alone and in combination with astigmatism and defocus terms, for goggled and control chick eyes. In the case of the goggled eyes, the defocus was that of the eye in combination with the goggle. Both traditional and new metrics describing the quality of the PSFs, relative to the diffraction limit and in absolute terms, were calculated using Matlab. We found that higher-order root mean square wavefront aberration (RMSA) in control eyes remained relatively constant with age. Most importantly, higher-order RMSA in goggled eyes beyond day 2 was significantly above that in the control eyes. The PSFs showed improvements with age in control eyes and changes with goggling.
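For readers unfamiliar with the wavefront-to-PSF step, a minimal sketch (illustrative pupil, wavelength, and aberrations; sampling and units are not calibrated): the PSF is the squared modulus of the Fourier transform of the generalized pupil function.

```python
import numpy as np

n, wavelength = 256, 0.55e-6                  # grid size; 550 nm in metres
y, x = np.mgrid[-1:1:n * 1j, -1:1:n * 1j]     # unit-radius pupil coordinates
r = np.hypot(x, y)
pupil = (r <= 1.0).astype(float)

# toy wavefront (in metres): some defocus plus a little horizontal coma
cos_t = x / np.maximum(r, 1e-9)
W = (0.2 * (2 * r**2 - 1) + 0.1 * (3 * r**3 - 2 * r) * cos_t) * 1e-6

field = pupil * np.exp(1j * 2 * np.pi * W / wavelength)
psf = np.abs(np.fft.fftshift(np.fft.fft2(field))) ** 2
psf /= psf.sum()
print(psf.max())   # peak of the normalized PSF (a Strehl-like number)
```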
Greg DeAngelis, Washington University in St. Louis
Two main visual cues provide precise quantitative information about the 3D structure of visual scenes: binocular disparity and motion parallax. Although neurons selective for binocular disparity are known to exist in several visual areas, the respective roles of different areas and streams in disparity processing are unclear. I will describe a series of studies showing that area MT contributes to coarse discrimination of absolute disparities, but is not involved in fine discrimination of relative disparities. This task-specific contribution of MT can be understood simply in terms of the representation of absolute vs. relative disparities in MT. Whereas disparity processing has been studied extensively, virtually nothing is known about how neurons compute depth from motion parallax. I will show that many neurons in area MT combine retinal image motion with extraretinal signals to compute the sign of depth from motion parallax. This result establishes a new neural mechanism for computing depth.
Mark Histed, Massachusetts Institute of Technology
Complex goal-directed behaviors extend over time and thus depend on the ability to serially order memories and assemble compound, temporally coordinated, movements. Theories of sequential processing range from simple associative chaining to hierarchical models in which order is encoded explicitly and separately from sequence components. To examine how short-term memory and planning for sequences might be coded, we used microstimulation to perturb neural activity in the supplementary eye fields (SEF) while animals held a sequence of two cued locations in memory over a short delay. We found that stimulation affected the order in which animals saccaded to the locations, but not the memory for which locations were cued, implying that memory for sequential order can be dissociated from that of its components. Stimulation of the frontal eye field (FEF), in contrast, did not produce the same sequential effects as we observed in the SEF. Our data also suggest that the SEF encodes sequences in terms of their endpoints in contralateral space. Finally, I will discuss the coding scheme used in the SEF and FEF by comparing the responses shown by neurons in these areas during the memory period.
Zhaoping Li, University College London
Attentional selection of visual inputs integrates top-down and bottom-up mechanisms. Saliency, defined as the extent to which a stimulus attracts attention, provides a useful platform for studying attentional mechanisms. Highly salient visual locations, e.g., a red item among green ones, or a horizontal bar among vertical ones, attract attention through bottom-up or stimulus-driven mechanisms. The standard view assumes that visual inputs are processed by separate feature maps, such as red and green maps--each for a feature value in a few basic dimensions like color and orientation--which are then summed into a spatial master map of bottom-up saliencies. Any assumptions about this bottom-up saliency map can greatly influence assumptions about the top-down mechanisms, and should therefore be confirmed experimentally. We show, using psychophysical experiments (Zhaoping and May, SFN abstract 2004) on visual search and segmentation tasks, that summation or other simple combinations of the feature maps cannot explain bottom-up saliency. Instead, a single-stage computation in the primary visual cortex, using intracortical interactions (Li, TICS 2002), is adequate to explain the data, including aspects of the data often associated with visual grouping. While V1 mechanisms suffice to account for our data, our framework does not exclude other cortical areas from contributing additionally to computing bottom-up saliency, and we will discuss when and how this could happen. We will also discuss how our work relates to other work, and its implications for top-down attentional mechanisms.
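For concreteness, the "standard view" being tested can be sketched as follows (a toy model, not the author's implementation): separate feature-contrast maps are summed into a single master saliency map, whose peak predicts where attention is drawn. All maps and values below are invented.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def center_surround(feature_map, size=5):
    """Local feature contrast: each location vs. its neighborhood mean."""
    return np.abs(feature_map - uniform_filter(feature_map, size=size))

rng = np.random.default_rng(7)
red_green = 0.1 * rng.standard_normal((32, 32))     # toy color-opponent map
orientation = 0.1 * rng.standard_normal((32, 32))   # toy orientation map
red_green[16, 16] += 2.0                            # a red item among green

saliency = center_surround(red_green) + center_surround(orientation)
print(np.unravel_index(np.argmax(saliency), saliency.shape))   # -> (16, 16)
```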