31st CVS Symposium, Frontiers in Virtual Reality
June 1-3, 2018 at Memorial Art Gallery, Rochester NY
Please note that you may not bring food or drink into the auditorium. Snacks and beverages must be consumed in designated areas only. Other important information on the venue can be found on the logistics page.
Thursday, May 31
7:00 - 9:00 pm—Registration & Welcome Reception, Strathallan Hotel
Friday, June 1
***12pm – 4pm & 5:30pm – 9:00pm: Oculus Rift demo station, Bausch & Lomb Parlor
8:20 - 9:00 am—Registration & Breakfast
9:00 - 9:05 am—Welcome: David Williams, University of Rochester
Improving comfort in augmented- and virtual-reality displays (AR and VR) is a significant challenge. One known source of discomfort is the vergence-accommodation conflict. In AR and VR, the eyes accommodate to a fixed screen distance while converging to the simulated distance of the object of interest. This requires undoing the natural coupling between the two responses and thereby leads to discomfort. We investigated whether various display methods (depth-of-field rendering, focus-adjustable lenses, and monovision) can alleviate the vergence-accommodation conflict. We measured accommodation in a VR setup using those methods. The focus-adjustable-lens method drives accommodation effectively (thereby resolving the vergence-accommodation conflict); the other methods do not. We also show that the ability to drive accommodation correlates significantly with viewer comfort.
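The conflict described above can be quantified in diopters: accommodation is driven toward the fixed screen (the reciprocal of screen distance), while vergence follows the simulated object distance. A minimal illustrative sketch (the function name and distances are my own, not from the talk):

```python
def va_conflict_diopters(screen_dist_m: float, object_dist_m: float) -> float:
    """Vergence-accommodation conflict, in diopters.

    Accommodation demand follows the fixed screen (1/screen_dist),
    while vergence demand follows the simulated object (1/object_dist);
    the conflict is the difference between the two.
    """
    return abs(1.0 / screen_dist_m - 1.0 / object_dist_m)

# A headset focused at 1.5 m showing an object simulated at 0.5 m
# puts the two demands ~1.33 D apart.
conflict = va_conflict_diopters(1.5, 0.5)
```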
In computer graphics, the primary goal for realistic rendering has been to create images that are devoid of optical aberrations. But for displays that are meant to give human viewers realistic experiences (i.e., AR and VR), this goal should change. One should instead produce display images that, when viewed by a normal eye, produce the retinal images that are normally experienced. Creating such images reduces to a deconvolution problem that we have solved accurately for most cases. I will describe results that show that creating blur properly drives the human focusing response while creating blur in conventional fashion does not.
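The deconvolution idea can be sketched with a Wiener-regularized inverse filter: given the eye's point-spread function (PSF) at the screen's focal distance, solve for the display image whose retinal image, after convolution with that PSF, matches the desired target. This is a hedged illustration of the general technique, not the speaker's actual solver:

```python
import numpy as np

def wiener_predisplay(target: np.ndarray, psf: np.ndarray, eps: float = 1e-3) -> np.ndarray:
    """Approximately invert blurring by the eye's PSF: find a display image
    whose convolution with `psf` reproduces the `target` retinal image.

    `psf` is assumed centered and the same shape as `target`; `eps` is a
    Wiener regularizer that avoids dividing by near-zero frequencies.
    """
    H = np.fft.fft2(np.fft.ifftshift(psf))       # optical transfer function
    T = np.fft.fft2(target)
    D = T * np.conj(H) / (np.abs(H) ** 2 + eps)  # regularized inverse filter
    return np.real(np.fft.ifft2(D))
```

With a delta PSF (a perfectly focused eye), the solver returns the target nearly unchanged; with a defocus PSF, it returns a sharpened pre-display image that the eye's blur then undoes.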
We live in interesting times. Physics, chemistry, and biology are beginning to mesh with the computer and social sciences. Advancing technologies driven by this merging of sciences are enabling new integrated solutions in many spaces. Today researchers are beginning to create and deliver realistic artificial human inputs. These inputs of sight, sound, motion, and touch are being woven together into virtual and augmented reality systems that can start to emulate convincing human perception. While movies have always tried to do this for groups with controlled storytelling, AR and VR attempt to take this further. Simultaneously simulating multifaceted content in reaction to an individual's actions, thoughts, needs, and wants is the next step in entertainment, information exchange, social interaction, and more. We are not there yet, but by conquering further technical challenges we can expect a revolutionary change to the human-computer interface and, along with it, significant opportunities to enhance our lives.
11:05 am - 2:00 pm—Lunch (off site)
Session 1: Multisensory Processing
Chair: Edmund Lalor, University of Rochester
Humans haptically perceive the material properties of objects, such as roughness and compliance, through signals from sensory receptors in skin, muscles, tendons, and joints. Approaches to haptic rendering of material properties operate by stimulating, or attempting to stimulate, some or all of these receptor populations. My talk will describe research on haptic perception of roughness and softness in real objects and surfaces and by rendering with a variety of devices.
“The best technologies make the invisible visible.” -Beau Lotto. My lab studies the principles driving specializations in the human brain and their dependence on specific experiences during development (i.e., critical/sensitive periods) versus learning in the adult brain. Our ERC project focuses on studying nature-versus-nurture factors in shaping category selectivity in the human brain. A key part of the project involves the use of algorithms that convert visual input into music and sound for blind users. From a basic-science perspective, the most intriguing results came from studying blind individuals without any visual experience. We documented that essentially most if not all higher-order ‘visual’ cortices can maintain their anatomically consistent category selectivity (e.g., for body shapes, letters, numbers, and even faces; e.g., Amedi et al., TICS 2017) even if the input is provided by an atypical sensory modality learned in adulthood. Our work strongly encourages a paradigm shift in the conceptualization of our sensory brain by suggesting that visual experience during critical periods is not necessary to develop anatomically consistent specializations in higher-order ‘visual’ or ‘auditory’ regions. This also has implications for rehabilitation, suggesting that converging multisensory training is more effective. In the second part of the lecture I will turn to the dorsal visual stream and focus on navigation in virtual environments. Humans rely on vision as their main sensory channel for spatial tasks and accordingly recruit visual regions during navigation. However, it is unclear whether these regions’ role is mainly as an input channel, or whether they also play a modality-independent role in spatial processing. Sighted, blind, and sighted-blindfolded subjects navigated virtual environments while undergoing fMRI scanning before and after training with an auditory navigation interface. We found that retinotopic regions, including both dorsal stream regions (e.g., V6) and primary regions (e.g., peripheral V1), were recruited for non-visual navigation after training, again demonstrating a modality-independent, task-based role even in retinotopic regions. In the last part I will also discuss initial results from our new ERC ExperieSense project. In this project we focus on transmitting invisible topographical information to individuals with sensory deprivation, as well as augmented topographical information to normally sighted individuals, and on testing whether novel topographical representations can emerge in the adult brain in response to input that was never experienced during development (or evolution).
3:00 - 4:00 pm—Coffee break & posters
Session 2: Applications
Chair: Krystel Huxlin, University of Rochester
Virtual reality is ideal for generating photorealistic imagery and binaural audio at low cost, important for context-dependent memory recall in a training program. Physical reality is ideal for tactile interaction, a vital component for developing muscle memory. By combining elements of virtual and physical reality (called "Hybrid Reality"), for example by 3D printing objects of interest with accurate topology, tracking those objects in 3D space, and overlaying photorealistic virtual imagery in a VR headset, it becomes much easier to create immersive simulations with minimal cost and schedule impact, with applications in training, prototype design evaluation, scientific visualization, and human performance study. This talk will showcase projects leveraging Hybrid Reality concepts, including demonstrations of future astronaut training capability, digital lunar terrain field analogs, a space habitat evaluation tool, and a sensorimotor countermeasure against the effects of gravitational transition.
Consumer-level HMDs are adequate for many medical applications. Vivid Vision (VV) takes advantage of their low cost, light weight, and large VR gaming code base to build vision tests and treatments. The company’s software is built in the Unity framework, allowing it to run on many hardware platforms. New headsets are released every six months or less, which creates interesting challenges in the medical device space. VV’s flagship product is the commercially available Vivid Vision System, used by more than 120 clinics to test and treat binocular dysfunctions such as convergence difficulties, amblyopia, strabismus, and stereo blindness. VV has recently developed a new, VR-based visual field analyzer.
In the visual system, neural function changes dramatically as people adapt to changes in their visual world. Most past work, however, has altered visual input only over the short-term, typically a few minutes. Our lab uses virtual reality displays to allow subjects to live in, for hours and days at a time, visual worlds manipulated in ways that target known neural populations. One experiment, for example, removed vertical energy from the visual environment, effectively depriving orientation-tuned neurons of input. Results suggest that visual adaptation is surprisingly sophisticated: it has a memory that allows us to readapt more quickly to familiar environments, it acts simultaneously on multiple timescales, and it is sensitive to not only the benefits of plasticity, but also its potential costs. Current research is applying these lessons to studies of amblyopia and macular degeneration.
6:00 - 9:00 pm—Grazing dinner & poster session
Saturday, June 2
***8:20am – 4:30pm: Oculus Rift demo station, Bausch & Lomb Parlor
8:20 - 9:00 am—Registration & Breakfast
Session 3: AR/VR Displays and Optics
Chair: Michael Murdoch, RIT
The ultimate augmented reality (AR) display can be conceived as a transparent interface between the user and the environment—a personal and mobile window that fully integrates real and virtual information such that the virtual world is spatially superimposed on the real world. An AR display tailors light by optical means to present a user with visual information superimposed on spaces, buildings, objects, and people. These displays are powerful and promising because the augmentation of the real world by visual information can take on so many forms. In this talk, we will provide a short historical highlight of early work in optics for AR and engage the audience on the emerging technology of freeform optics that is poised to permeate various approaches to future display technology.
Almost 50 years ago, with the goal of registering dynamic synthetic imagery onto the real world, Ivan Sutherland envisioned a fundamental idea to combine digital displays with conventional optical components in a wearable fashion. Since then, various new advancements in the display engineering domain, and a broader understanding in the vision science domain have led us to computational displays for virtual reality and augmented reality applications. Today, such displays promise a more realistic and comfortable experience through techniques such as additive lightfield displays, holographic displays, always-in-focus displays, discrete multiplane displays, and varifocal displays.
Mixed reality technologies have transformed the way content creators build experiences for their users. Pictures and movies are created from the point of view of the artist, and the viewer is a passive observer. In contrast, creating compelling experiences in AR/VR requires us to better understand what it means to be an active observer in a complex environment. In this talk, I will present a theoretical framework that describes how AR/VR technologies interface with our sensorimotor system. I will then focus on how, at Oculus Research, we develop new immersive display technologies that support accommodation. I will present a few of our research prototypes and describe how we leverage them to help define requirements for future AR/VR displays.
A longstanding goal in engineering has been to design technologies that are able to reflect the amazing perceptual and motor capabilities of biological systems for touch, including the human hand. This turns out to be very challenging. One reason for this is that, fundamentally, our understanding of what is felt when we touch objects in the world, which is to say haptic stimuli, is fairly limited. This is due in part to the mechanical complexity of touch interactions, the multiple length scales and physical regimes involved, and the sensitive dependence of what we feel on how we touch and explore. I will describe research in my lab on a few related problems, and will explain how the results are informing the development of new technologies for wearable computing, virtual reality, and robotics.
11:00 am - 1:15 pm—Lunch (off site)
Session 4: Space and Navigation
Chair: Greg DeAngelis, University of Rochester
We constantly move from one point to another, navigating the world: in a room, a building, or around a city. While navigating, we look around to understand the environment and our position within it. We use vision naturally and effortlessly to navigate in the world. How does the brain use visual images observed by the eyes for natural functions such as navigation? Research in this area has mostly focused on the two ends of this spectrum: either understanding how visual images are processed, or how navigation-related parameters are represented by the brain. However, little is known about how visual and navigational areas work together or interact. The focus of my research is to bridge the gap between these two fields using a combination of rodent virtual reality, electrophysiology, and optogenetic technologies. One of the first steps toward this question is to understand how the visual system functions during navigation. I will describe work on neural coding and brain oscillations in the primary visual cortex during locomotion: we discovered that running speed is represented in the primary visual cortex, and how it is integrated with visual information. I will next describe work on how the visual cortex and hippocampus work in concert during goal-directed navigation, based on simultaneous recordings from the two areas. We find that both areas make correlated errors and display neural correlates of behaviour. I will finally show some preliminary work on information processing in areas intermediate between the primary visual cortex and the hippocampus.
Devices like head-mounted displays and omnidirectional treadmills offer enormous potential for gaming and networking-related applications. However, their use in experimental psychology and cognitive neuroscience has so far been relatively limited. One of the clearest applications of such novel devices is the study of human spatial navigation, historically an understudied area compared to more experimentally constrainable studies in rodents. Here, we present several experiments the lab has conducted using VR/AR and describe the novel insights they provide into how we navigate. We also discuss how such devices, when combined with functional magnetic resonance imaging (fMRI) and wireless scalp EEG, provide new insights into the neural basis of human spatial navigation.
Search is a central visual function. Most of what is known about search derives from experiments where subjects view 2D displays on computer monitors. In the natural world, however, search involves movement of the body in large-scale spatial contexts, and it is unclear how this might affect search strategies. In this experiment, we explore the nature of the memory representations developed when searching in an immersive virtual environment. By manipulating target location, we demonstrated that search depends on episodic spatial memory as well as learnt spatial priors. Subjects rapidly learned the large-scale structure of the space, taking shorter paths and using less head rotation to find targets. These results suggest that spatial memory of the global structure allows a search strategy that involves efficient attention allocation based on the relevance of scene regions. Therefore, the costs of moving the body may need to be considered as a factor in the search process.
3:00 - 4:00 pm—Coffee break & posters
Session 5: Perception and Action
Chair: Martina Poletti, University of Rochester
The utility of immersive virtual environments (VEs) for many applications increases when viewers perceive the scale of the environment as similar to the real world. Systematic study of human performance in VEs, especially in studies of perceived action capabilities and perceptual-motor adaptation, has increased our understanding of how adults perceive and act in VEs. Research with children has just begun, thanks to new commodity-level head-mounted displays suitable for children with smaller heads and bodies. Children's perception and action in VEs is particularly important to study, not only because children will be active consumers of VEs but also because children's rapidly changing bodies likely influence how they perceive and adapt their actions. I will present an overview of our approach to studying children and teens in a variety of tasks involving perceived affordances and recalibration in VEs, showing both similarities and differences across age groups.
It is known that the head and eyes function synergistically to collect task-relevant visual information needed to guide action. Although advances in mobile eye tracking and wearable sensors have now made it possible to collect data about eye and head pose while subjects explore the three-dimensional environment, algorithms for data interpretation remain relatively underdeveloped. For example, almost all gaze event classifiers algorithmically define fixation as a period when the eye-in-head velocity signal is stable. However, when the head can move, fixations also arise from coordinated movements of the eyes and head, for example, through the vestibulo-ocular reflex. Thus, identifying fixations when the head is free requires accounting for head rotation. Our approach was to instrument multiple subjects with a hat-mounted 2D RGB stereo camera, a 6-axis inertial measurement unit, and a 200 Hz Pupil Labs eye tracker to record angular velocity of the eyes and head as they performed a variety of tasks that involve coordinated eye and head movements. These tasks include walking through a corridor, making tea, catching a ball, and performing a simple visual search task. Four trained labelers manually annotated a portion of the dataset as periods of gaze fixations (GF), gaze pursuits (GP), and gaze shifts (GS). In this presentation, I will report some of our initial findings from our efforts to understand the principles of coordination between the eyes and head outside of the laboratory. In addition, I will report current progress toward training a Forward-Backward Recurrent Window (FBRW) classifier for the automated classification of gaze events hidden within the eye+head velocity signals.
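The key step described above, identifying fixations when the head is free, amounts to classifying on the gaze-in-world velocity (eye-in-head plus head angular velocity) rather than on eye-in-head velocity alone. A minimal sketch under that assumption; the threshold is illustrative, and the talk's FBRW classifier is a learned model, not this simple rule:

```python
import numpy as np

def classify_gaze(eye_vel_dps: np.ndarray, head_vel_dps: np.ndarray,
                  fix_thresh: float = 30.0) -> list:
    """Label each sample as fixation ('GF') or gaze shift ('GS') using the
    gaze-in-world speed: eye-in-head velocity plus head angular velocity.
    Velocities are in degrees per second; threshold is illustrative.
    """
    gaze_speed = np.abs(eye_vel_dps + head_vel_dps)
    return ['GF' if s < fix_thresh else 'GS' for s in gaze_speed]

# During a VOR episode the eye counter-rotates against the head:
# eye-in-head speed is high, but gaze-in-world speed is near zero,
# so the sample is correctly labeled a fixation.
labels = classify_gaze(np.array([-100.0, 200.0]), np.array([100.0, 0.0]))
```

A classifier using eye-in-head velocity alone would mislabel the VOR sample above as a gaze shift, which is exactly the failure mode the abstract describes.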
Psychologists and neuroimagers commonly study perceptual and cognitive processes using images because of the convenience and ease of experimental control they provide. However, real objects differ from pictures in many ways, including the availability and consistency of depth cues and the potential for interaction. Across a series of neuroimaging and behavioral experiments, we have shown different responses to real objects than pictures, in terms of the level and pattern of brain activation as well as visual preferences. Now that these results have shown quantitative and qualitative differences in the processing of real objects and images, the next step is to determine which aspects of real objects drive these differences. Virtual and augmented reality environments provide an interesting approach to determine which aspects matter; moreover, knowing which aspects matter can inform the development of such environments.
6:00 - 9:00 pm—Banquet
Sunday, June 3
9:00 - 9:30 am—Breakfast
Session 6: Visual Perception
Chair: Gabriel Diaz, Rochester Institute of Technology
As robots become more human-like, our appreciation of them increases, up to a crucial point where we find them realistic but not *perfectly* so. At this point, human preference plummets into the so-called *uncanny valley.* This phenomenon is not limited to robotics and has been observed in many other areas, including the fine arts, especially photorealistic painting, sculpture, computer graphics, and animation. The informal heuristic practices of the fine arts, *especially* those of traditional animation, have much to offer to our understanding of the appearance of phenomenological reality. One interesting example is the use of *exaggeration* to mitigate uncanny-valley phenomena in animation. Raw rotoscoped imagery (e.g., action captured from live performance) is frequently exaggerated to give the motion ‘more life’ so as to appear less uncanny.
We performed a series of experiments to test the effects of exaggeration on the phenomenological perception of simple animated objects: bouncing balls. A physically plausible model of a bouncing ball was augmented with a frequently used form of exaggeration known as *squash and stretch.* Subjects were shown a series of animated balls, depicted using systematic parameterizations of the model, and asked to rate their plausibility. A range of rendering styles provided varying levels of information as to the type of ball. In all cases, balls with no exaggeration (i.e., rendered veridically) were seen as significantly less plausible than those with it. Furthermore, when the type of ball was not specified, subjects tolerated a large amount of exaggeration before judging the balls implausible. When the type of ball was indicated, subjects narrowed the range of acceptable exaggeration somewhat but still tolerated exaggeration well beyond what would be physically possible. We contend that, in this case, exaggeration acts to bridge the uncanny valley for artificial depictions of physical reality.
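Squash and stretch is commonly parameterized as a volume-preserving scale applied along the direction of motion. A minimal sketch of that idea (the speed-to-stretch mapping and parameter names are my own illustration, not the parameterization used in the experiments):

```python
def squash_stretch(vertical_speed: float, exaggeration: float,
                   max_speed: float = 10.0) -> tuple:
    """Return (horizontal, vertical) scale factors for a bouncing ball.

    In flight the ball stretches along its travel direction; the scales
    are chosen so area is preserved (sx * sy == 1). An exaggeration of 0
    gives the physically veridical (unscaled) ball.
    """
    stretch = 1.0 + exaggeration * min(abs(vertical_speed) / max_speed, 1.0)
    return (1.0 / stretch, stretch)   # squash across, stretch along motion

# A ball falling at 5 m/s with moderate exaggeration:
sx, sy = squash_stretch(vertical_speed=5.0, exaggeration=0.4)
```

Scaling the exaggeration parameter up or down is exactly the kind of systematic manipulation the plausibility ratings above probe.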
Recovering shape or reflectance from an object's image is under-constrained: effects of shape, reflectance, and illumination are confounded in the image. We overcome this ambiguity by (i) exploiting prior knowledge about the statistical regularities of our environment (e.g., light tends to come from above) and (ii) combining sensory cues both within vision and across modalities.
I will discuss a collection of studies that reveal the assumptions that we hold about natural illumination. When visual scenes are rendered in a way that violates these assumptions our perception becomes distorted. For example, failing to preserve the high dynamic range of illumination reduces perceived gloss.
In addition, I discuss two quite different ways in which touch cues interact with vision to modulate material perception. First, objects can 'feel' shiny; surfaces that are more slippery to the touch are perceived as more glossy. Second, touch disambiguates the perceived shape of a bistable shaded image. The haptically induced change in shape is accompanied by a switch in material perception: a matte surface becomes glossy.
10:30 - 11:00 am—Coffee break
Over the past 25 years, studies of stereoscopic depth perception have largely been dominated by measures of its precision. However, it is arguable that suprathreshold properties of stereopsis are just as relevant, if not more so, to natural tasks such as navigation and grasping. In this presentation, I will review several studies in which we have assessed depth-magnitude percepts from stereopsis. I will highlight factors that impact perceived depth in 3D display systems, such as prior experience and the richness of additional depth cues.
Virtual reality (VR) displays can be used to present visual stimuli in naturalistic 3D environments. Little is known, however, about our sensitivity to sensory cues in such environments.
Traditional vision research has relied on head-fixed observers viewing stimuli on flat 2D displays. Under such conditions many sensory cues are either in conflict, or entirely lacking. For example, an optic flow field will contain conflicting binocular cues, and lack motion parallax cues. We therefore investigated sensory sensitivity to cues that signal 3D motion in VR. We found considerable variability in cue sensitivity both within and between observers.
Next we investigated the possible relationship between cue sensitivity and motion sickness. Prior work has hypothesized that motion sickness stems from factors related to self-motion and that there are inherent gender differences in VR tolerance (e.g., Riccio and Stoffregen, 1991). We hypothesized that the discomfort is caused by sensory cue conflicts, which implies that a person's susceptibility to motion sickness can be predicted based on their cue sensitivity.
We found that greater cue sensitivity predicted motion sickness, supporting the cue-conflict hypothesis. Inconsistent with prior work, we did not find gender differences: females did not show evidence of greater motion sickness. We speculate that prior VR results may be related to the use of a fixed interpupillary distance (IPD) for all observers.
Our results indicate much greater variability in sensory sensitivity to 3D motion in VR than might be expected based on prior research on 2D motion. Moreover, our findings suggest motion sickness can be attenuated by eliminating or reducing specific sensory cues in VR.
12:00 - 1:00 pm—Box Lunch (on site)