The 2019 Workshop:

Explaining the Human Visual Brain

Dates: July 19-20, 2019
Place: MIT, Cambridge, MA

Invited Speakers



Workshop Schedule

Friday July 19: Tutorials

Location: MIT Building 46, Singleton Auditorium, 46-3002

12:30 pm – 1:00 pm: Registration and Refreshments / Opening Remarks
1:00 pm – 2:00 pm: Gemma Roig – Introduction to Neural Networks
2:00 pm – 2:15 pm: Break and Refreshments
2:15 pm – 3:15 pm: Yalda Mohsenzadeh – Introduction to Brain Imaging: fMRI and MEG/EEG
3:15 pm – 3:30 pm: Break and Refreshments
3:30 pm – 4:30 pm: Martin Hebart – Comparing Brains and DNNs: Methods and Findings
4:30 pm – 4:45 pm: Break and Refreshments
4:45 pm – 5:45 pm: Radoslaw Cichy – Comparing Brains and DNNs: Theory of Science
5:45 pm – 6:00 pm: Summary

Saturday July 20: Workshop

Location: MIT Building 46, Singleton Auditorium, 46-3002

8:30 am – 9:00 am: Breakfast
9:00 am – 9:15 am: Introduction by Radoslaw Cichy
[Slides] [Video]
9:15 am – 9:35 am: Matt Botvinick – Toward Object-Oriented Deep Reinforcement Learning
[Slides] [Video]
9:35 am – 9:55 am: Aude Oliva – Interpretability and Visualization of Deep Neural Networks
[Slides] [Video N/A]
9:55 am – 10:15 am: Thomas Naselaris – Deep Generative Networks as Models of the Visual System
[Slides] [Video]
10:15 am – 11:00 am: Posters and Coffee
11:00 am – 11:20 am: David Cox – Predictive Coding Models of Perception
[Slides N/A] [Video N/A]
11:20 am – 11:40 am: James DiCarlo – Brain Benchmarking Our Way to an Understanding of Visual Intelligence
[Slides N/A] [Video N/A]
11:40 am – 12:00 pm: Kendrick Kay – The Natural Scenes Dataset: Massive High-Quality Whole-Brain 7T fMRI Measurements During Visual Perception and Memory
[Slides] [Video]
12:00 pm – 1:30 pm: Lunch on Your Own
1:30 pm – 1:50 pm: Introduction to the Algonauts Challenge by Radoslaw Cichy
[Slides] [Video]
1:50 pm – 2:50 pm: Invited Talks: Challenge Winners

Agustin Lage Castellanos (agustin) [Report] [Slides]
Aakash Agrawal (Aakash) [Report] [Slides]
Romuald Janik (rmldj) [Report] [Slides]
2:50 pm – 3:30 pm: Posters and Coffee
3:30 pm – 3:50 pm: Talia Konkle – Response Preferences vs Patterns: Insights from Deep Neural Networks
[Slides N/A] [Video N/A]
3:50 pm – 4:10 pm: Nikolaus Kriegeskorte – Cognitive Computational Neuroscience of Vision
[Slides] [Video]
4:10 pm – 4:30 pm: Jack Gallant – Taking Natural Scene Statistics into Account when Evaluating Brain Data and Models
[Slides N/A] [Video N/A]
4:30 pm – 5:00 pm: Panel Discussion with Speakers – Moderated by Gemma Roig & Radoslaw Cichy
5:00 pm – 6:30 pm: Reception and Refreshments

Invited Speaker Abstracts

Matt Botvinick
Title: Toward object-oriented deep reinforcement learning
Abstract: Deep reinforcement learning has revolutionized AI research by generating super-human performance in tasks ranging from Atari to Go and chess to the video game StarCraft. From a neuroscientist's point of view, it is gratifying to note that insights from visual neuroscience have played a key role in driving these technological advances. A number of considerations suggest that, to proceed further, deep reinforcement learning may have to draw a further lesson from visual neuroscience, namely, the central role of objects. I'll review recent work suggesting that object-level representation may be a critical ingredient for deep RL, and consider some recently developed techniques for extracting objects from visual data, including some proposals from my group at DeepMind.

David Cox
Title: Predictive Coding Models of Perception
Abstract: The ability to predict future states of the world is essential for planning behavior, and it is arguably a central pillar of intelligence. In the field of sensory neuroscience, "predictive coding", the notion that circuits in cerebral cortex actively predict their own activity, has been an influential theoretical framework for understanding visual cortex. In my talk, I will bring together the idea of predictive coding with modern tools of machine learning to build practical, working vision models that predict their inputs in both space and time. These networks learn to predict future frames in a video sequence, with each layer in the network making local predictions and only forwarding deviations from those predictions to subsequent network layers. We show that these networks are able to robustly learn to predict the movement of synthetic (rendered) objects, and that in doing so, the networks learn internal representations that are useful for decoding latent object parameters (e.g. pose) that support object recognition with fewer training views. We also show that these networks can scale to complex natural image streams (car-mounted camera videos), capturing key aspects of both egocentric movement and the movement of objects in the visual scene, and generalizing well across video datasets. These results suggest that prediction represents a powerful framework for unsupervised learning, allowing for implicit learning of object and scene structure. At the same time, we find that models trained for prediction also recapitulate a wide variety of findings in neuroscience and psychology, providing a touch point between deep learning and empirical neuroscience data.

James DiCarlo
Title: Brain benchmarking our way to an understanding of visual intelligence

Jack Gallant
Title: Taking natural scene statistics into account when evaluating brain data and models
Abstract: Natural scenes (and natural movies) have a very specific statistical structure. The lower-order statistics of natural scenes were characterized explicitly over 20 years ago. Their higher-order statistics are largely unknown (though many of these statistics are represented implicitly by deep networks trained to classify natural images). The visual system has evolved and developed to exploit these statistical properties, but the structure and function of the visual system also reflect other evolutionary, computational, behavioral and environmental factors that are not directly tied to stimulus statistics. Therefore, in interpreting results of vision experiments that use natural scenes, it is critical to answer three questions: (1) Does the result merely reflect a tendency for vision to exploit and mirror scene statistics? (2) Does the result reflect a case where vision over- or under-represents scene statistics? (3) Does the result reflect a case where vision works in opposition to scene statistics? Each of these three cases suggests a very different reason or cause for the observed result. Therefore, the answers to these questions will dramatically change the interpretation of results, and researchers ignore these key distinctions at their peril.

Kendrick Kay
Title: The Natural Scenes Dataset: massive high-quality whole-brain 7T fMRI measurements during visual perception and memory
Abstract: Access to high-quality data is essential for developing better models of visual information processing. Here, we describe an ambitious experiment in which ultra-high-field fMRI measurements are made (7T, whole-brain, T2*-weighted gradient-echo EPI, 1.8-mm resolution, 1.6-s TR) while 8 carefully selected and trained human participants view many thousands of color natural scenes over the course of 40 scan sessions held throughout nearly a year. In the experiment, subjects fixate centrally and perform a continuous recognition task in which they judge whether they have seen each given image at any point either in the current scan session or any previous scan session. I will describe the design of the experiment, the types of neuroimaging and behavioral measures that are collected, the current state of data acquisition, and the signal processing techniques we have developed to maximize signal-to-noise ratio. Preliminary analyses indicate that the data are of excellent quality, including nearly perfect response rates, high recognition performance, spatially stable brain imaging across scan sessions, and highly replicable brain activity patterns across repeated trials of the same image. The data (raw and pre-processed) will be made publicly available to the scientific community, and could be used to develop and benchmark models as well as answer a variety of neuroscientific questions. Finally, I will conclude with some brief comments regarding goals and desiderata for modeling efforts.

Talia Konkle
Title: Response Preferences vs Patterns: Insights from deep neural networks
Abstract: Object representations are housed in the occipitotemporal cortex of the human brain, where a few focal regions respond relatively selectively to some categories (faces, houses, bodies), as evident in univariate responses. However, in recent empirical work, we found that the representational geometries of these regions can be quite strongly correlated with each other, as well as with cortex outside of these regions (Cohen et al., 2017). Here, we leverage deep neural networks to provide some insight into this unexpected empirical result, defining and probing deep net category-selective "regions". We found that across deep net face and place regions, representational geometries were also relatively similar to each other, mirroring data from human brain responses. Based on these results, I will discuss the idea that the similarity of representational geometries evident in these brain regions indicates that the entire occipitotemporal cortex participates as one discriminative feature bank with a common representational constraint, while the selectivity of different parts of the cortex reveals how this space is mapped across the cortex for read-out mechanisms.

Nikolaus Kriegeskorte
Title: Cognitive computational neuroscience of vision
Abstract: To learn how cognition is implemented in the brain, we must build computational models that can perform cognitive tasks, and test such models with brain and behavioral experiments [1]. Modern technologies enable us to measure and manipulate brain activity in unprecedentedly rich ways in animals and humans. However, experiments will yield theoretical insight only when employed to test brain computational models. Recent advances in neural network modelling have enabled major strides in computer vision and other artificial intelligence applications. This brain-inspired technology provides the basis for tomorrow’s computational neuroscience [1, 2]. Deep convolutional neural nets trained for visual object recognition have internal representational spaces remarkably similar to those of the human and monkey ventral visual pathway [3]. Functional imaging and invasive neuronal recording provide rich brain activity measurements in humans and animals, but a challenge is to leverage such data to gain insight into the brain’s computational mechanisms [4, 5]. We build neural network models of primate vision, inspired by biology and guided by engineering considerations [2, 6]. We also develop statistical inference techniques that enable us to adjudicate between complex brain-computational models on the basis of brain and behavioral data [4, 5]. I will discuss recent work extending deep convolutional feedforward vision models by adding recurrent signal flow and stochasticity. These characteristics of biological neural networks may improve inferential performance and enable neural networks to more accurately represent their own uncertainty.

Thomas Naselaris
Title: Deep generative networks as models of the visual system
Abstract: We will discuss the merits of deep generative networks—such as the variational autoencoder—as abstract models of computation in the human visual system. Unlike deep discriminative networks, deep generative networks minimize an unsupervised cost function and provide a natural framework for relating top-down and bottom-up signals. We will show that generative networks also provide a ready-made theory of mental imagery—an essential capacity of the human visual system that cannot be properly accounted for by discriminative networks. We will then discuss the limitations of extant deep generative networks as the basis for encoding models of brain responses during vision, and discuss the prospects for overcoming those limitations with a new large-scale data collection effort.

Aude Oliva
Title: Interpretability and Visualization of Deep Neural Networks

Important Dates

Poster Abstract Submission

We accept poster abstract submissions on work relevant to the workshop. Abstracts related to submissions to the Algonauts Challenge are also eligible, but a challenge submission is not required for an abstract to be selected. Authors of selected abstracts will be invited to present a poster at the workshop. If an abstract is selected, at least one author has to register for the workshop before the registration deadline and is expected to attend the workshop to present the poster.

Abstracts should be at most 500 words long. The deadline for poster abstract submission is July 16, 2019.

To submit an abstract, use this form and follow the instructions.

Poster Presentation

Your poster should fit within a poster board 4 ft high × 6 ft wide (about 122 cm × 183 cm).






Event Planners