

March 10th, 2021

Failures of perception in humans and machines

Emily Ward

Abstract: It is tempting to think that we see the world as it really is. But in fact, we often perceive things that don’t actually exist, while failing to perceive what is plainly in sight. To understand this curious feature of visual perception, my research has focused on the surprising ways in which conscious experience of the world is dissociated from the actual nature of the world. In particular, why does perception fail us in the first place, and how can we predict when such failures may occur? And why does our experience of the world seem so rich, especially in light of such failures of perception? I will discuss a series of studies that aim to answer these questions, as well as some new research that explores the genesis of failures of perception by investigating whether visual illusions and change blindness occur in artificial deep neural networks.
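The closing question of the abstract (whether phenomena like change blindness arise in deep networks) can be probed by comparing a network's internal representation of a scene before and after a small change: if the two representations are nearly identical, the change is effectively invisible to the network. The sketch below illustrates only that comparison logic, with a single random convolutional layer standing in for a trained network; the image size, filters, and perturbation are all illustrative assumptions, not the stimuli or models from the suggested reading.

```python
import numpy as np

rng = np.random.default_rng(0)

def conv_features(img, filters):
    """One layer: valid 2-D cross-correlation with each filter, ReLU, flattened."""
    k = filters.shape[-1]
    oh, ow = img.shape[0] - k + 1, img.shape[1] - k + 1
    maps = []
    for f in filters:
        acc = np.zeros((oh, ow))
        for i in range(k):
            for j in range(k):
                acc += f[i, j] * img[i:i + oh, j:j + ow]
        maps.append(np.maximum(acc, 0.0))  # ReLU
    return np.concatenate([m.ravel() for m in maps])

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

filters = rng.standard_normal((8, 3, 3))   # 8 random 3x3 filters
scene = rng.random((32, 32))               # stand-in "scene"

changed = scene.copy()
changed[10:14, 10:14] += 0.05              # small local change, as in change blindness
scrambled = rng.random((32, 32))           # control: an unrelated image

base = conv_features(scene, filters)
sim_changed = cosine(base, conv_features(changed, filters))
sim_control = cosine(base, conv_features(scrambled, filters))

# The subtle change barely moves the representation; the unrelated image does.
print(sim_changed, sim_control)
assert sim_changed > sim_control
```

In a real experiment one would use a trained network and controlled change-blindness displays, and compare the size of the representational change against human detection rates for the same stimuli.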

Link to Zoom Recording of Emily's Talk 

Suggested Readings:

Ward, E. J. (2019). Exploring perceptual illusions in deep neural networks. bioRxiv, 687905. PDF

February 10th, 2021

Impact of statistical regularities in visual experience: from face perception to social cognition

Ipek Oruc

Abstract: The corpus of lifetime visual exposure of an adult observer contains statistical regularities across a diverse hierarchy of image properties. From canonical retinal sizes and viewing distances (e.g., the sun is always viewed at infinity and a face is viewed most often at about 1 m) to frequently encountered stimulus traits (e.g., cardinal orientations dominate image features and own-race faces dominate the face-diet), these regularities pervade our visual experience. The metaphorical landscape of visual experience creates channels that shape, lead and inform processes of visual perception, allowing the observer to infer the true nature of the distal stimulus with greater success compared to a uniform landscape where all visual traits are equally probable. In this talk, I will give you a short overview of some recent data from my lab on statistical regularities in adult face exposure, their relationship to face recognition performance, and some hypotheses that we have considered as a framework to interpret these data.


Link to Zoom Recording of Ipek's Talk 


Suggested Readings:

Oruc, I., Shafai, F., Murthy, S., Lages, P., & Ton, T. (2019). The adult face-diet: A naturalistic observation study. Vision Research, 157, 222-229. PDF

Mousavi, S. M., & Oruc, I. (2020). Tuning of face expertise with a racially heterogeneous face-diet. Visual Cognition, 28(9), 523-539.  PDF

Oruc, I., Shafai, F., & Iarocci, G. (2018). Link between facial identity and expression abilities suggestive of origins of face impairments in autism: Support for the social-motivation hypothesis. Psychological Science, 29(11), 1859-1867. PDF

January 13th, 2021

Dissociated intracerebral face, word, and building responses in the human ventral occipito-temporal cortex

Simen Hagen

Abstract: Recognition of visual objects across accidental variability (e.g., viewpoint, illumination, distance) is fundamental to human behavior, and so its underlying neural substrates have been studied extensively with fMRI, revealing selective neural responses to a range of ecologically important objects (e.g., faces > objects; written words > objects) in the ventral occipito-temporal cortex (VOTC). However, because fMRI is an indirect neural measure that suffers from non-uniformly distributed magnetic artefacts across the VOTC - thereby potentially missing category-selective responses in certain regions (e.g., the ventral anterior temporal lobe, vATL) - it is important to complement these studies with direct neural recordings in humans. Here I will present work in which we isolate category-selective responses using a highly sensitive frequency-tagging approach combined with direct intracerebral recordings in the VOTC of a large group of awake human patients.
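The frequency-tagging approach mentioned above has a simple signal-processing core: stimuli appear at a fast base rate, with the category of interest embedded periodically at a slower rate, so any category-selective response is confined to exactly that tagged frequency in the spectrum of the recording. The sketch below simulates this with a 6 Hz base rate and a 1.2 Hz oddball rate (every fifth stimulus), rates typical of such designs; the amplitudes, noise level, and recording length are invented for illustration.

```python
import numpy as np

sr = 512                  # sampling rate of the simulated recording (Hz)
dur = 40.0                # seconds of recording
f_base, f_odd = 6.0, 1.2  # base stimulation rate; every 5th stimulus is the
                          # category of interest, tagging it at 6/5 = 1.2 Hz

t = np.arange(int(sr * dur)) / sr
rng = np.random.default_rng(2)

# Simulated recording: a general visual response at the base rate, a smaller
# category-selective response at the oddball rate, plus broadband noise.
signal = (1.0 * np.sin(2 * np.pi * f_base * t)
          + 0.3 * np.sin(2 * np.pi * f_odd * t)
          + 1.0 * rng.standard_normal(t.size))

spectrum = np.abs(np.fft.rfft(signal)) / t.size
freqs = np.fft.rfftfreq(t.size, 1 / sr)

# The category-selective response is recovered as a peak at exactly 1.2 Hz,
# well above the noise floor and separable from the 6 Hz general response.
amp_odd = spectrum[np.argmin(abs(freqs - f_odd))]
noise_floor = np.median(spectrum)
print(amp_odd, noise_floor)
assert amp_odd > 10 * noise_floor
```

The strength of the design is that the response of interest lands in a single known frequency bin, so sensitivity grows with recording length without requiring any subtraction between conditions.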

Link to Zoom Recording of Simen's Talk 

Suggested readings:

Jonas, J., Jacques, C., Liu-Shuang, J., Brissart, H., Colnat-Coulbois, S., Maillard, L., & Rossion, B. (2016). A face-selective ventral occipito-temporal map of the human brain with intracerebral potentials. Proceedings of the National Academy of Sciences, 113(28), E4088-E4097.

November 11th, 2020

Putting people in context: N190 responses to bodies in natural scenes

Benjamin Balas

Abstract: The N190 ERP component is a body-sensitive response that has been examined in much the same way as other high-level visual components. That is, multiple studies have manipulated properties of body images and/or introduced control stimuli that are meant to test hypotheses regarding the critical features for stimuli to be processed as bodies by the ventral visual stream. The majority of these studies (my own included!) involve presenting observers with segmented body images that are free of clutter and context, isolating the stimulus of interest to the target component. Of course, these aren’t typical viewing conditions for bodies that we see in natural environments, so how do body-sensitive responses change if we present people in places? I’ll describe the results of two separate ERP tasks where we used the Penn-Fudan pedestrian database to examine how natural scene structure (including the presence of other bodies in scenes) affects the N190. I’ll also talk more broadly about other active projects in my lab that are hopefully moving us further towards characterizing high-level vision in natural settings.


Link to Zoom Recording of Ben's Talk

Suggested Readings:

Thierry, G., Pegna, A. J., Dodds, C., Roberts, M., Basan, S., & Downing, P. (2006). An event-related potential component sensitive to images of the human body. Neuroimage, 32(2), 871-879.


Minnebusch, D. A., Suchan, B., & Daum, I. (2009). Losing your head: behavioral and electrophysiological effects of body inversion. Journal of Cognitive Neuroscience, 21(5), 865-874.

October 14th, 2020

Multi-sensory perception and recognition

Quoc Vuong

Abstract: The senses do not work in isolation. One of our lab aims is to investigate how they work together for perception and recognition. In a first study (Kikuchi et al., 2019), we tested voice and face processing separately. We used fMRI adaptation to show that voice and face brain regions responded in a similar way to morphed voices and faces, respectively. In a second series of experiments (Laing et al., 2015; Vuong et al., 2019), we tested auditory-visual integration. We created auditory and visual stimuli that modulated sinusoidally over time along analogous dimensions in hearing (i.e. loudness) and vision (i.e. size). These stimuli allowed us to combine auditory and visual information in different ways to test auditory-visual integration. We found that (all else being equal), observers relied more on vision than hearing during perceptual discrimination tasks (yay!). Using these stimuli, we further found that functional brain connectivity is one potential neural mechanism for auditory-visual integration. Some exploratory work to combine the modulated stimuli with faces and speech will also be presented for discussion.
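The modulated stimuli described above reduce to one idea: a single sinusoidal envelope drives loudness in the auditory stream and, at a matched rate, the size of a visual element, and manipulating the relative phase or rate of the two streams is what allows the information to be combined in different ways. The sketch below constructs both streams; the 440 Hz carrier, 2 Hz modulation rate, and pixel radii are illustrative choices, not the parameters used in the cited studies.

```python
import numpy as np

sr = 44_100          # audio sample rate (Hz)
dur = 2.0            # stimulus duration (seconds)
f_carrier = 440.0    # tone frequency (Hz)
f_mod = 2.0          # shared modulation rate (Hz)

t = np.arange(int(sr * dur)) / sr

# Auditory stream: a tone whose loudness is modulated sinusoidally.
envelope = 0.5 * (1 + np.sin(2 * np.pi * f_mod * t))     # 0..1
tone = envelope * np.sin(2 * np.pi * f_carrier * t)

# Visual stream: a disc whose radius is modulated at the same rate, sampled
# at a 60 Hz frame rate. In-phase streams share f_mod and phase; offsetting
# the phase (or using different rates) decouples the two modalities.
fps = 60
frames_t = np.arange(int(fps * dur)) / fps
radius = 20 + 10 * np.sin(2 * np.pi * f_mod * frames_t)  # pixels, ~10..30

print(tone.shape, radius.min(), radius.max())
```

Because both modalities are driven by one parametric envelope, the congruence between hearing and vision can be varied continuously, which is what makes the stimuli suitable for testing auditory-visual integration.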


Link to Zoom Recording of Quoc's Talk

Suggested Readings:

Kikuchi, Y., et al. (2019). Interactions between conscious and subconscious signals: Selective attention under feature-based competition increases neural selectivity during brain adaptation. The Journal of Neuroscience, 39(28), 5506-5516.

Vuong, Q. C., Laing, M., Prabhu, A., Tung, H. I., & Rees, A. (2019). Modulated stimuli demonstrate asymmetric interactions between hearing and vision. Scientific Reports, 9(1), 7605.


Laing, M., Rees, A., & Vuong, Q. C. (2015). Amplitude-modulated stimuli reveal auditory-visual interactions in brain activity and brain connectivity. Frontiers in Psychology, 6:1440.

September 9th, 2020

Gist in Medical Image Perception and Advancing Early Cancer Detection

Karla K. Evans

Abstract: We are all experts at rapid perception of scene gist. Medical experts can detect the gist of medical images, e.g., discriminating normal from abnormal mammograms at above-chance levels after a 500 ms exposure, even when the signs of cancer are quite subtle (Evans et al., 2013). Under these conditions, localization of any lesion is at chance, suggesting that a global/texture signal underpins the detection of these subtle abnormalities. Gist classification is possible in images from the normal breast contralateral to the breast with overt signs of cancer (Evans et al., 2016). I further show that expert observers can also detect the gist signal 3 years before the cancer itself appears (Evans et al., 2019). This is due neither to a few salient cases nor to breast density, a known risk factor. The ability is related to perceptual expertise as quantified by the number of mammographic cases read within a year. Radiologists have access to a global, non-selective signal of abnormality that could serve as a perceptual ‘risk factor’ for cancer. Lastly, I will show what we know so far about the nature of this signal from examining human perceptual behaviour. If this signal could be reliably detected by humans or by computational systems, it could be a valuable part of the effort to assess an individual woman’s risk factors and detect cancer early.

Link to Zoom Recording of Karla's Talk

August 12th, 2020

Turning the black box white: how face recognition works in a deep convolutional neural network

Y. Ivette Colon, Matthew Q. Hill, Connor J. Parde & Alice J. O’Toole

Abstract: Deep convolutional neural networks (DCNNs) trained for face identification have approached, and in some cases surpassed, human performance (e.g., Phillips et al., 2018). These networks recognize faces accurately across image (e.g., viewpoint, illumination) and appearance variation, using cascaded layers of local computations modeled after the primate visual system. Notably, representations produced by DCNNs trained for face identification retain a surprising amount of “identity-irrelevant” information (e.g., gender, illumination, viewpoint). We will present three projects aimed at understanding how these representations accommodate diverse information about faces while supporting remarkably accurate identification. In the first project, we visualized the “face space” at the top layer of a face identification deep net (Hill et al., 2019). We probed an “in-the-wild” network with an “in-the-lab” dataset composed of 3D laser scans, rendered from five viewpoints under two illumination conditions. This revealed a hierarchical arrangement of image attributes within identity clusters. Caricatures of the heads, which varied in identity strength from the average, provided insight into how identity and distinctiveness can be understood in this new face space. In the second project, we analyzed the relationship between the single-unit and ensemble codes used in a face identification deep net (Parde et al., 2020). At the single-unit level, information about face attributes (identity, gender, viewpoint) was confounded in the responses of individual units. At the ensemble level, these attributes separated into high-dimensional subspaces, ordered by explained variance. These results indicate that “meaning” is encoded by directions in the high-dimensional space; it cannot be inferred from the responses of single “neural” units. In the third project, we examined the coding of facial expression in a DCNN (Colon et al., 2020). The network accurately classified seven expressions (including sad, happy, fear, surprise, and anger) from five viewpoints, with performance that roughly mirrored human accuracy, with some (interesting!) exceptions.
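The contrast between single-unit and ensemble codes in the second project can be demonstrated with synthetic data. In this sketch (an illustration of the general idea, not the Parde et al. analysis; the dimensions, noise level, and attribute are invented), a "viewpoint" attribute is spread along one direction of a 64-dimensional code rather than assigned to any unit, so no single unit correlates strongly with it, while the projection onto that direction reads it out cleanly. Using the known axis directly is a shortcut made for the sketch; in practice such directions must be recovered from the variance structure of the responses.

```python
import numpy as np

rng = np.random.default_rng(1)
n, d = 500, 64                        # 500 "face images", 64 "top-layer units"

viewpoint = rng.uniform(-90, 90, n)   # an identity-irrelevant attribute
direction = rng.standard_normal(d)
direction /= np.linalg.norm(direction)  # the attribute's axis in the code

# Each unit mixes the attribute with unit-specific noise: no single unit is
# a clean "viewpoint detector", but the ensemble carries the signal.
emb = np.outer(viewpoint, direction) + 40 * rng.standard_normal((n, d))

# Single-unit readout: best |correlation| of any one unit with viewpoint.
single = max(abs(np.corrcoef(emb[:, j], viewpoint)[0, 1]) for j in range(d))

# Ensemble readout: correlate viewpoint with the projection onto the axis.
ensemble = abs(np.corrcoef(emb @ direction, viewpoint)[0, 1])

print(single, ensemble)   # the direction decodes far better than any unit
assert ensemble > single
```

This is the sense in which "meaning" lives in directions of the space: a linear readout across units recovers an attribute that is invisible in any single unit's tuning.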


Link to Zoom Recording of Alice's Labs Talk


Suggested Readings:

Colon, Y. I., Castillo, C. D. & O’Toole, A. J. (2020). Facial expression is retained in deep networks trained for face identification.

Hill, M. Q., Parde, C. J., Castillo, C. D., Colon, Y. I., Ranjan, R., Chen, J. C., Blanz, V., & O’Toole, A. J. (2019). Deep convolutional neural networks in the face of caricature. Nature Machine Intelligence.

Parde, C. J., Colón, Y. I., Castillo, C. D., Dhar, P., & O’Toole, A. J. (2020). Single unit status in deep convolutional neural network codes for faces. arXiv:2002.06274


Phillips, P. J., Yates, A. N., Hu, Y., Hahn, C. A., Noyes, E., Jackson, K., Cavazos, J. G., Jeckeln, G., Ranjan, R., Sankaranarayanan, S., Chen, J. C., Castillo, C., Chellappa, R., White, D., & O’Toole, A. J. (2018). Face recognition accuracy of forensic examiners, super-recognizers, and algorithms. Proceedings of the National Academy of Sciences.

July 8th, 2020

Why faking it isn’t making it in facial expression research

Amy Dawel & Liz Miller

Abstract: Despite the longstanding and widespread interest in how people perceive others’ emotions from facial expressions, much of the empirical data comes from a small number of artificially posed stimuli (e.g., the Ekman faces), validated only by high levels of agreement about what emotion they are showing (e.g., labeled as angry, happy, sad, etc.). This ignores a separate—and potentially critical—dimension of facial expressions: whether or not they are perceived as showing genuine emotion. Here, we present evidence that many of the most popular expression stimuli are perceived as not showing genuine emotion. Using new sets of genuine-posed and naturalistic expression stimuli developed in our lab, we find that perceptions of emotion genuineness influence people’s responses in ways that provide new insights into affective processes (e.g., in social anxiety) and sometimes lead to different research conclusions (e.g., in psychopathic traits). Recently, our lab has begun interrogating what physical information in faces causes expressions to be perceived as genuine versus fake. Our findings challenge existing work, and in doing so highlight another emergent issue in the literature: the use of virtual faces as though they were real human ones. Overall, we argue there are significant theoretical and practical benefits to be gained from using stimuli that cover a fuller range of real-world facial behaviour, including expressions that are perceived as showing genuine emotion.

Link to Zoom Recording of Amy & Liz's Talk 

Suggested Readings:

Dawel, A., Wright, L., Irons, J., Dumbleton, R., Palermo, R., O’Kearney, R., & McKone, E. (2017). Perceived emotion genuineness: normative ratings for popular facial expression stimuli and the development of perceived-as-genuine and perceived-as-fake sets. Behavior Research Methods, 49(4), 1539-1562.


Dawel, A., Dumbleton, R., O’Kearney, R., Wright, L., & McKone, E. (2019). Reduced willingness to approach genuine smilers in social anxiety explained by potential for social evaluation, not misperception of smile authenticity. Cognition and Emotion, 33(7), 1342-1355.


Dawel, A., Wright, L., Dumbleton, R., & McKone, E. (2019). All tears are crocodile tears: Impaired perception of emotion authenticity in psychopathic traits. Personality Disorders: Theory, Research, and Treatment, 10(2), 185.

June 3rd, 2020

Face-specific shortcuts in the general-purpose feedforward visual processing hierarchy

Jacob Martin

Abstract: Humans make some of their fastest reaction times when targeting faces with their gaze. Might there be special shortcut circuits for detecting faces in the general-purpose feedforward visual processing hierarchy? In this talk, I will present evidence for this hypothesis from several human studies on face detection that used a variety of tools such as panoramic screens, ECoG, SEEG, EEG, eye tracking, and computational modeling. While humans usually make no more than 3-4 saccades a second during visual search, we recently reported that humans could continuously target small 3° faces blended into large scenes at an average rate of up to 5.4 faces+scenes each second. Each time the participant targeted a face in the scene with their gaze, it was erased, and a new face was presented at a different location in a new scene. We found that upright faces were detected the fastest, with face inversion effects in average saccade reaction times, average targeting rates, average block completion times, and average accuracies from 4° to 16° eccentricities. Incredibly, participants also often launched small task-related microsaccades within 15 ms after each saccade, with landing distributions on the classical facial landmarks according to whether the faces were upright (eyes) or inverted (mouth). In addition, using data that we collected on a large 180° panoramic screen, I will present the surprising fact that humans can make accurate eye movements towards small 2° faces pasted into large cluttered images at ±80° eccentricities. These results are surprising because the visual cortex has been shown to contain hierarchical subdivisions of cells that respond preferentially to particular stimulus sizes at particular locations in the visual field. For example, to evoke the most neural activity from corresponding cells in V1, V2, or V4, stimuli presented at greater than 50° eccentricity should have a size of at least 7° of visual angle. Yet the humans in our study were capable of finding 2° faces with above-chance accuracy all the way out to eccentricities of ±80°. Taken together, the results support the possibility of special shortcuts in the visual hierarchy for human faces.


Link to Zoom Recording of Jake's Talk

Suggested readings:

Zapping 500 faces in less than 100 seconds: Evidence for extremely fast and sustained continuous visual search

A crash in visual processing: Interference between feedforward and feedback of successive targets limits detection and categorization

Microsaccades during high speed continuous visual search


High Resolution Human Eye Tracking During Continuous Visual Search


Crouzet, S. M., Kirchner, H., & Thorpe, S. J. (2010). Fast saccades toward faces: Face detection in just 100 ms. Journal of Vision, 10(4), 16.
