The Human Blueprint for Teaching Machines to Recognize Faces

Published on

December 12, 2023

Authors

Dr. Catarina Carvalho

Senior Data Scientist, Deeper Insights

In a world where technology and psychology often seem to operate in parallel universes, the field of facial recognition serves as a fascinating bridge between the two. This complex ability, deeply ingrained in our neurological and psychological systems, is essential for our social and emotional lives. When this ability is compromised, as in the case of 'face blindness' or prosopagnosia, it raises intriguing questions. Our growing understanding of the neurological and psychological mechanisms behind human facial recognition is shaping how we develop artificial intelligence to perform similar tasks as well as how we interpret AI’s results. Convolutional Neural Networks (CNNs), for instance, are being fine-tuned based on these human insights. This not only enriches our grasp of this crucial human skill but also paves the way for AI and computers to perceive the world in increasingly human-like ways.

A bridge from psychology to technology

Recognising faces is a big deal in our everyday lives. It's what helps us handle different emotions, behaviours, and ways of thinking in the people we meet. Think about how you act when you bump into your boss outside of work versus hanging out with a good friend: for sighted people, that's mostly thanks to face recognition.

In simple terms, face recognition is like a key player in our social interactions, enabling us to connect, grasp emotions, and respond appropriately.

Exploring the Brain's Role in Facial Identification

Recent neuroscience research (Behrmann & Mark, 2018) has pinpointed the ventral cortex, a brain region dedicated to processing visual information, as the primary player in object recognition. This portion of the brain is particularly sensitive to object shape, with some areas displaying remarkable resistance to changes in viewpoint and retinal size.

The same line of research has also found that different categories of objects activate different regions of the brain. For instance, faces activate the fusiform face area, houses and scenes engage the parahippocampal area and the transverse occipital sulcus, and letter strings and words activate the visual word form area.

In the case of recognising faces, there is even an additional complex interplay of different brain regions, as a face needs to be related to the person's identity and its corresponding emotional component.

Prosopagnosia – A Condition Worth Noting

As mentioned earlier, the ability to recognise faces is of utmost importance to the sighted population. However, some individuals have a condition known as prosopagnosia (Greek: prosopon = “face,” agnosia = “not knowing”), or "face blindness" (Benton & Van Allen, 1968). People with prosopagnosia can describe facial attributes such as age, emotional expression, and gender without difficulty, but they struggle to recognise familiar faces, including their own. To compensate, individuals with face blindness often rely on additional cues like voice, gait, clothing, or hairstyle to link a face with a name.

Recognising a face involves intricate processes, primarily relying on two key mechanisms: configural processing and featural processing. While configural processing entails assembling various facial elements into a unified, higher-level representation, featural processing deals with recognising specific facial features.

In healthy individuals, these mechanisms combine to enable facial recognition. For individuals with prosopagnosia the picture is different, and exactly which part of the process fails is still not well understood. One hypothesis suggests that these individuals struggle predominantly with configural processing: featural processing still functions, but the ability to piece facial elements together into a coherent whole seems to be compromised.

The Psychological Underpinnings of Recognising Faces

In addition to examining prosopagnosia from a neurological standpoint, we can also approach it from a psychological angle, as illustrated in a case study by Wegrzyn, Garlichs, Heß, Woermann, and Labudda (2019). In this study, the individual being studied articulates a notion that most sighted individuals might find hard to grasp:

“It simply never occurred to me that one could recognise people only by their face”.

This study corroborated some of the neurological findings: the subject showed impaired recognition of identity for both unfamiliar and familiar faces, but no difficulty recognising expressions, intentions or attractiveness from faces. However, when the contrast between featural and configural processing was investigated by manipulating images with low-pass and high-pass filters to emphasise different facial aspects, the subject performed on par with individuals without prosopagnosia. This result does not sit neatly with the hypothesis of a predominantly configural deficit, and it suggests that additional research is needed in this domain.
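To make the idea of low-pass and high-pass filtering concrete, here is a minimal sketch. It is not the procedure used in the study: a simple 3x3 box blur stands in for a low-pass filter, the residual (image minus blur) stands in for a high-pass filter, and the tiny "image" is made up. Low-pass versions keep the coarse layout of an image, which is often associated with configural information, while high-pass versions keep fine detail, which is often associated with featural information.

```python
# Illustrative only: a 3x3 box blur is assumed as the low-pass filter and the
# residual (image minus blur) as the high-pass filter.

def low_pass(image):
    """Replace each pixel with the mean of its 3x3 neighbourhood (borders clamped)."""
    rows, cols = len(image), len(image[0])
    blurred = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            total = 0.0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr = min(max(r + dr, 0), rows - 1)  # clamp at the border
                    cc = min(max(c + dc, 0), cols - 1)
                    total += image[rr][cc]
            out_row.append(total / 9.0)
        blurred.append(out_row)
    return blurred


def high_pass(image):
    """Keep only what the blur removed: fine detail such as sharp edges."""
    blurred = low_pass(image)
    return [[p - b for p, b in zip(img_row, blur_row)]
            for img_row, blur_row in zip(image, blurred)]


# A made-up 4x4 "image": dark on the left, bright on the right.
toy_image = [
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
]

coarse = low_pass(toy_image)  # every row becomes [0.0, 3.0, 6.0, 9.0]
fine = high_pass(toy_image)   # every row becomes [0.0, -3.0, 3.0, 0.0]
```

The blurred version turns the sharp boundary into a smooth ramp, while the residual is non-zero only next to the boundary. Adding the two back together recovers the original image.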

How Artificial Intelligence is Learning to Recognise Faces

After a discussion of face recognition (or face blindness) from both a neurological and psychological perspective, it is interesting to try and build bridges with artificial intelligence technology.

From a technical perspective, in a computer vision scenario, face recognition can be treated as a detection and classification problem: first a person’s face is detected in the image, and then the detected face is associated with a person’s name.
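The second stage, associating a detected face with a name, can be sketched as follows. This is one common way of doing it and not necessarily the one the author has in mind. It assumes that an earlier stage has already detected the face and that a network has turned it into a short numeric vector. The names, vectors, and threshold below are all invented for illustration.

```python
import math


def cosine_similarity(a, b):
    """Similarity of two vectors: 1.0 means they point in the same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def identify(face_vector, gallery, threshold=0.8):
    """Return the best-matching name, or None if no known face is similar enough."""
    best_name, best_score = None, threshold
    for name, known_vector in gallery.items():
        score = cosine_similarity(face_vector, known_vector)
        if score >= best_score:
            best_name, best_score = name, score
    return best_name


# Hypothetical gallery of known people (vectors are made-up numbers).
gallery = {
    "alice": [0.9, 0.1, 0.0],
    "bob": [0.0, 0.2, 0.9],
}

print(identify([0.8, 0.2, 0.1], gallery))  # alice
print(identify([0.5, 0.5, 0.5], gallery))  # None (no known face is close enough)
```

Returning None for an unfamiliar face matters in practice: without a threshold, every face would be forced onto the nearest known name.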

Convolutional Neural Networks (CNNs) are pivotal tools for face classification due to their innate capability to autonomously extract pertinent features. By engaging in a sequence of convolutional and pooling layers, they excel at capturing spatial patterns and feature hierarchies. In a hierarchical fashion, these networks acquire a nuanced understanding of distinctive facial attributes, encompassing elements like edges, textures, and intricate shapes.
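A single convolution-and-pooling step can be sketched in a few lines of plain Python. This is only an illustration under simplifying assumptions: the kernel is hand-picked to respond to vertical edges, whereas a real CNN learns its kernels from data, and the 5x5 "image" is made up.

```python
def conv2d(image, kernel):
    """2D cross-correlation without padding, the operation used in CNN layers."""
    k_rows, k_cols = len(kernel), len(kernel[0])
    out_rows = len(image) - k_rows + 1
    out_cols = len(image[0]) - k_cols + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(k_rows) for j in range(k_cols))
            for c in range(out_cols)
        ]
        for r in range(out_rows)
    ]


def relu(feature_map):
    """Keep positive responses and set negative ones to zero."""
    return [[max(0, v) for v in row] for row in feature_map]


def max_pool(feature_map, size=2):
    """Non-overlapping max pooling; rows/columns that do not fill a window are dropped."""
    rows = len(feature_map) // size
    cols = len(feature_map[0]) // size
    return [
        [
            max(feature_map[r * size + i][c * size + j]
                for i in range(size) for j in range(size))
            for c in range(cols)
        ]
        for r in range(rows)
    ]


# Made-up 5x5 image: dark on the left, bright on the right.
image = [[0, 0, 1, 1, 1] for _ in range(5)]

# Hand-picked kernel that responds to dark-to-bright vertical edges.
vertical_edge = [[-1, 1],
                 [-1, 1]]

edges = relu(conv2d(image, vertical_edge))  # 4x4 map, every row is [0, 2, 0, 0]
pooled = max_pool(edges)                    # 2x2 map: [[2, 0], [2, 0]]
```

The convolution responds only where the edge is, and pooling shrinks the map while keeping that response. Stacking many such steps is what lets later layers combine simple patterns into more complex ones.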

As this description shows, similarly to what is believed to happen in the ventral cortex, CNNs have intrinsic capabilities to explore both featural components (lower-level features captured in the early layers) and configural components (higher-level combinations of features captured in the deeper layers), as presented in the figure below.

*Picture derived from (Zeiler & Fergus, 2014)

A further connection can be drawn between the tolerance to rotation, positioning, and scaling established for object recognition within the ventral cortex and the comparable, if only approximate, robustness that Convolutional Neural Networks (CNNs) acquire through pooling and through training on varied examples, as illustrated in the figure below.

*Picture adapted from https://medium.com/analytics-vidhya/cnn-convolutional-neural-network-8d0a292b4498
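The role pooling plays in this robustness can be shown with a one-dimensional toy example. It is a sketch under simplifying assumptions (a single made-up feature response and a pooling window of two) and it illustrates tolerance to small shifts in position only.

```python
def max_pool_1d(values, size=2):
    """Non-overlapping max pooling over a list of feature responses."""
    return [max(values[i:i + size])
            for i in range(0, len(values) - size + 1, size)]


original = [0, 0, 5, 0, 0, 0]        # a feature detected at position 2
shifted_within = [0, 0, 0, 5, 0, 0]  # moved by one, still in the same window
shifted_across = [0, 0, 0, 0, 5, 0]  # moved by two, now in the next window

print(max_pool_1d(original))        # [0, 5, 0]
print(max_pool_1d(shifted_within))  # [0, 5, 0]  (unchanged by the small shift)
print(max_pool_1d(shifted_across))  # [0, 0, 5]  (a larger shift does change the output)
```

Pooling by itself absorbs only small shifts. Robustness to larger shifts, rotation, and changes in scale generally has to come from deeper stacks of layers and from training on examples that show faces in many positions, orientations, and sizes.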

Furthermore, it is intriguing to draw a parallel between the specialisation of distinct brain regions responsible for face recognition and the training process of Convolutional Neural Networks. CNNs usually perform best when they are trained on extensive datasets. To identify faces accurately, an effective CNN must analyse both the overall configuration of a face and its specific features, and it must establish connections between these configural and featural components within the network.

Much like the diverse specialisation of different brain areas and subareas in recognising faces, objects, or scenes, it is reasonable to expect that CNN models tailored for facial recognition would outperform those trained for more generalised tasks encompassing a wide range of contexts, including faces and common objects.

Final thoughts

Facial recognition is not just a technological marvel. It is a complex interplay of neurological, psychological, and computational processes. From the ventral cortex lighting up in fMRI scans to the struggles of individuals with prosopagnosia, the ability to recognise faces is a multi-faceted phenomenon that still holds many mysteries. Artificial intelligence, particularly CNNs, offers a glimpse into how we might bridge the gap between human cognition and machine learning. As we've seen, the hierarchical nature of CNNs and their ability to process both configural and featural aspects of faces make them a promising tool for advancing facial recognition technology. However, it's essential to remember that while machines may mimic human abilities, the emotional and social nuances involved in face recognition are uniquely human. As we continue to explore this fascinating subject, one thing is clear: the journey to fully understanding facial recognition is a road that stretches across the landscapes of psychology, neuroscience, and artificial intelligence.

Research

Benton, A., & Van Allen, M. (1968). Impairment in facial recognition in patients with cerebral disease. Cortex, 4, 344–358.

Wegrzyn, M., Garlichs, A., Heß, R. W., Woermann, F. G., & Labudda, K. (2019). The hidden identity of faces: a case of lifelong prosopagnosia. BMC Psychology, 7.

Behrmann, M., & Mark, V. (2018). Visual Object Recognition. In Stevens’ Handbook of Experimental Psychology and Cognitive Neuroscience (pp. 491–520). New York: John Wiley & Sons, Inc.

Zeiler, M. D., & Fergus, R. (2014). Visualizing and Understanding Convolutional Networks. In ECCV 2014, LNCS 8689 (pp. 818–833). Switzerland.
