Video Surveillance & Physical Security Industry Viewpoints
March 4th, 2020
Author: Tamar Volcani

How Does Artificial Intelligence Technology Recognize Faces?

An Explanation of Facial Recognition Technology

Facial recognition, a biometric technology, is increasingly used to identify people who appear in a photo or video. In simple terms, facial recognition refers to the ability of software to recognize a person’s identity from an image of their face. Historically, face recognition solutions would map out a human face, measure its geometry (such as the distance between the eyes, the shape of the cheeks, and the depth of the eye sockets), and then recognize other faces that matched the unique biometric characteristics of the original face. Today, face recognition is powered by Deep Learning, a subset of artificial intelligence (AI), whereby a network is trained through exposure to tagged data. Imitating the way a human is taught and learns, the network becomes better able to detect, identify and classify data as it is exposed to more data over time. Face recognition based on Deep Learning enables software to extract unique identity features from an input face image and match them to a bank of reference features in order to determine the identity of the individual in question. While the exact inner workings of machine-developed algorithms aren’t entirely transparent, significant deep learning research and hardware development have enabled facial recognition algorithms to reach superhuman performance, identifying face matches in scenarios where humans could not.
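To make the historical, geometry-based approach more concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not any vendor's actual method: the landmark names, the coordinates, and the helper functions `geometric_signature` and `signature_distance` are all hypothetical. The idea is to turn a handful of facial landmark points into a list of distances, normalized by the distance between the eyes so that the result does not depend on how large the face appears in the image.

```python
import math


def geometric_signature(landmarks):
    """Turn named 2D landmark points into scale-normalized pairwise distances.

    `landmarks` maps a landmark name to an (x, y) pixel position and must
    contain "left_eye" and "right_eye" (names assumed for this sketch).
    """
    names = sorted(landmarks)
    # Divide by the inter-ocular distance so image scale cancels out.
    scale = math.dist(landmarks["left_eye"], landmarks["right_eye"])
    return [
        math.dist(landmarks[a], landmarks[b]) / scale
        for i, a in enumerate(names)
        for b in names[i + 1:]
    ]


def signature_distance(sig_a, sig_b):
    """Euclidean distance between two signatures; smaller means more similar."""
    return math.dist(sig_a, sig_b)
```

Two images of the same face at different sizes should yield nearly identical signatures, while a face with different proportions should yield a larger distance. Modern Deep Learning systems replace these hand-picked measurements with features learned from data.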

Matching Faces with Images

For facial recognition, video intelligence software uses Deep Learning to detect faces in live or recorded video and compare them against a database or watchlist of extracted facial features of identified persons of interest. It is used not only to detect and alert to potential threats or validate the entry of authorized individuals, but also to accelerate searches for missing persons and post-event law enforcement or physical security investigations.

When a face match occurs for a person of interest entering a facility, the video content analysis system can trigger a real-time alert to security staff to increase their situational awareness and empower them to quickly consider how best to respond, whether by monitoring more closely, approaching the individual or even apprehending them. This scenario is an example of “in the wild” face matching. There are two main types of facial recognition scenarios: cooperative access control scenarios and non-cooperative, “in the wild” surveillance scenarios. Cooperative access control refers to controlled settings, where the facial recognition system compares a single face to a photo of the person from a passport, license, or other identification card. This type of recognition is called 1:1 matching or “verification,” because the acquired face is being matched to a single predetermined reference image. An example of this is a face scanner for granting access to a secure area, such as an airport security gate.
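As a rough illustration of 1:1 verification, the sketch below compares one acquired feature vector against one reference vector and accepts the match if their similarity clears a threshold. Everything here is an assumption made for the example: the three-number feature vectors, the use of cosine similarity as the score, the function names, and the 0.8 threshold. Real systems use much longer vectors produced by a trained network, and they tune the threshold to balance false accepts against false rejects.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def verify(probe, reference, threshold=0.8):
    """1:1 verification: does the acquired face match the single reference?

    The threshold is an illustrative value, not a recommended setting.
    """
    return cosine_similarity(probe, reference) >= threshold


# Hypothetical feature vectors: one from the ID photo, one from the gate camera.
id_photo = [0.90, 0.10, 0.40]
gate_capture = [0.88, 0.12, 0.41]
access_granted = verify(gate_capture, id_photo)
```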

“In the wild” facial recognition involves the use of CCTV cameras that monitor an area. Often, “in the wild” facial recognition will have no previous identifying data about the person to be recognized (i.e., no ID card or passport with a photo). In such cases, the face is detected and extracted from recorded footage or video evidence, or taken from an external source, and added to a watchlist for future detection. The facial recognition system must try to match each detected face against an entire watchlist (or a large subset of it) to detect the specific persons of interest. This is referred to as 1:N matching (with N being the size of the watchlist that is under comparison) or “identification.”
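The 1:N case can be sketched in the same spirit: each detected face is scored against every entry on the watchlist, and only the best-scoring entry is reported, provided it clears a threshold. The watchlist layout, identity labels, feature vectors, similarity measure, and threshold below are all hypothetical choices for illustration, not a description of how any particular product works.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def identify(probe, watchlist, threshold=0.8):
    """1:N identification: compare one detected face against N watchlist entries.

    `watchlist` maps an identity label to its reference feature vector.
    Returns (identity, score) for the best match at or above the threshold,
    or None when nobody on the watchlist is similar enough.
    """
    best_identity, best_score = None, -1.0
    for identity, reference in watchlist.items():
        score = cosine_similarity(probe, reference)
        if score > best_score:
            best_identity, best_score = identity, score
    if best_score >= threshold:
        return best_identity, best_score
    return None
```

Because every detected face is compared against all N entries, the work per face grows with the size of the watchlist, and so does the chance of a false match. This is one reason identification is harder than 1:1 verification.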

The Advantages of Video Content Analytics “In the Wild”

Unfortunately, “in the wild” environments sometimes lack the optimal lighting, camera positioning or video resolution needed to ensure a high level of face matching accuracy, or the subject in the video simply may not be looking directly at the camera. These factors make “in the wild” face recognition more challenging, which is one reason why comprehensive video content analytics software offers an advantage over point solutions for facial recognition. AI-based video content analytics technology detects, identifies, extracts, and catalogs objects in video footage based on classes and attributes, such as gender, appearance similarity, color, size, and direction of movement. This makes video searchable, actionable and quantifiable based on a broad set of data filters and combinations. For example, users may search video footage for “women with red hair, wearing a black jacket, walking east,” in scenarios where identifying the specific woman of interest is not possible. Advanced video analytics software also enables users to configure real-time alerts so that security or law enforcement staff can be notified when someone in a video camera feed matches a description. Similarly, when investigators lack a clear image of a missing person, video content analysis can help locate that person based on various attributes.
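Attribute-based search of this kind can be pictured as filtering a catalog of detected objects. The following sketch assumes a hypothetical record layout (`DetectedObject` and its field names are invented for the example) and shows how a query such as “women with red hair, wearing a black jacket, walking east” might translate into a combination of filters.

```python
from dataclasses import dataclass


@dataclass
class DetectedObject:
    """One cataloged detection; every field name here is an assumption."""
    object_class: str
    gender: str
    hair_color: str
    upper_wear_color: str
    direction: str
    timestamp: float  # seconds from the start of the footage


def search(objects, **criteria):
    """Return the detections whose attributes equal every given criterion."""
    return [
        obj for obj in objects
        if all(getattr(obj, name) == value for name, value in criteria.items())
    ]
```

A call such as `search(catalog, gender="female", hair_color="red", upper_wear_color="black", direction="east")` would return only the detections matching all four attributes. The same filter could be evaluated on each new detection from a live feed to decide whether to raise a real-time alert.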

As facial recognition technology continues to develop and become more accurate and robust, its adoption will continue to grow. With higher resolution video cameras and more sophisticated artificial intelligence solutions entering the video surveillance market, consumers can expect to see more facial recognition applications in law enforcement, cities, banks, airports, and retail environments in the coming years.