What is facial recognition? Applications and how it works
Within the field of computer vision, facial recognition is an area of research and development that deals with giving machines the ability to recognize and verify human faces. Researchers primarily work on creating face recognition technology that can improve businesses and better human lives.
To help strengthen your understanding of the technology, this guide will explain what facial recognition is, how it works, its various applications and how accurate it is today.
What is facial recognition?
Facial recognition software has countless applications in consumer markets, as well as the security and surveillance industries.
There are two main tasks that facial recognition models perform. The first is verification, which is the task of comparing a new input face to a single known identity. A good example is unlocking a smartphone with facial identification. When you set up the system, the phone registers your face as that of the phone’s owner. From then on, unlocking only requires comparing each new input face to the registered face on the device.
The second is recognition, which is the task of comparing an input face to a database of multiple face identities. This task is often used in security and surveillance systems. A good example is facial recognition in law enforcement. The INTERPOL website has a forensics section that explains how the organization uses facial recognition to identify persons of interest at airports and border crossings.
How does face recognition work?
With a lot of interest in the field, data scientists develop new approaches to facial recognition every year. This section will briefly discuss the basics of how facial recognition models work and the main differences between two methods of creating facial embeddings.
At the very basic level, facial recognition models follow these steps:
- An input image is fed to the algorithm.
- The algorithm creates a facial embedding for the input image.
- The algorithm compares the input image’s facial embedding to the embeddings of known faces in the database.
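To make the comparison step concrete, here is a minimal sketch of both tasks described earlier: one-to-one verification and one-to-many recognition. It assumes the embeddings have already been computed. The function names, the three-dimensional toy vectors and the 0.6 distance threshold are all illustrative assumptions, not values from any particular system.

```python
import math


def euclidean_distance(a, b):
    """Straight-line distance between two equal-length embeddings."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def verify(probe, enrolled, threshold=0.6):
    """1:1 verification: is the probe close enough to one enrolled face?"""
    return euclidean_distance(probe, enrolled) <= threshold


def identify(probe, database, threshold=0.6):
    """1:N recognition: return the closest identity, or None if no entry is close enough."""
    best_name, best_dist = None, float("inf")
    for name, embedding in database.items():
        dist = euclidean_distance(probe, embedding)
        if dist < best_dist:
            best_name, best_dist = name, dist
    return best_name if best_dist <= threshold else None


# Hypothetical toy embeddings; real embeddings usually have far more dimensions.
database = {"alice": [0.1, 0.9, 0.3], "bob": [0.8, 0.2, 0.5]}
probe = [0.15, 0.85, 0.35]

print(verify(probe, database["alice"]))     # probe vs. one enrolled face
print(identify(probe, database))            # probe vs. the whole database
print(identify([5.0, 5.0, 5.0], database))  # a face far from every entry
```

In practice the threshold would be tuned on validation data, since it controls the trade-off between false accepts and false rejects.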
Training
Every approach has a different training method, and researchers often adjust or add elements to established methods in the field. However, many systems employ triplet loss to train the algorithm. In facial recognition, triplet loss works by feeding the algorithm three images at a time (see Figure 1).
Figure 1: Triplet loss
Two of the images are of person A and the remaining image is of person B. The algorithm creates a facial embedding of each image and then compares them.
After the comparison, the network is adjusted slightly so that the embeddings of person A are more similar to each other than they are to the embedding of person B. Over many iterations, this teaches the algorithm to rely on facial measurements that place images of the same person close together. The process is repeated hundreds of thousands or even millions of times with different triplets. Eventually, the network should be able to produce accurate facial embeddings for faces it has never seen before.
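A common way to write the quantity being minimized is max(0, d(anchor, positive) - d(anchor, negative) + margin), where the anchor and positive are the two images of person A and the negative is the image of person B. The sketch below computes that value for toy embeddings. The squared Euclidean distance, the margin of 0.2 and the two-dimensional vectors are assumptions chosen for illustration.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length embeddings."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Zero once the negative is at least `margin` farther from the anchor than the positive."""
    gap = squared_distance(anchor, positive) - squared_distance(anchor, negative)
    return max(0.0, gap + margin)


# Hypothetical embeddings: anchor and positive are person A, negatives are person B.
anchor = [0.0, 1.0]
positive = [0.1, 0.9]
near_negative = [0.2, 0.8]  # person B, but embedded too close to person A
far_negative = [1.0, 0.0]   # person B, already well separated

hard = triplet_loss(anchor, positive, near_negative)
easy = triplet_loss(anchor, positive, far_negative)
print(hard, easy)
```

Only the first triplet produces a non-zero loss, so only it would push the network to change. Triplets that are already well separated contribute nothing.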
Face embeddings
Before the algorithm can compare faces, the face images must be converted into data that the algorithm can understand. To do this, the system calculates measurements based on facial features and landmarks. Figure 2 is a visualization of 68 facial landmarks, also known as facial keypoints.
Figure 2: Face embeddings, image via Adrian Rosebrock at PyImageSearch.
Depending on the neural network, these landmarks may or may not be used in creating the embedding. Sometimes the landmarks are just used to crop the face image and remove noise in the background of the image.
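As a rough illustration of the cropping case, the sketch below derives a padded bounding box from a handful of landmark coordinates and clamps it to the image borders. The function name, the 10% padding and the sample coordinates are hypothetical. A real pipeline would take the landmarks from a detector and then slice the image array with the resulting box.

```python
def landmark_bounding_box(landmarks, image_width, image_height, margin_percent=10):
    """Return (left, top, right, bottom) around the landmarks, padded and clamped to the image."""
    xs = [x for x, _ in landmarks]
    ys = [y for _, y in landmarks]
    left, right = min(xs), max(xs)
    top, bottom = min(ys), max(ys)

    # Integer padding proportional to the size of the landmark region.
    pad_x = (right - left) * margin_percent // 100
    pad_y = (bottom - top) * margin_percent // 100

    return (
        max(0, left - pad_x),
        max(0, top - pad_y),
        min(image_width, right + pad_x),
        min(image_height, bottom + pad_y),
    )


# Hypothetical pixel coordinates: two eyes, nose tip, two mouth corners.
landmarks = [(120, 80), (180, 82), (150, 120), (130, 150), (172, 151)]
box = landmark_bounding_box(landmarks, image_width=320, image_height=240)
print(box)  # (114, 73, 186, 158)
```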
Deterministic face embedding
Essentially, by creating face embeddings you are converting a face image into numerical data. That data is then represented as a vector in a latent semantic space. The closer two embeddings are to each other in the latent space, the more likely they are to be of the same person.
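One common way to measure closeness in the latent space is cosine similarity, which compares the direction of two vectors after scaling them to unit length. The snippet below is a small sketch of that idea. The vectors are made-up values, and the helper names are assumptions rather than any library's API.

```python
import math


def l2_normalize(v):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def cosine_similarity(a, b):
    """Dot product of the unit-length vectors; 1.0 means identical direction."""
    a, b = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a, b))


# Hypothetical embeddings: two photos of one person, and one photo of someone else.
same_person = cosine_similarity([0.2, 0.9, 0.4], [0.25, 0.85, 0.45])
different_people = cosine_similarity([0.2, 0.9, 0.4], [0.9, 0.1, 0.3])
print(same_person > different_people)  # True
```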
However, the accuracy of models that use deterministic face embeddings depends on the clarity of the input images. Furthermore, such models are often tested under constrained settings. In practice, input images (from surveillance video, for example) are often taken in unconstrained or uncontrolled settings. The image quality may be low or portions of the face may be covered in the image. In such cases, approaches that use deterministic face embeddings suffer in performance.
Probabilistic face embedding
In their paper, Yichun Shi and Anil K. Jain proposed an alternative approach to facial recognition: probabilistic face embeddings (PFEs). PFEs represent each face image as a distribution rather than a point estimate.
Figure 3: Probabilistic face embedding, image via Shi and Jain.
The image above provides a good visualization of the difference between deterministic and probabilistic embeddings. On the left, the facial embedding in the latent space is represented as a point estimate, with no indication of how uncertain the embedding is. With PFEs, by contrast, the mean of the distribution represents the best estimate of the facial features in the image, and the variance of the distribution represents the uncertainty.
Of course, with blurry images where the facial features are missing or ambiguous, there will be a higher degree of uncertainty.
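To see how uncertainty can enter the comparison, the sketch below treats each face as a Gaussian with a mean and a per-dimension variance. It scores a pair by the log-likelihood that both images share the same latent features. This follows the general idea of the mutual likelihood score described in the PFE paper, but the exact form used here, the function name and the toy numbers are assumptions for illustration.

```python
import math


def mutual_likelihood_score(mu_a, var_a, mu_b, var_b):
    """Log-likelihood that two diagonal-Gaussian embeddings share one latent point."""
    total = 0.0
    for ma, va, mb, vb in zip(mu_a, var_a, mu_b, var_b):
        combined_var = va + vb  # the uncertainties of both images add up
        total += (ma - mb) ** 2 / combined_var + math.log(combined_var)
    dims = len(mu_a)
    return -0.5 * total - 0.5 * dims * math.log(2 * math.pi)


# Hypothetical two-dimensional embeddings.
mean = [0.3, 0.7]
sharp = [0.01, 0.01]  # low variance: a clear image
blurry = [0.5, 0.5]   # high variance: a blurry or occluded image

confident_match = mutual_likelihood_score(mean, sharp, mean, sharp)
uncertain_match = mutual_likelihood_score(mean, blurry, mean, blurry)
mismatch = mutual_likelihood_score(mean, sharp, [0.9, 0.1], sharp)
print(confident_match > uncertain_match > mismatch)  # True
```

Two identical means score lower when both images are uncertain, which is how a blurry input is prevented from producing an overconfident match.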
How accurate is facial recognition?
We’ve discussed what facial recognition is and how it works. However, perhaps the more important questions are: Can we rely on it? How accurate is the facial recognition tech that we have today?
There is likely no single concrete answer to those questions. Facial recognition accuracy varies depending on the algorithm and the conditions under which it is tested. An accuracy of 99.63% may sound high, but whether it is acceptable is a matter of scale and context. For example, when we look at that number in the context of airport security, it’s difficult to decide on an acceptable margin of error.
The Hartsfield–Jackson Atlanta International Airport receives over 260,000 passengers daily. At such a scale, most security officials would not accept even a 0.37% error rate. However, researchers are constantly exploring new approaches to facial recognition. As computing power increases, facial recognition accuracy should continue to improve as well.
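A quick back-of-the-envelope calculation shows why. It assumes, purely for illustration, that every passenger is scanned exactly once and that the 0.37% error rate applies uniformly to all of them.

```python
daily_passengers = 260_000
error_rate_basis_points = 37  # 0.37% expressed as 37 per 10,000

# Integer arithmetic avoids floating-point rounding in the result.
expected_errors_per_day = daily_passengers * error_rate_basis_points // 10_000
print(expected_errors_per_day)  # 962
```

Under those assumptions, close to a thousand passengers a day would be misidentified at a single airport.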
What is facial recognition used for?
The applications of the technology are vast and varied. The following examples are just a few of the most interesting ways many businesses use facial recognition today.
Augmented reality
Numerous popular smartphone applications rely on facial recognition. Well-known examples include the face filters on Instagram, Snapchat and LINE. By locating the user’s facial landmarks, apps using extended reality can accurately superimpose image filters over the user’s face in real time.
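As a simplified sketch of how landmarks can drive an overlay, the snippet below uses just the two eye positions to work out where a filter should be centered, how much it should be rotated and how large it should be. The function name and coordinates are hypothetical, and production filters typically use many more landmarks and a full 3D face model.

```python
import math


def filter_placement(left_eye, right_eye):
    """Return (center, roll angle in degrees, inter-eye distance) for placing an overlay."""
    dx = right_eye[0] - left_eye[0]
    dy = right_eye[1] - left_eye[1]
    center = ((left_eye[0] + right_eye[0]) / 2, (left_eye[1] + right_eye[1]) / 2)
    angle = math.degrees(math.atan2(dy, dx))  # head tilt in the image plane
    eye_distance = math.hypot(dx, dy)         # used to scale the overlay
    return center, angle, eye_distance


# Hypothetical pixel coordinates of the two eye centers in one video frame.
center, angle, eye_distance = filter_placement((100, 120), (160, 120))
print(center, angle, eye_distance)  # (130.0, 120.0) 0.0 60.0
```

Recomputing these three values on every frame is what keeps the filter attached to the face as the user moves.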
Cashless payments
While facial recognition payments are not yet available in most countries, numerous merchants in China now accept them. Moreover, on October 16th, 2019, SnapPay announced the launch of its face recognition payment technology in North America.
Security gates
Another application of this technology involves security gates or doors. Whether it be the entrance to your apartment complex, the front lobby of your office, or even train station ticket barriers, facial recognition technology can be used to allow or deny entry. While this technology isn’t that common yet in most countries, many businesses in China seem to be warming to the technology rather quickly.
As you can see, there are numerous beneficial applications of facial recognition. As the accuracy of the models increases, more and more countries will likely adopt the technology into their infrastructure.