jithin pradeep Cognitive Research Scientist | Artificial general intelligence enthusiast

Review Notes - DeepFace: Closing the Gap to Human-Level Performance in Face Verification


Face Alignment / Frontalization

The idea is to remove pose variations across the face images so that every face appears to look straight into the camera (“frontalized”).

Alignment pipeline

2D alignment
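
As a rough illustration of what a 2D alignment step does, the sketch below fits a similarity transform (scale, rotation, translation) that moves two detected eye centers onto fixed target positions. This is a two-point simplification chosen for brevity, not the procedure from [1]; the function names and all coordinates are hypothetical.

```python
import cmath

def fit_similarity(src_a, src_b, dst_a, dst_b):
    """Fit z -> a*z + b so that src_a maps to dst_a and src_b maps to dst_b.

    Points are (x, y) tuples treated as complex numbers x + iy.
    The complex factor a encodes scale and rotation; b is the translation.
    """
    p1, p2 = complex(*src_a), complex(*src_b)
    q1, q2 = complex(*dst_a), complex(*dst_b)
    a = (q2 - q1) / (p2 - p1)
    b = q1 - a * p1
    return a, b

def apply_similarity(a, b, point):
    """Map one (x, y) point through the fitted transform."""
    z = a * complex(*point) + b
    return (z.real, z.imag)

# Hypothetical eye centers in a tilted face crop, and hypothetical
# target positions on a level line inside a 152x152 output image.
a, b = fit_similarity((60, 80), (100, 60), (50, 60), (102, 60))
scale = abs(a)                 # how much the face is resized
rotation = cmath.phase(a)      # in-plane rotation in radians
```

Warping an actual image would apply the inverse of this mapping to every output pixel, which is left out here.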


Deep CNN architecture


The CNN receives the frontalized face images (152x152, RGB).

Convolution-pooling-convolution filtering
Locally-connected layers
Fully-connected layers
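
To make the three layer types listed above concrete, here is a small sketch that walks a 152x152 RGB input through such a stack and tallies output sizes and parameter counts. The kernel sizes, strides, channel counts, the pooling padding and the 4030-way output are assumptions, as I read them from the architecture figure in [1]; they are not stated in these notes. Under these assumptions the total is on the order of 10^8 parameters, almost all of it in the locally-connected and fully-connected layers, because a locally-connected layer keeps a separate filter bank at every output position instead of sharing one across the image.

```python
# All hyperparameters below are assumptions, not values from these notes.
INPUT_SIDE, INPUT_CHANNELS = 152, 3

# (name, kind, kernel, stride, pad, channels_out)
LAYERS = [
    ("C1", "conv", 11, 1, 0, 32),
    ("M2", "pool", 3, 2, 1, 32),
    ("C3", "conv", 9, 1, 0, 16),
    ("L4", "local", 9, 1, 0, 16),
    ("L5", "local", 7, 2, 0, 16),
    ("L6", "local", 5, 1, 0, 16),
]
DENSE = [("F7", 4096), ("F8", 4030)]  # (name, output units)

def out_side(side, kernel, stride, pad):
    """Spatial size after a square conv/pool window."""
    return (side + 2 * pad - kernel) // stride + 1

def summarize():
    """Return rows of (name, kind, output side, output channels, parameters)."""
    rows = []
    side, channels = INPUT_SIDE, INPUT_CHANNELS
    for name, kind, kernel, stride, pad, c_out in LAYERS:
        side = out_side(side, kernel, stride, pad)
        # Weights plus biases of one filter bank.
        per_position = kernel * kernel * channels * c_out + c_out
        if kind == "conv":
            params = per_position                # shared across positions
        elif kind == "local":
            params = side * side * per_position  # one bank per position
        else:
            params = 0                           # pooling has no weights
        rows.append((name, kind, side, c_out, params))
        channels = c_out
    units = side * side * channels               # flattened feature map
    for name, n_out in DENSE:
        rows.append((name, "dense", 1, n_out, units * n_out + n_out))
        units = n_out
    return rows

for name, kind, side, c_out, params in summarize():
    print(f"{name:3s} {kind:5s} out={side}x{side}x{c_out} params={params:,}")
```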

o The network takes one face image per input and is trained on the Social Face Classification (SFC) dataset as a multi-class classification problem, using a GPU-based engine that implements standard back-propagation for feed-forward nets with stochastic gradient descent (SGD).

o The net has more than 120 million parameters; training took three days for roughly 15 epochs.
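
The training setup described above (multi-class classification with back-propagation and SGD) can be sketched on a toy scale. The example below is a minimal illustration that uses a single linear softmax layer in place of the deep network; the function names, learning rate and toy data are hypothetical, and nothing here reflects the actual training engine.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of class scores."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probs, label):
    """Negative log-probability assigned to the correct class."""
    return -math.log(probs[label])

def sgd_step(weights, features, label, lr=0.1):
    """One SGD update on one sample; weights[k] is the row for class k.

    Returns the loss measured before the update.
    """
    logits = [sum(w * x for w, x in zip(row, features)) for row in weights]
    probs = softmax(logits)
    loss = cross_entropy(probs, label)
    for k, row in enumerate(weights):
        # Gradient of the loss w.r.t. logit k is (p_k - 1[k == label]).
        grad_logit = probs[k] - (1.0 if k == label else 0.0)
        for j, x in enumerate(features):
            row[j] -= lr * grad_logit * x
    return loss

# Toy problem: 3 identities, 2-dimensional features, weights start at zero.
weights = [[0.0, 0.0] for _ in range(3)]
losses = [sgd_step(weights, [1.0, 2.0], label=1) for _ in range(5)]
```

In the real setting each class is one identity, so the output layer has as many units as there are people in the training set.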

Face verification metrics


The network was trained on the Social Face Classification (SFC) dataset. This appears to be a Facebook-internal dataset with 4.4 million faces of about 4k people, each with 800 to 1200 faces; the most recent 5% of face images of each identity are held out for testing.
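
The hold-out rule described above, where the most recent 5% of each identity's images go to the test set, could be sketched as follows. The record layout (identity, timestamp, image id) and the choice to hold out at least one image per identity are assumptions made for illustration.

```python
from collections import defaultdict

def split_most_recent(records, test_fraction=0.05):
    """Split (identity, timestamp, image_id) records per identity.

    The newest test_fraction of each identity's images form the test set;
    everything older is used for training.
    """
    by_identity = defaultdict(list)
    for identity, timestamp, image_id in records:
        by_identity[identity].append((timestamp, image_id))

    train, test = [], []
    for identity, items in by_identity.items():
        items.sort()  # oldest first
        # Assumption: keep at least one test image per identity.
        n_test = max(1, round(len(items) * test_fraction))
        cut = len(items) - n_test
        train += [(identity, image_id) for _, image_id in items[:cut]]
        test += [(identity, image_id) for _, image_id in items[cut:]]
    return train, test

# Hypothetical toy data: timestamps are plain integers.
records = [("alice", t, f"a{t}") for t in range(20)]
records += [("bob", t, f"b{t}") for t in range(40)]
train, test = split_most_recent(records)
```

Splitting by time rather than at random means the test images are always newer than anything seen in training for that person.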


[1] Taigman, Y., Yang, M., Ranzato, M. A., & Wolf, L. (2014). DeepFace: Closing the gap to human-level performance in face verification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 1701-1708).

All results and images are taken directly from the reference paper [1] to aid understanding.