For this project I took on the role of AI engineer. This was the first project that involved some form of machine learning, and it took me a while to understand the basic steps needed for facial recognition. Once I had a poorly working prototype, I spent time comparing various algorithms for the detection and feature extraction phases. We settled on Histograms of Oriented Gradients (HOG) for detecting faces because it is fast. Another benefit of HOG was that it did not detect faces viewed from the side: we only wanted frontal faces, to ensure that the face embeddings were always of high quality. We compared different face embedding models (the models that extract usable features from a face) and found that Google's FaceNet had the best balance between speed and accuracy. Another very important step was matching the face embeddings to an entry in a database. We needed not only to find the closest embedding, but also to determine whether a new sample belonged to someone in the training set at all. This was important because people who had not been to the Sogeti office before still had to fill in their personal information. This is known as an “Open Set” problem, and we needed to research techniques that could solve it.
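To make the open set problem concrete, here is a minimal sketch of plain closest-embedding matching. It is not the project's code: the names `closest_identity`, `database`, `known_visitor` and `stranger` are hypothetical, and the three-dimensional vectors are toy stand-ins for real face embeddings. It shows that a nearest-match lookup always returns one of the enrolled names, even for a face that was never enrolled.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def closest_identity(query, database):
    """Return (name, distance) of the enrolled embedding nearest to query."""
    return min(
        ((name, euclidean(query, emb)) for name, emb in database.items()),
        key=lambda pair: pair[1],
    )


# Toy data (assumption): one embedding per enrolled person.
database = {"alice": [0.0, 0.0, 1.0], "bob": [1.0, 0.0, 0.0]}

known_visitor = [0.1, 0.0, 0.9]  # close to alice's embedding
stranger = [5.0, 5.0, 4.0]       # far from everyone in the database

name_known, dist_known = closest_identity(known_visitor, database)
name_stranger, dist_stranger = closest_identity(stranger, database)

# The stranger is still assigned an enrolled name; only the much larger
# distance hints that the match should not be trusted.
print(name_known, round(dist_known, 3))
print(name_stranger, round(dist_stranger, 3))
```

The distance is the only signal that separates the two cases, which is why an open set method needs an explicit rule for rejecting a match.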
Open Set Solutions
Finding a solution to the open set problem was the most interesting part of the project for me. It started with finding the correct term for the problem; armed with that, I could start reading papers on the subject. There were quite a few solutions, but most of them relied on deep learning. I did not yet have enough experience to implement deep learning algorithms from a paper. The Raspberry Pi also lacks computing power, so the preferred solution had to be lightweight. Apart from deep learning there were still a couple of good options: setting a threshold on the distance, an SVM with probability estimates, and a variation on k-nearest neighbors (Nearest Neighbors Distance Ratio). I had experience with implementing k-nearest neighbors, so this was manageable for me. I implemented all of these solutions so that they could be tested on two datasets: one that we created from scratch, and one that was based on YouTube data. The custom dataset was in line with the actual use case, but was very small. The YouTube dataset was not as high quality, but allowed us to test on a much larger volume of faces. From these tests I learned that a combination of the Nearest Neighbors Distance Ratio and the threshold on the distance gave the best result in terms of accuracy and computation time.
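The combination of a distance threshold and the Nearest Neighbors Distance Ratio can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in the project: the function name `classify_open_set`, the parameters `max_distance` and `max_ratio`, their default values, and the toy three-dimensional embeddings are all hypothetical. The ratio test compares the distance to the nearest sample with the distance to the nearest sample of a different person; a ratio close to 1 means the match is ambiguous.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def classify_open_set(query, samples, max_distance=0.8, max_ratio=0.7):
    """Return the matched label, or None when the query is treated as unknown.

    samples is a list of (label, embedding) pairs. Both thresholds are
    illustrative values and would have to be tuned on real data.
    """
    if not samples:
        return None

    # Sort all enrolled samples by distance to the query.
    ranked = sorted((euclidean(query, emb), label) for label, emb in samples)
    nearest_dist, nearest_label = ranked[0]

    # Check 1: absolute threshold on the distance to the nearest sample.
    if nearest_dist > max_distance:
        return None

    # Check 2: distance ratio against the nearest sample of another label.
    other_dist = next((d for d, lbl in ranked if lbl != nearest_label), None)
    if other_dist is None:
        # Only one person enrolled, so the ratio test cannot be applied.
        return nearest_label
    # Written without division so that a zero distance cannot cause an error.
    if nearest_dist >= max_ratio * other_dist:
        return None

    return nearest_label


# Toy data (assumption): two embeddings per enrolled person.
samples = [
    ("alice", [0.0, 0.0, 1.0]),
    ("alice", [0.1, 0.0, 0.9]),
    ("bob", [1.0, 0.0, 0.0]),
    ("bob", [0.9, 0.1, 0.0]),
]

print(classify_open_set([0.05, 0.0, 0.95], samples))  # clearly near alice
print(classify_open_set([5.0, 5.0, 4.0], samples))    # far from everyone
print(classify_open_set([0.5, 0.0, 0.5], samples))    # between alice and bob
```

The two checks reject different kinds of input: the threshold catches faces that are far from every enrolled sample, while the ratio catches faces that fall between two enrolled people. Both only need the sorted distances, which keeps the cost low on limited hardware.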