The Banuba facial detection and tracking technology is a unique product for mobile and web app developers. Its high performance is achieved due to:
Other solutions create filters by identifying 2D points on the face first, then applying nonlinear equations to create a 3D model of the head.
We establish a 3D model of the head directly, skipping the identification of 2D points. This makes the solution more accurate. The 3D math model (Face Kernel ™), developed by Banuba, reduces all possible transformations to a limited number of variables.
For data sets, Banuba uses complementary systems. Semi-supervised metric learning and generative adversarial networks (GANs) work together. We use hand-crafted, hand-tuned code alongside compilers, harnessing human ability where necessary.
Our in-house math models cut the execution time on a smartphone, reduce the learning time for the algorithm, and allow for a larger data set improving the quality of the operation. Banuba uses a rather unconventional form of deep learning, mixing CNNs and different variations of Random Forests.
We’ve developed unconventional types of neural network layers, tuned to specific architectures, namely Apple A9, A10, A11 CPUs and Android.
The Banuba AR SDK is supported even by constrained devices, providing effective and fast performance on 90% of smartphones.
This technology detects the face and head-pose movements. Once the face has been detected, the algorithm switches to head-pose tracking mode by using the position of the head on the previous image as the initial approximant. If the face is lost, the algorithm switches back to face detection mode.
Based on a directly inferred 3D model of the head (rather than one transformed from a 2D model), the technology is also operable even with a low SNR (signal to noise ratio) and poor lighting conditions. The incorporated model can forecast the appearance of the head in the subsequent frame, increasing stability and precision.
The size of the data set is 300,000 faces.
Thanks to this technology, it is possible to both “track” a person’s gaze, and control a smartphone’s function with eye movements. An algorithm detects micro-movements of the eye with subpixel accuracy in real-time. Based on that data, a vector of movement is created.
Banuba’s face recognition algorithm helps to measure the distance to various points on a scanned surface with a high degree of precision and to detect its shape. It can detect, for instance, whether the user’s eyes are open or closed.
Face Motion Capture involves scanning facial movements and converting them to computer animation for movies, games, or real-time avatars. It can operate either in the real-time or based on the user’s preliminarily saved data or/and their face-motion models. Derived from the movements of real people, the technology results in more realistic and nuanced computer character animation than if the animation were created manually.
Face segmentation is a specific computer vision task which assigns labels to facial features such as nose, mouth, eye, hair, etc., to each pixel in a face image. Our face segmentation techniques include complex cascaded machine learning algorithms in combination with color model and Monte Carlo approaches. This leads to precise detection of eyes, their structure (iris, pupil, eyeball, etc.) and the nose, ear, cheek, chin, mouth, lip eyebrow and forehead.
Facial anthropometry refers to the measurement of the individual facial features. Our novel algorithm automatically detects a set of anthropometric facial fiducial points that are associated with these features. This makes it possible to recognize refined facial patterns and recreate detailed semantics, mimics and anthropometrics.
Banuba has developed a library for detecting hair and skin color for iOS. The area above the person’s forehead is used for hair color detection as it is less sensitive to head’s turns than other areas of hair. The technology recognises sharp intensity patterns above the hairline and analyzes color. There will be no such sharp intensity patterns for bald individuals, so our technology also allows us to detect the lack of hair.
To detect skin color samples are taken from areas of the face that algorithms point to as specifically being face skin.
This technology is based on convolutional neural networks. The learning sample selection consists images of men and women, categorised by their hairstyle. Learning is the process of matching an image with the most relevant hairstyle sample.
Semi-supervised metrics learning was used for the creation of the data set. Subsequently, a GAN was trained for the expansion of the data set, upon which, the final network was trained. To improve quality, a loss function, specifically selected for the task, was used.
People’s emotions are reflected on their faces, and with the ability to “read a face” it is possible to deliver more personalized content. Our technology allows us to detect six basic emotions; anger, disgust, fear, happiness, sadness, and surprise.
The variables of the model returned by the recognition algorithm can be used either directly, or transformed into parameters for another model, such as Facial Action Coding System (FACS), which detects muscle movements that correspond to specific emotions.
This technology is used for creating visually beautified images of users. Anthropometric data and mimics are analyzed and corrected in real time.
Banuba has developed a library to separate a person’s image from the background for Apple iOS.
This technology can be used for the real-time replacement of backgrounds with both static and animated textures. Backgrounds can be changed into a number of optional presets during video calls or used as an engaging effect in advertising.
The technology is based on convolutional neural networks, with color images as the input and a probability mask showing whether a pixel belongs to the class “person” or to the class “background.” This allows us to ensure high performance and results for real-time backgrounds.
The problem of lack of data sets for separation of objects from the background is resolved by the creation of a small initial data set, which is increased by active learning and subsequent fine-tuning.
A correctly selected data set helps to obtain optimal results and high-quality implementation.