Type a person’s name into the Google image search engine and the results are likely to vary wildly. You may find pictures of the person you’re seeking, but you’re also likely to see completely irrelevant images just because their name appears on the same web page.
You might have better luck if your computer could analyze a picture of the person you want, then search through millions of other images — even hours of videotape — to find someone who looks identical or similar. Ideally, the computer could match the faces regardless of whether the subject is in bright or low light, is only partially facing the camera or is near or far.
That’s exactly what two Pacific Northwest National Laboratory (PNNL) researchers have done. Their algorithms analyze millions of video frames, pluck out the faces and quantify them to create searchable databases for facial identification.
“We’re measuring the information content of a face much like Google” analyzes written web material, says Harold Trease, a PNNL computational physicist. “What they do for text searching we’re trying to do for video and image processing.”
A program that picks faces out of streaming or recorded video and identifies them regardless of conditions could be useful in many areas, but for Trease and Rob Farber, a PNNL senior research scientist, it’s just a test case.
“It doesn’t have to be webcams,” Farber adds. “This is ‘a first toe in the water’ work” to prove the concept on massive amounts of unstructured data and high-performance computers. The algorithms could be generalized to work with almost any set of digital images to identify a variety of objects, including hidden roadside bombs and tumors.
Face recognition is especially tricky in conditions where light levels, size and angles change constantly. For instance, humans typically have few problems recognizing people regardless of whether they’re close or somewhat distant, but computers aren’t as adept. So facial recognition algorithms must have “scale invariance,” the ability to pick a face out of video regardless of its distance from the camera.
Likewise, a successful algorithm must have a degree of “rotation invariance” — the ability to distinguish faces that aren’t facing the camera head-on. And it must have “translational invariance” — the ability to extract faces or other target objects in a video even if they’re moving within the frame.
The first part of the algorithm, largely Trease’s work, starts with a raw red-green-blue (RGB) format video frame and transforms it to concentrate on the qualities of hue, saturation and intensity. The intensity parameter is discarded, allowing the algorithm to work regardless of lighting in the image.
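
To make the color-space step concrete, here is a minimal Python sketch of the idea, not the researchers’ actual code. It assumes the standard library’s HSV conversion as a stand-in for the hue-saturation-intensity transform described above, and the function names `hue_saturation` and `frame_to_hs` are hypothetical. Each pixel is converted, and the brightness channel is simply thrown away.

```python
import colorsys

def hue_saturation(pixel):
    """Map an 8-bit (r, g, b) pixel to (hue, saturation), dropping brightness.

    Assumption: HSV from the standard library stands in for the
    hue-saturation-intensity space named in the article.
    """
    r, g, b = (channel / 255.0 for channel in pixel)
    h, s, _brightness = colorsys.rgb_to_hsv(r, g, b)
    return h, s

def frame_to_hs(frame):
    """Convert a frame (rows of RGB pixels) to rows of (hue, saturation)."""
    return [[hue_saturation(pixel) for pixel in row] for row in frame]

# The same color under bright light and under light half as strong.
bright_hs = hue_saturation((200, 100, 50))
dim_hs = hue_saturation((100, 50, 25))

print(bright_hs)
print(dim_hs)
```

Because the dimmer pixel is just the brighter one scaled down, both map to essentially the same hue and saturation, which illustrates why discarding the brightness channel makes later steps less sensitive to lighting.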