"The input to the systems is an image or video 1110 that may or may not contain images of human faces." . . . .