Inspired by analogous work in computerized text mining, the scientists have created an algorithm that weighs factors such as the presence of faces, the size of objects and their position in the frame to reduce, say, five hours of video to just 10 or 20 relevant frames. (For now, at least, sound is excluded.) The algorithm also enables the computer to detect whether the camera wearer is static, in transit, or just moving their head. The presence of hands (their proximity to objects, for example) is also used as a cue.
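To make the idea concrete, here is a minimal sketch of cue-based frame selection. It is not the researchers' algorithm: the `FrameFeatures` fields, the equal weights and the simple weighted sum are all assumptions chosen for illustration, and the features are taken as already computed by some upstream detector.

```python
from dataclasses import dataclass


@dataclass
class FrameFeatures:
    """Hypothetical per-frame cues, assumed to come from an upstream detector."""
    index: int               # position of the frame in the video
    has_face: bool           # a face is visible
    object_area: float       # share of the frame covered by the main object, 0..1
    center_offset: float     # distance of that object from the frame centre, 0..1
    hand_near_object: bool   # a hand is close to the object


def importance(f, w_face=1.0, w_area=1.0, w_center=1.0, w_hand=1.0):
    """Weighted sum of the cues; the weights are illustrative assumptions."""
    return (
        w_face * f.has_face
        + w_area * f.object_area
        + w_center * (1.0 - f.center_offset)  # objects nearer the centre score higher
        + w_hand * f.hand_near_object
    )


def select_keyframes(frames, k):
    """Return the indices of the k highest-scoring frames, in time order."""
    top = sorted(frames, key=importance, reverse=True)[:k]
    return sorted(f.index for f in top)


frames = [
    FrameFeatures(0, False, 0.1, 0.9, False),
    FrameFeatures(1, True, 0.5, 0.1, True),
    FrameFeatures(2, False, 0.2, 0.5, False),
    FrameFeatures(3, True, 0.3, 0.2, False),
]
keyframes = select_keyframes(frames, k=2)
```

In this toy input, the two frames that show a face outrank the others. A real system would presumably learn its weights from data rather than fix them by hand.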
The new technique appears effective, at least compared with earlier methods. In a blind test with 34 volunteers, a majority preferred summaries made using the Texas algorithm over those made by alternative approaches, such as selecting frames at regular intervals.
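The regular-interval baseline mentioned above is simple enough to sketch in a few lines. The function name `uniform_sample` and the choice to take the middle frame of each equal-length segment are assumptions; the article does not say exactly how the comparison summaries were built.

```python
def uniform_sample(num_frames, k):
    """Pick k frame indices spaced evenly across a video of num_frames frames.

    Assumption: the video is split into k equal segments and the frame
    at the middle of each segment is taken.
    """
    if k <= 0 or num_frames <= 0:
        return []
    k = min(k, num_frames)  # cannot pick more frames than exist
    step = num_frames / k
    return [int(i * step + step / 2) for i in range(k)]


baseline = uniform_sample(100, 4)
```

Because this baseline ignores what is in each frame, it can easily land on blurred or uneventful moments, which is the weakness a content-aware summary aims to avoid.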
See the full story here: http://online.wsj.com/article/SB10001424127887323808204579085241433259078.html