philip lelyveld The world of entertainment technology


Google’s AI learns how actions in videos are connected

...scientists at Google propose Temporal Cycle-Consistency Learning (TCC), a self-supervised AI training technique that taps “correspondences” between examples of similar sequential processes (like weight-lifting repetitions or baseball pitches) to learn representations well-suited for temporal video understanding. The codebase is available in open source on GitHub.

As the researchers explain, footage that captures certain actions contains key common moments — or correspondences — that exist independent of factors like viewpoint changes, scale, container style, or the speed of the event. TCC attempts to find such correspondences across videos by leveraging cycle-consistency.
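The cycle-consistency idea can be illustrated with a minimal sketch: embed the frames of two videos, map a frame from the first video to its nearest neighbor in the second, then map that neighbor back. If the round trip returns to the original frame, the pair is cycle-consistent. This is an illustrative simplification (the function names and the use of plain NumPy arrays as per-frame embeddings are assumptions, not code from the TCC release, which learns the embedding with a differentiable version of this check):

```python
import numpy as np

def nearest_neighbor(query, frames):
    """Index of the frame embedding in `frames` closest to `query`
    (Euclidean distance)."""
    return int(np.argmin(np.linalg.norm(frames - query, axis=1)))

def is_cycle_consistent(i, video_u, video_v):
    """Frame i of video_u is cycle-consistent if hopping to its nearest
    neighbor in video_v and back lands on frame i again."""
    j = nearest_neighbor(video_u[i], video_v)   # u -> v
    k = nearest_neighbor(video_v[j], video_u)   # v -> u
    return k == i
```

Frames depicting the same moment of an action (say, the instant a pitcher releases the ball) should pass this check even when the two videos differ in viewpoint or speed; the training objective rewards embeddings that make such round trips succeed.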

...Moreover, they say that it can transfer metadata (like temporal semantic labels, sound, or text) associated with any frame in one video to its matching frame in another video, and that each frame in a given video could be used to retrieve similar frames by looking up the nearest neighbors in the embedding space.
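The metadata-transfer step described above amounts to a nearest-neighbor lookup in the learned embedding space: each frame of an unlabeled video inherits the label of its closest frame in a labeled video. A minimal sketch, again assuming frame embeddings as NumPy arrays (the helper name is hypothetical, not part of the TCC codebase):

```python
import numpy as np

def transfer_labels(src_embs, src_labels, tgt_embs):
    """Copy each target frame's label from its nearest source frame.

    src_embs: (n_src, d) embeddings of the labeled video's frames
    src_labels: list of n_src per-frame labels (e.g. temporal phase names)
    tgt_embs: (n_tgt, d) embeddings of the unlabeled video's frames
    """
    # Pairwise distances between every target and source frame: (n_tgt, n_src)
    dists = np.linalg.norm(tgt_embs[:, None, :] - src_embs[None, :, :], axis=-1)
    nearest = dists.argmin(axis=1)
    return [src_labels[i] for i in nearest]
```

The same `argmin` lookup, run in the other direction, implements the frame-retrieval use case: given a query frame, return its nearest neighbors across a collection of videos.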

See the full story here:
