We demonstrate an attentional selection system for processing video streams from remotely operated underwater vehicles (ROVs). The system identifies potentially interesting visual events spanning multiple frames based on low-level spatial properties of salient tokens, which are associated with those events and tracked over time. Frames that contain interesting events are labeled “interesting”; all other frames are labeled “boring”. By marking the interesting events and omitting the boring frames in the output stream, we augment the productivity of human video annotators or, alternatively, provide input for a subsequent object classification algorithm.
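The frame-labeling step described above can be illustrated with a minimal sketch. It assumes the events have already been detected and tracked, and that each one carries a precomputed flag saying whether it is interesting. The `Event` class, its fields, and the helper functions are hypothetical names chosen for illustration; the sketch does not reproduce the saliency detection or tracking itself, nor the criteria used to judge an event interesting.

```python
from dataclasses import dataclass


@dataclass
class Event:
    """A tracked visual event covering frames first_frame..last_frame (inclusive).

    Hypothetical structure: the `interesting` flag is assumed to have been
    computed upstream from the properties of the tracked salient tokens.
    """
    first_frame: int
    last_frame: int
    interesting: bool


def label_frames(num_frames, events):
    """Return one label per frame: 'interesting' if an interesting event covers it."""
    labels = ["boring"] * num_frames
    for ev in events:
        if not ev.interesting:
            continue
        # Clamp the event's span to the valid frame range.
        start = max(ev.first_frame, 0)
        stop = min(ev.last_frame, num_frames - 1)
        for i in range(start, stop + 1):
            labels[i] = "interesting"
    return labels


def keep_interesting(labels):
    """Indices of the frames that remain in the output stream."""
    return [i for i, label in enumerate(labels) if label == "interesting"]


# Example: one interesting event on frames 2-4, one uninteresting event on frames 6-7.
events = [Event(2, 4, True), Event(6, 7, False)]
labels = label_frames(8, events)
kept = keep_interesting(labels)
```

In this sketch a frame is kept whenever at least one interesting event overlaps it, so a frame shared by an interesting and an uninteresting event still reaches the annotator or the downstream classifier.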