Swifter: Improved Online Video Scrubbing

Justin Matejka, Tovi Grossman, George Fitzmaurice

January 2013 · Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI)

Abstract

Online streaming video systems have become extremely popular, yet navigating to target scenes of interest can be a challenge. While recent techniques have been introduced to enable real-time seeking, they break down for large videos, where scrubbing the timeline causes video frames to skip and flash too quickly to be comprehendible. We present Swifter, a new video scrubbing technique that displays a grid of pre-cached thumbnails during scrubbing actions. In a series of studies, we first investigate possible design variations of the Swifter technique, and the impact of those variations on its performance. Guided by these results we compare an implementation of Swifter to the previously published Swift technique, in addition to the approaches utilized by YouTube and Netflix. Our results show that Swifter significantly outperforms each of these techniques in a scene locating task, by a factor of up to 48%.


Figures

Figure 1. Scrubbing behavior of a traditional streaming video player, the Swift interface [16], and our new Swifter interface, which shows multiple frames around the active timeline location and allows for direct selection of each frame.

Figure 2. Screenshots of the YouTube (A) and Netflix (B) browser based video players.

Figure 3. Frames shown (dark lines) and skipped while moving over an 854-pixel-wide timeline in 2 seconds with an update rate of 30fps.

Figure 4. Frames shown (dark lines) and skipped while moving over an 854-pixel-wide timeline in 2 seconds with an update rate of 30fps with a YouTube-style controller.

Figure 5. Mechanics of selecting a thumbnail using either the direct or indirect selection method. The cell with the thick outline indicates the currently selected thumbnail.

Figure 6. Position of visible thumbnails as the playhead location updates in each of the scrolling techniques.

Figure 7. Transitioning from indirect to direct modes of frame selection.

Figure 8. Frames displayed while moving over an 854-pixel-wide timeline in 2 seconds with an update rate of 30fps with a Swifter grid of 5x5.

Figure 9. Thumbnail images from the movie The Intouchables, used for studies one and two. The low and high discernibility scenes are highlighted.

Figure 10. Scrolling techniques performance. (Note: error bars in all graphs report standard error)

Figure 11. Subjective preference results from Study One.

Figure 12. Selection technique usage based on scene length.

Figure 13. The grid dimension conditions in Study Two.

Figure 14. Task performance for High and Low Discernibility scenes with various grid sizes. The dashed lines represent the quadratic best-fit curves.

Figure 15. Subjective preference results for the grid dimensions for each of the discernibility conditions.

Figure 16. Pictographic representation of each of the techniques. ‘F’ represents the active frame, while the red and blue labels indicate the offset of frames before and after the selection.

Figure 17. Thumbnail images from The Adjustment Bureau, used in the third study. The high and low discernibility scenes are highlighted.

Figure 18. Completion time results from the third study.

Figure 19. Overall subjective feedback on the techniques used in Study Three.

Figure 20. Results for the question “Which technique did you like the best” for each discernibility/scene length pair.

BibTeX

@inproceedings{10.1145/2470654.2466149,
 abstract = {Online streaming video systems have become extremely popular, yet navigating to target scenes of interest can be a challenge. While recent techniques have been introduced to enable real-time seeking, they break down for large videos, where scrubbing the timeline causes video frames to skip and flash too quickly to be comprehendible. We present Swifter, a new video scrubbing technique that displays a grid of pre-cached thumbnails during scrubbing actions. In a series of studies, we first investigate possible design variations of the Swifter technique, and the impact of those variations on its performance. Guided by these results we compare an implementation of Swifter to the previously published Swift technique, in addition to the approaches utilized by YouTube and Netflix. Our results show that Swifter significantly outperforms each of these techniques in a scene locating task, by a factor of up to 48%.},
 address = {New York, NY, USA},
 author = {Matejka, Justin and Grossman, Tovi and Fitzmaurice, George},
 booktitle = {Proceedings of the SIGCHI Conference on Human Factors in Computing Systems},
 doi = {10.1145/2470654.2466149},
 isbn = {9781450318990},
 keywords = {video navigation, video, online streaming},
 location = {Paris, France},
 numpages = {10},
 pages = {1159--1168},
 publisher = {Association for Computing Machinery},
 series = {CHI '13},
 title = {Swifter: Improved Online Video Scrubbing},
 url = {https://doi.org/10.1145/2470654.2466149},
 year = {2013}
}