Indexing of Fictional Video Content for Event Detection and Summarisation

Lehane, Bart; O'Connor, Noel E; Lee, Hyowon; Smeaton, Alan F

doi:10.1155/2007/14615

Research Article
Open access
Published: 03 October 2007

EURASIP Journal on Image and Video Processing volume 2007, Article number: 014615 (2007)

Abstract

This paper presents an approach to movie video indexing that uses audiovisual analysis to detect important and meaningful temporal video segments, which we term events. We consider three event classes: dialogues, action sequences, and montages, where the last also includes musical sequences. These three event classes are intuitive for a viewer to understand and recognise, and together account for over 90% of the content of most movies. To detect events, we leverage traditional filmmaking principles and map these to a set of computable low-level audiovisual features. Finite state machines (FSMs) are used to detect when temporal sequences of specific features occur. A set of heuristics, again inspired by filmmaking conventions, is then applied to the output of multiple FSMs to detect the required events. A movie search system built upon this approach, named MovieBrowser, is also described. The overall approach is evaluated against a ground truth of over twenty-three hours of movie content drawn from various genres and consistently obtains high precision and recall for all event classes. A user experiment designed to evaluate the usefulness of an event-based structure for both searching and browsing movie archives is also described, and its results indicate the usefulness of the proposed approach.
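
The abstract does not specify the features, states, or thresholds used, so the following is only an illustrative sketch of the general idea: a small finite state machine scans a per-shot boolean feature and reports runs where that feature dominates, and a simple rule then combines the output of two such machines. The function names, the three states, the parameters `min_run` and `max_gap`, and the overlap rule are all assumptions made for illustration and are not taken from the paper.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # (first_shot_index, last_shot_index), inclusive


def detect_sequences(flags: List[bool], min_run: int = 3, max_gap: int = 2) -> List[Interval]:
    """Hypothetical FSM over per-shot feature flags.

    idle      -> no sequence in progress
    candidate -> counting consecutive positive shots
    active    -> sequence confirmed; tolerates up to max_gap negative shots
    """
    sequences: List[Interval] = []
    state = "idle"
    start = 0     # first shot of the current candidate or sequence
    last_hit = 0  # most recent shot where the feature was present
    run = 0       # consecutive positive shots seen while in "candidate"
    for i, hit in enumerate(flags):
        if state == "idle":
            if hit:
                start, last_hit, run = i, i, 1
                state = "active" if run >= min_run else "candidate"
        elif state == "candidate":
            if hit:
                run += 1
                last_hit = i
                if run >= min_run:
                    state = "active"
            else:
                state = "idle"  # run broken before it was confirmed
        else:  # state == "active"
            if hit:
                last_hit = i
            elif i - last_hit > max_gap:
                sequences.append((start, last_hit))
                state = "idle"
    if state == "active":
        sequences.append((start, last_hit))
    return sequences


def intersect(a: List[Interval], b: List[Interval]) -> List[Interval]:
    """Assumed combination rule: keep the parts where two FSM outputs overlap."""
    out: List[Interval] = []
    for s1, e1 in a:
        for s2, e2 in b:
            lo, hi = max(s1, s2), min(e1, e2)
            if lo <= hi:
                out.append((lo, hi))
    return out


# Made-up per-shot flags for one feature (True = feature present in that shot).
F, T = False, True
feature_a = [F, T, T, T, F, T, F, F, F, T, F, T, T, T, T]
runs_a = detect_sequences(feature_a)
runs_b = [(4, 12)]  # pretend output of a second FSM on another feature
candidates = intersect(runs_a, runs_b)
```

In this sketch a short interruption inside a confirmed run does not end it, which is one plausible way to make shot-level detection robust to isolated misclassified shots; the actual rules used by the authors may differ.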

[1-20]

References

Alatan AA, Akansu AN, Wolf W: Multi-modal dialogue scene detection using hidden Markov models for content-based multimedia indexing. Multimedia Tools and Applications 2001, 14(2):137-151. 10.1023/A:1011395131992
Bordwell D, Thompson K: Film Art: An Introduction. McGraw-Hill, New York, NY, USA; 1997.
Browne P, Smeaton AF, Murphy N, O'Connor NE, Marlow S, Berrut C: Evaluating and combining digital video shot boundary detection algorithms. Proceedings of Irish Machine Vision and Image Processing Conference (IMVIP '02), August-September 2002, Northern Ireland, UK
Cao Y, Tavanapong W, Kim K, Oh J: Audio-assisted scene segmentation for story browsing. Proceedings of the 2nd International Conference on Image and Video Retrieval (CIVR '03), July 2003, Urbana-Champaign, Ill, USA 446-455.
Chen L, Rizvi SJ, Özsu MT: Incorporating audio cues into dialog and action scene extraction. Storage and Retrieval for Media Databases, January 2003, Santa Clara, Calif, USA, Proceedings of SPIE 5021: 252-263.
Kender JR, Yeo B-L: Video scene segmentation via continuous video coherence. Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR '98), June 1998, Santa Barbara, Calif, USA 367-373.
Lehane B, O'Connor NE, Murphy N: Dialogue sequence detection in movies. Proceedings of the 4th International Conference on Image and Video Retrieval (CIVR '05), July 2005, Singapore 286-296.
Lehane B, O'Connor NE, Smeaton AF, Lee H: A system for event-based film browsing. The 3rd International Conference on Technologies for Interactive Digital Storytelling and Entertainment (TIDSE '06), December 2006, Darmstadt, Germany 334-345.
Lienhart R, Pfeiffer S, Effelsberg W: Scene determination based on video and audio features. Proceedings of the IEEE International Conference on Multimedia Computing and Systems, June 1999, Florence, Italy 1: 685-690.
Li Y, Kuo C-CJ: Movie event detection by using audio visual information. Proceedings of the 2nd IEEE Pacific Rim Conference on Advances in Multimedia Information Processing, October 2001, Beijing, China 198-205.
Li Y, Kuo C-CJ: Video Content Analysis Using Multimodal Information. Kluwer Academic Publishers, Dordrecht, The Netherlands; 2003.
Manjunath B, Salembier P, Sikora T: Introduction to MPEG-7, Multimedia Content Description Language. John Wiley & Sons, New York, NY, USA; 2002.
Rasheed Z, Shah M: Scene detection in Hollywood movies and TV shows. Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR '03), June 2003, Madison, Wis, USA 2: 343-348.
Rui Y, Huang TS, Mehrotra S: Constructing table-of-content for video. Multimedia Systems 1999, 7(5):359-368. 10.1007/s005300050138
Sundaram H, Chang S-F: Determining computable scenes in films and their structures using audio-visual memory models. Proceedings of the 8th ACM International Conference on Multimedia, October-November 2000, Los Angeles, Calif, USA 95-104.
The Internet movie database 2006. http://www.imdb.com/
Yeung M, Yeo B-L: Time constrained clustering for segmentation of video into story units. Proceedings of the 13th International Conference on Pattern Recognition, August 1996, Vienna, Austria 3: 375-380.
Yeung M, Yeo B-L: Video visualisation for compact presentation and fast browsing of pictorial content. IEEE Transactions on Circuits and Systems for Video Technology 1997, 7(5):771-785. 10.1109/76.633496
Zhai Y, Rasheed Z, Shah M: A framework for semantic classification of scenes using finite state machines. Proceedings of the International Conference on Image and Video Retrieval (CIVR '04), July 2004, Dublin, Ireland 279-288.
Zhai Y, Rasheed Z, Shah M: Semantic classification of movie scenes using finite state machines. IEE Proceedings: Vision, Image and Signal Processing 2005, 152(6):896-901. 10.1049/ip-vis:20045178

Author information

Authors and Affiliations

Centre for Digital Video Processing, Dublin City University, Dublin 9, Ireland
Bart Lehane & Hyowon Lee
Adaptive Information Cluster, Dublin City University, Dublin 9, Ireland
Noel E O'Connor & Alan F Smeaton

Corresponding author

Correspondence to Bart Lehane.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 2.0 International License (https://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article

Cite this article

Lehane, B., O'Connor, N., Lee, H. et al. Indexing of Fictional Video Content for Event Detection and Summarisation. J Image Video Proc 2007, 014615 (2007). https://doi.org/10.1155/2007/14615

Received: 30 September 2006
Revised: 22 May 2007
Accepted: 02 August 2007
Published: 03 October 2007
DOI: https://doi.org/10.1155/2007/14615