CONTENT BASED VIDEO RETRIEVAL BASED ON HDWT AND SPARSE REPRESENTATION

Sajad Mohamadzadeh; Hassan Farsi

doi:10.5566/ias.1346

Authors

Sajad Mohamadzadeh, University of Birjand
Hassan Farsi, University of Birjand

Keywords:

Content based video retrieval (CBVR), Hadamard matrix and discrete wavelet transform (HDWT), key frame extraction, shot boundary detection, sparse representation

Abstract

Video retrieval has recently attracted considerable research attention owing to the exponential growth of video datasets and the internet. Content based video retrieval (CBVR) systems are useful for a wide range of applications and can draw on several types of data, such as visual, audio, and metadata. In this paper, we use only the visual information from the video. Shot boundary detection, key frame extraction, and video retrieval are the three main parts of a CBVR system, and we modify existing methods and propose new ones for each of these parts. The local and global color, texture, and motion features of the video are extracted as features of the key frames. To evaluate the proposed technique against various methods, the P(1) metric and the CC_WEB_VIDEO dataset are used. The experimental results show that the proposed method provides better performance and requires less processing time than the other methods.
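
The three stages named in the abstract can be illustrated with a minimal sketch. It is not the authors' method: the abstract gives no algorithmic detail, so the sketch swaps in a plain gray-level histogram for the HDWT features and an L1 distance for the sparse-representation matching. The function names, the `threshold` value, and the toy frames (flattened lists of gray levels) are all assumptions made for illustration.

```python
from typing import List, Sequence

# Toy stand-in for a video frame: flattened grayscale pixels in 0..255 (assumption).
Frame = Sequence[int]


def histogram(frame: Frame, bins: int = 8) -> List[float]:
    """Normalized gray-level histogram (placeholder for the paper's features)."""
    counts = [0] * bins
    for pixel in frame:
        counts[min(pixel * bins // 256, bins - 1)] += 1
    return [c / len(frame) for c in counts]


def l1_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Sum of absolute differences between two histograms."""
    return sum(abs(x - y) for x, y in zip(a, b))


def detect_shot_boundaries(frames: List[Frame], threshold: float = 0.5) -> List[int]:
    """Return each index i at which frame i starts a new shot (hard cuts only)."""
    hists = [histogram(f) for f in frames]
    return [i for i in range(1, len(frames))
            if l1_distance(hists[i - 1], hists[i]) > threshold]


def extract_key_frames(frames: List[Frame], boundaries: List[int]) -> List[int]:
    """Pick the middle frame index of every shot as its key frame."""
    starts = [0] + list(boundaries)
    ends = list(boundaries) + [len(frames)]
    return [(s + e - 1) // 2 for s, e in zip(starts, ends)]


def retrieve(query: Frame, key_frames: List[Frame]) -> List[int]:
    """Rank key-frame indices by ascending histogram distance to the query."""
    q = histogram(query)
    return sorted(range(len(key_frames)),
                  key=lambda i: l1_distance(q, histogram(key_frames[i])))


# Toy video: three dark frames followed by two bright frames.
dark, bright = [10] * 16, [240] * 16
video = [dark, dark, dark, bright, bright]
cuts = detect_shot_boundaries(video)
keys = extract_key_frames(video, cuts)
ranking = retrieve(bright, [video[k] for k in keys])
```

On this toy input the single cut separates the dark shot from the bright one, one key frame is chosen per shot, and the bright query ranks the bright key frame first. A real system would replace `histogram` and `l1_distance` with the feature extractor and matcher described in the paper.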

Author Biographies

Sajad Mohamadzadeh, University of Birjand

End of Shahid Avini Street, Pardis Shokatabad, Department of Electronics and Communications Engineering, University of Birjand, Birjand, Iran

Hassan Farsi, University of Birjand

End of Shahid Avini Street, Pardis Shokatabad, Department of Electronics and Communications Engineering, University of Birjand, Birjand, Iran
