Paper:
Mining Appearance Models Directly from Compressed Video
Chen, D., Yang, J., Sun, M., Liu, Q.
The Twelfth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (SIGKDD'06), August 20-23, 2006

Abstract: In this paper, we propose an approach to learning appearance models of moving objects directly from compressed video. The appearance of a moving object changes dynamically in video due to varying body poses, lighting, and occlusions. Efficiently mining the appearance models of objects is a crucial and challenging technology for supporting content-based video coding, clustering, indexing, and retrieval at the object level. The proposed approach learns the appearance models of moving objects in the spatial-temporal dimensions of video data by taking advantage of MPEG compressed video. It detects a moving object and recovers the trajectory of each macro-block it covers using the motion vectors in the MPEG codec. The appearances are then reconstructed in the DCT domain following the object’s trajectory and modeled as a mixture of Gaussians (MoG) using DCT coefficients. We prove that, under certain assumptions, the MoG model learned in the DCT domain can achieve pixel-level accuracy after being transformed back to the spatial domain and has better band selectivity than the MoG model learned in the spatial domain. Finally, we cluster the MoG models to merge the appearance models of the same object for object-level content analysis.
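
The claim that statistics learned in the DCT domain can be carried back to the spatial domain rests on the 8x8 DCT being an orthonormal linear transform. The following is a minimal sketch of that one property, not the paper's proof or its full MoG procedure: it checks only that the mean of a single Gaussian component, estimated from DCT coefficients, maps back to the pixel-domain mean. The synthetic patches and all names (`dct2`, `idct2`, `mean_block`, `samples`) are assumptions made for illustration.

```python
import math
import random

N = 8  # MPEG applies the DCT to 8x8 blocks

# Orthonormal DCT-II basis matrix: C[k][n]
C = [[(math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N))
      * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
      for n in range(N)] for k in range(N)]


def matmul(a, b):
    """Multiply two N x N matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]


def transpose(a):
    return [list(row) for row in zip(*a)]


def dct2(block):
    """2-D DCT of an 8x8 block: C * X * C^T."""
    return matmul(matmul(C, block), transpose(C))


def idct2(coeffs):
    """Inverse 2-D DCT: C^T * Y * C (valid because C is orthonormal)."""
    return matmul(matmul(transpose(C), coeffs), C)


def mean_block(blocks):
    """Element-wise mean over a list of 8x8 blocks."""
    return [[sum(b[i][j] for b in blocks) / len(blocks) for j in range(N)]
            for i in range(N)]


random.seed(0)
# Hypothetical appearance samples: noisy copies of one patch, standing in
# for the same macro-block observed along an object's trajectory.
base = [[random.uniform(0, 255) for _ in range(N)] for _ in range(N)]
samples = [[[p + random.gauss(0, 5) for p in row] for row in base]
           for _ in range(20)]

mean_spatial = mean_block(samples)                  # mean learned from pixels
mean_dct = mean_block([dct2(s) for s in samples])   # mean learned from DCT coefficients
recovered = idct2(mean_dct)                         # DCT-domain mean mapped back

max_err = max(abs(recovered[i][j] - mean_spatial[i][j])
              for i in range(N) for j in range(N))
print(f"max difference between the two means: {max_err:.2e}")
```

The difference should be at the level of floating-point rounding, since averaging commutes with a linear transform. Covariances, mixture weights, and the band-selectivity result are not covered by this sketch.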
