Glyph-based video visualization on Google Map for surveillance in smart cities
© The Author(s). 2017
Received: 30 May 2016
Accepted: 2 March 2017
Published: 12 April 2017
Video visualization (VV) is considered an essential part of multimedia visual analytics. The enormous volume of content produced by video cameras poses many challenges that can be addressed with the help of data analytics, and VV is therefore gaining importance. The rapid advancement of digital technologies has resulted in an explosion of video data, which stimulates the need to create computer graphics and visualizations from videos. In particular, in the paradigm of smart cities, video surveillance is a widely deployed technology that generates a huge amount of footage from 24/7 monitoring. In this paper, a state-of-the-art algorithm is proposed for 3D conversion of traffic video content onto Google Maps. Time-stamped glyph-based visualization is applied effectively to outdoor surveillance videos and can be used for event-aware detection. This form of traffic visualization can potentially reduce data complexity by providing a holistic view of a large collection of videos. The efficacy of the proposed scheme is demonstrated by acquiring several unprocessed surveillance videos and testing our algorithm on them without controlling for their pertaining field conditions. Experimental results show that the proposed visualization technique produces promising results and is effective in conveying meaningful information while alleviating the need to search exhaustively through a colossal amount of video data.
Intelligent surveillance in smart cities has progressed rapidly over the last 10 years; it aims to provide situational awareness and semantic information for the proactive and predictive management of smart cities through a better understanding of environmental activity. VV denotes the joint process of analyzing a video and deriving a representative presentation of the essence of its visual content. The visualization of videos is gaining more attention [2, 3] because it addresses the data-analysis challenges arising from video camera content [4–6]. Over the past decade, researchers [9, 10] have effectively demonstrated the usefulness of VV for traffic surveillance applications [7, 8].
VV offers a spatio-temporal summary and overview of a large collection of videos, and its abstract representation of meaningful information assists users in understanding video content [9, 11]. Conversely, conventional visual representation techniques, such as time-series plots, have difficulty conveying impressions from large video collections.
In addition, there is a need to present the visual content of videos in compact forms so that users can quickly navigate through different segments of a video sequence, locate a segment of interest, and zoom in to different levels of detail [4, 12]. Viewing videos is a time-consuming process; consequently, it is desirable to develop methods for highlighting and extracting interesting features in videos. There are numerous techniques designed for data analysis in images, and a variety of statistical indicators for data processing. On the contrary, there is a lack of effective techniques for conveying complex statistical information spontaneously to a layperson such as a security officer, apart from using line graphs to portray 1D signal levels. Many researchers have studied video processing in the context of video surveillance, vehicle monitoring, and crowd monitoring. However, the main problem in automatic video processing is communicating its results to a human operator: statistical results are not easily comprehensible, whereas sequences of difference images again require sequential viewing [4, 14].
In the field of visualization, Borgo et al. carried out a comprehensive survey on video visualization. The effectiveness of VV in conveying the meaningful information enclosed in video sequences was demonstrated by Daniel et al. Andrienko et al. illustrated a visual analytical approach for visualizing large amounts of movement data: the data were aggregated and clustered for presentation on a map using color-coded arrows. Wang et al. proposed an approach for situational understanding by combining videos in a 3D environment. Romero et al. used VV to analyze human behavior by exploring activity visualization in natural settings over a period of time.
Hoummady presented a survey of the shortcomings of the sensory devices used for real-time traffic information collection [23, 24] and proposed using video cameras for traffic data collection. The proposed approach relied on a computational device primarily for the automatic recognition of vehicles, pedestrians, two-wheeled vehicles, etc.
For road traffic visualization, the most commonly used approach is to color-code the areas representing roads on a map. Ang et al. proposed an analytical approach for traffic surveillance from multiple cameras, in which features were extracted to estimate vehicle trajectories. In addition, Chen et al. proposed an approach to improve visualization using volume and flow signatures. Their study revealed that ordinary people can learn to recognize events from event signatures in a static visualization rather than having to view the contents of the entire video.
Technology providers and end-users recognize that a manual process alone is insufficient for exhaustively searching a colossal amount of video data and for meeting the need for timely screening. Vast amounts of video data render manual video analysis ineffective, while current automatic video analytics techniques still suffer from inadequate performance. To alleviate these issues, we project outdoor surveillance camera activity onto Google Maps, making it easier to obtain a holistic and summarized view of the videos.
In this paper, a novel VV technique is proposed and tested on numerous traffic surveillance videos to derive appropriate visual representations for assisting the decision-making process. One can observe the pattern and level of recorded activity from the visualization, as it conveys much more spatial information than statistical indicators. Semantic information is extracted from many traffic surveillance videos and linked to Google Maps for 3D association. A glyph is individually recognizable and offers multifield visualization [29, 30]; a well-developed glyph-based visualization technique is therefore proposed that enables effective and efficient visual communication and information encoding.
The method of Morris et al. visualizes traffic information using only a color-coded scheme, which cannot represent whether a vehicle changes lane. The proposed visualization method conveys the traffic information as well as vehicle lane-change information. The proposed glyph-based visualization supports the predictive management of smart cities and a better understanding of environmental activity.
The paper is structured as follows: Section 2 introduces the proposed methodology, Section 3 presents the experimental results, Section 4 highlights important discussion of the results, and the conclusion is drawn in Section 5.
2 Proposed methodology
The top-level diagram depicts the flow of the proposed system. A real-time video visualization method is proposed in which visual information is mapped onto Google Maps so that it can be visualized efficiently and effectively; the obtained information is helpful for building intelligent systems for smart cities. The output of each step is given as input to the succeeding step. The problem addressed by the proposed system is the visualization of an enormous amount of visual traffic information at the same time. Motion tracking is performed to obtain semantic information from the surveillance videos, and the individual tracking of each vehicle gives its position in video-space coordinates. A 3D conversion is then performed to visualize this semantic information on Google Maps. Time-stamped individual vehicle information is mapped onto the map, and glyph visualization enables effective encoding of the semantic information.
The speed of a vehicle is calculated in pixels per second by dividing the distance traveled by the vehicle by the time taken. The distance is calculated as the sum of the Euclidean distances between the vehicle's centroid values in successive frames, and the time is derived from the frame rate of the video. The metric size of a pixel is calculated by dividing the standard distance between central lane marks by their distance in pixels, as shown in Figs. 4 and 5.
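Under these definitions, the speed and pixel-scale computations can be sketched as below; the function names, the 10-frame toy track, and the 2 m lane-mark spacing are illustrative assumptions rather than values from the paper:

```python
import numpy as np

def pixel_speed(centroids, fps):
    """Vehicle speed in pixels/second from per-frame centroids.

    `centroids` is an (N, 2) sequence of [x, y] centroid values for one
    tracked vehicle over N successive frames; `fps` is the frame rate.
    """
    centroids = np.asarray(centroids, dtype=float)
    # Distance travelled: Euclidean step between successive centroids
    steps = np.linalg.norm(np.diff(centroids, axis=0), axis=1)
    elapsed = (len(centroids) - 1) / fps  # time spanned by the track
    return steps.sum() / elapsed

def metres_per_pixel(lane_mark_gap_px, lane_mark_gap_m=2.0):
    """Scale factor from the known spacing of central lane marks
    (the 2 m default is an assumed value, not taken from the paper)."""
    return lane_mark_gap_m / lane_mark_gap_px

# Example: a vehicle moving 5 px per frame in a 30 fps video
track = [(5.0 * i, 100.0) for i in range(10)]
print(pixel_speed(track, fps=30))  # 150.0 px/s
```

Multiplying the pixel speed by `metres_per_pixel(...)` converts it to metres per second for display on the map.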
As can be seen, the level of congestion increases with crash frequency. As traffic congestion develops, the mean speed of vehicles decreases, since congestion constrains users to drive at limited or slow speeds. As shown in the results, the speed of vehicles in smooth traffic changes suddenly as the number of vehicles increases. Figure 7 corresponds to a motorway traffic video containing 914 frames at a frame rate of 30 fps. From the figure, it is clearly seen that the traffic flow is normal at the start, but the mean speed changes between frames 220 and 270, where the number of vehicles also increases.
Object tracking is the part of the proposed system that collects temporal and spatial information about the objects under consideration from the video sequence. Semantic information, such as the trajectories of detected objects, is obtained from motion tracking and used as input for the 3D computation and mapping that display the results on Google Maps. Because the coordinates of video space and Google space differ, a 3D mapping is performed between the two spaces. A time-stamped glyph is generated to represent the semantic information in both video and Google space.
2.1 Motion tracking and semantic event display
The importance of motion tracking in videos is undeniable, since it is beneficial in many applications. Semantic video analysis [35, 36] is used to extract only the important information, notably speed, vehicle type, trajectory, and lane changes, from the video. This information is extracted automatically and represented in terms of high-level descriptors for indexing, searching, and retrieving the video content. Tracking involves maintaining the appearance, velocity, and position of each observable object over the frame sequence. Object detection is performed by linking each object to the most similar segment in consecutive frames of the video.
The velocity of a pixel (x, y) is taken as the average over its neighborhood of pixels. Optical flow is estimated from the generated motion vectors by storing the flow vectors as complex numbers; once the optical flow has been constructed, the magnitudes of the flow vectors are obtained.
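Storing the two flow components as one complex-valued array makes the magnitude and direction immediate, as the following sketch shows; the 2 × 2 flow field is a toy example, and a real system would obtain (u, v) from a dense estimator such as Horn–Schunck [38]:

```python
import numpy as np

u = np.array([[3.0, 0.0], [1.0, 0.0]])  # horizontal flow component
v = np.array([[4.0, 0.0], [1.0, 0.0]])  # vertical flow component

flow = u + 1j * v            # flow vectors stored as complex numbers
magnitude = np.abs(flow)     # per-pixel motion strength
direction = np.angle(flow)   # per-pixel motion direction in radians

print(magnitude[0, 0])       # 5.0 for the (3, 4) vector
```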
However, trajectories are not of the same length even when traveling along the same route, because objects move at different speeds. The purpose of using motion vectors to represent information is to maintain the strong relationship between motion information and semantic events. Different events are identified in the video by analyzing the motion features.
As an object is tracked over successive frames, the result is a sequence of inferred tracking states, represented as f_1, f_2, …, f_T, where f_t denotes the object's velocity [v_x, v_y], position [x, y], and direction [a_x, a_y] at time t, extracted by tracking the object.
2.2 3D conversion from video to map
The location of a single vehicle in each frame of the video sequence is represented by the “plus” signs in Fig. 11, obtained by computing the homography matrix that relates video-space and map-space coordinates.
The solution for h is obtained as the eigenvector corresponding to the smallest eigenvalue of A^T A. After H has been computed, the video corner points are projected onto the corresponding coordinate points on Google Maps. Each pixel position within the video frame is projected onto the map using H, and its resulting latitude and longitude coordinates are stored. The inverse of H is also calculated for mapping map-space coordinates back into video space.
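The eigenvector solution and the projection step can be sketched as follows; the four video-corner/map correspondences are invented for illustration, and real usage would substitute surveyed latitude/longitude anchor points:

```python
import numpy as np

def homography_dlt(src, dst):
    """Direct linear transform: H maps video-space points `src` onto
    map-space points `dst` (at least four correspondences). As in the
    text, h is the eigenvector of A^T A with the smallest eigenvalue."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    A = np.asarray(rows, dtype=float)
    w, V = np.linalg.eigh(A.T @ A)
    h = V[:, np.argmin(w)]  # eigenvector of the smallest eigenvalue
    return h.reshape(3, 3)

def project(H, x, y):
    """Project one video pixel through H and divide out the scale."""
    p = H @ np.array([x, y, 1.0])
    return p[0] / p[2], p[1] / p[2]

# Four video corners and assumed map coordinates for them
src = [(0, 0), (640, 0), (640, 480), (0, 480)]
dst = [(10, 50), (20, 50), (20, 40), (10, 40)]
H = homography_dlt(src, dst)
print(project(H, 320, 240))  # centre pixel lands near (15.0, 45.0)
```

The reverse mapping from map space back to video space uses `np.linalg.inv(H)` with the same perspective division.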
2.3 Time-stamped semantic glyph representation
Glyph-based visualization is a common form of visual design in which a collection of visual objects is used to represent a data set. In this approach, the glyph technique is used to visualize motion vectors, which are overlaid on every frame of the video stream. Our major concern is collecting the visual information that appears in all video frames until the object leaves the field of view. A time-stamped glyph is generated to represent the speed, the vehicle type (with different colors), and event information, e.g., lane-change information.
2.4 Bezier fitting for glyph generation
A Bézier curve is used to smooth the chaotic vehicle trajectories obtained from motion tracking; as each car moves at a different speed, the length of the trajectories varies. In its simplest (linear) form, B(t) = (1 − t)P_0 + tP_1, 0 ≤ t ≤ 1, the curve is equivalent to linear interpolation between two control points.
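One plausible realisation of this smoothing step is a least-squares Bézier fit with chord-length parameterisation; the paper does not specify its exact fitting procedure, so the details below are assumptions:

```python
import numpy as np
from math import comb

def bezier_eval(ctrl, t):
    """Evaluate a Bézier curve in Bernstein form at parameters `t`."""
    ctrl = np.asarray(ctrl, dtype=float)
    n = len(ctrl) - 1
    t = np.asarray(t, dtype=float)[:, None]
    basis = [comb(n, i) * (1 - t) ** (n - i) * t ** i for i in range(n + 1)]
    return sum(b * p for b, p in zip(basis, ctrl))

def fit_bezier(points, degree=3):
    """Least-squares cubic Bézier fit to a noisy trajectory, using
    chord-length parameterisation for the sample parameters."""
    points = np.asarray(points, dtype=float)
    d = np.r_[0, np.cumsum(np.linalg.norm(np.diff(points, axis=0), axis=1))]
    t = (d / d[-1])[:, None]
    B = np.hstack([comb(degree, i) * (1 - t) ** (degree - i) * t ** i
                   for i in range(degree + 1)])
    ctrl, *_ = np.linalg.lstsq(B, points, rcond=None)
    return ctrl

# Smooth a jittery, roughly straight track
track = [(float(i), i + (-1) ** i * 0.2) for i in range(10)]
smooth = bezier_eval(fit_bezier(track), np.linspace(0, 1, 10))
```

With two control points, the Bernstein form reduces to B(t) = (1 − t)P_0 + tP_1, i.e. plain linear interpolation.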
2.5 Association between Google Map and video visualization
To properly visualize the analysis results on Google Maps, the output must be aligned correctly with the map coordinates [31, 43]. The camera image is rectified automatically and mapped onto Google Maps. We detect the activity in each frame of the surveillance video; the position of each moving object on the ground plane is learned through trajectory learning, and the corresponding points are mapped onto Google Maps. In this way, the simultaneous observation of many surveillance cameras is improved by projecting outdoor camera activity onto Google Maps.
2.6 Holistic view of video using Google Map
3 Experimental results and discussion
Different frames of the video are captured to display the motion tracking results. Experimental results are shown by applying our algorithm to videos of different resolutions and frame rates. Traffic videos were captured at different locations surrounding Northumbria University, Newcastle upon Tyne, UK. The visualization comprises stacked, temporally spaced intervals of key video frames, with color-filled glyphs representing the semantic information that summarizes the motion flow.
3.1 Event-aware visualization using semantic glyph
As shown in Fig. 22, the car changes lane in the final frames, which is represented by a red glyph.
4 Conclusion and future work
For a user to interpret data in a real-time system, the data must be visualized to offer intuitive information from which patterns and trends can be obtained. However, automatically inferring statistics from incoming data is computationally expensive. Walton et al. therefore projected live traffic videos onto maps to display traffic information; however, displaying several traffic videos simultaneously was difficult due to the heavy transmission load, and human intelligence was still needed to infer semantic details from the videos in their visual mapping strategy. Recently, Cheng presented a system for traffic situation visualization that interprets statistical vehicle-detector data and composes videos from a database: traffic flow was estimated from the videos, and a mapping was built between the videos and the vehicle-detector data. When visualizing traffic situations, such systems cannot simulate all kinds of dynamics and kinematics, because driving behavior differs across regions and is unknown.
VV is concerned with the visual representation of an input video to reveal important events and features; it is intended to assist intelligent reasoning whilst alleviating the burden of viewing videos. A novel glyph-based visualization approach has been developed that can be used effectively for surveillance video. A visual activity analysis based on motion tracking is performed for monitoring live traffic on highways. The proposed solution has been tested on multiple video resolutions and frame rates for the visualization of traffic flows. Experimental results show that the algorithm is credible enough to be deployed in field conditions and enables better utilization of existing video-based traffic management systems.
This research is supported by the Higher Education Commission of Pakistan, research grant no.: 1-8/HEC/HRD/2015/3719 to the Computer Engineering Department at National University of Sciences and Technology Pakistan and to the Computer Department at the Northumbria University, Newcastle Upon Tyne, UK.
Funding for this research is provided by the National University of Sciences and Technology Pakistan.
The rest of the authors are my supervisors at NUST University Pakistan and Northumbria University, Newcastle upon Tyne, UK. The work is part of a split PhD program between NUST and Northumbria University, Newcastle upon Tyne, UK. All members have contributed significantly to the technique or methods used, to the research concept, to the data collection, to the experiment design, and to the critical revision of the article. The work started after the unanimous approval of the topic by the supervisors from both universities. All authors read and approved the final manuscript.
The authors declare that they have no competing interests.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
- HM Dee, SA Velastin, How close are we to solving the problem of automated visual surveillance. Mach. Vis. Appl. 19(5-6), 329–343 (2008)View ArticleGoogle Scholar
- MM Yeung, BL Yeo, Video visualization for compact presentation and fast browsing of pictorial content. IEEE Trans. Circuits Syst. Video Technol. 7(5), 771–785 (1997)View ArticleGoogle Scholar
- G Andrienko, N Andrienko, A visual analytics approach to exploration of large amounts of movement data, in International Conference on Advances in Visual Information Systems (Springer, Berlin, 2008), pp. 1–4Google Scholar
- G. Daniel, M. Chen, in Proceedings of the 14th IEEE Visualization 2003 (VIS'03). Video visualization, (IEEE Computer Society, 2003), p. 54Google Scholar
- A. Cavallaro & T. Ebrahimi, in IEEE International Symposium on Circuits and Systems. Change detection based on color edges, (IEEE; 1999, 2001), No. 2, pp. 141-144Google Scholar
- R. T. Collins, A. J. Lipton, T. Kanade, H. Fujiyoshi, D. Duggins, Y. Tsin, … & L. Wixson, in A system for video surveillance and monitoring. (Technical Report CMU-RI-TR-00-12, Robotics Institute, Carnegie Mellon University, 2000), p. 1-6Google Scholar
- P. Dollár, V. Rabaud, G. Cottrell & S. Belongie, in 2005 IEEE International Workshop on Visual Surveillance and Performance Evaluation of Tracking and Surveillance. Behavior recognition via sparse spatio-temporal features, (IEEE, 2005), p. 65-72Google Scholar
- A. Girgensohn, D. Kimber, J. Vaughan, T. Yang, F. Shipman, T. Turner, … & T. Dunnigan, in Proceedings of the 15th ACM international conference on Multimedia. DOTS: support for effective video surveillance, (ACM, 2007), p. 423-432Google Scholar
- B. Duffy, J. Thiyagalingam, S. Walton, D. J. Smith, A. Trefethen, J. C. Kirkman-Brown, … & M. Chen, Glyph-based video visualization for semen analysis. IEEE Trans. Vis. Comput. Graph. 21(8), 980-993 (2015).Google Scholar
- P Kumar, S Ranganath, H Weimin, K Sengupta, Framework for real-time behavior interpretation from traffic video. IEEE Trans. Intell. Transp. Syst. 6(1), 43–53 (2005)View ArticleGoogle Scholar
- M Höferlin, K Kurzhals, B Höferlin, G Heidemann, D Weiskopf, Evaluation of fast-forward video visualization. IEEE Trans. Vis. Comput. Graph. 18(12), 2095–2103 (2012)View ArticleGoogle Scholar
- C. Xu, J. Liu & B. Kuipers, in Computer and Robot Vision (CRV), 2011 Canadian Conference on. Motion segmentation by learning homography matrices from motor signals, (IEEE, 2011), p. 316-323Google Scholar
- C. Yan, Y. Zhang, F. Dai & L. Li, in Data Compression Conference (DCC), 2013. Highly parallel framework for HEVC motion estimation on many-core platform, (IEEE, 2013), p. 63-72Google Scholar
- C Yan, Y Zhang, F Dai, J Zhang, L Li, Q Dai, Efficient parallel HEVC intra-prediction on many-core processor. Electron. Lett. 50(11), 805–806 (2014)View ArticleGoogle Scholar
- CC Loy, Activity understanding and unusual event detection in surveillance videos (Doctoral dissertation), 2010Google Scholar
- C Yan, Y Zhang, F Dai, X Wang, L Li, Q Dai, Parallel deblocking filter for HEVC on many-core processor. Electron. Lett. 50(5), 367–368 (2014)View ArticleGoogle Scholar
- DB Goldgof, D Sapper, J Candamo, M Shreve, Evaluation of Smart Video for Transit Event Detection, 2009. No. Report No. 2117-7807-00Google Scholar
- C Yan, Y Zhang, J Xu, F Dai, L Li, Q Dai, F Wu, A highly parallel framework for HEVC coding unit partitioning tree decision on many-core processors. IEEE Signal Process Lett. 21(5), 573–576 (2014)View ArticleGoogle Scholar
- R Borgo, M Chen, B Daubney, E Grundy, G Heidemann, B Höferlin, X Xie, A survey on video-based graphics and video visualization, in Eurographics (STARs), 2011, pp. 1–23Google Scholar
- M. Chen, R. Botchen, R. Hashim, D. Weiskopf, T. Ertl & I. Thornton, Visual signatures in video visualization. IEEE Trans. Vis. Comput. Graph. 12(5), (2006)Google Scholar
- Y Wang, DM Krum, EM Coelho, DA Bowman, Contextualized videos: Combining videos with environment models to support situational understanding. IEEE Trans. Vis. Comput. Graph. 13(6), 1568–1575 (2007)View ArticleGoogle Scholar
- M Romero, J Summet, J Stasko, G Abowd, Viz-A-Vis: Toward visualizing video through computer vision. IEEE Trans. Vis. Comput. Graph. 14(6), 1261–1268 (2008)View ArticleGoogle Scholar
- B Hoummady, U.S. Patent No. 6,366,219 (Patent and Trademark Office, Washington, DC, 2002)Google Scholar
- C Yan, Y Zhang, J Xu, F Dai, J Zhang, Q Dai, F Wu, Efficient parallel framework for HEVC motion estimation on many-core processors. IEEE Trans. Circuits Syst. Video Technol. 24(12), 2077–2089 (2014)View ArticleGoogle Scholar
- S. Shekhar, C. T. Lu, R. Liu & C. Zhou, in Intelligent Transportation Systems, 2002. Proceedings. The IEEE 5th International Conference on. CubeView: a system for traffic data visualization, (IEEE, 2002), p. 674-678Google Scholar
- D. Ang, Y. Shen & P. Duraisamy, in Proceedings of the Second International Workshop on Computational Transportation Science. Video analytics for multi-camera traffic surveillance, (ACM, 2009), p. 25-30Google Scholar
- F. Jiang, Y. Wu & A. K. Katsaggelos, in 2007 IEEE International Conference on Image Processing. Abnormal event detection from surveillance video by dynamic hierarchical clustering, vol. 5 (IEEE, 2007), p. V-145Google Scholar
- RD Dony, JW Mateer, JA Robinson, Techniques for automated reverse storyboarding. IEE Proc. Vis. Image Signal Process. 152(4), 425–436 (2005)View ArticleGoogle Scholar
- J. Fuchs, F. Fischer, F. Mansmann, E. Bertini, & P. Isenberg, in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. Evaluation of alternative glyph designs for time series data in a small multiple setting, (ACM, 2013), p. 3237-3246)Google Scholar
- R. Borgo, J. Kehrer, D. H. Chung, E. Maguire, R. S. Laramee, H. Hauser, … & M. Chen, in Eurographics (STARs). Glyph-based Visualization: Foundations, Design Guidelines, Techniques and Applications, (2013), p. 39-63Google Scholar
- MO Ward, Multivariate data glyphs: Principles and practice, in Handbook of data visualization (Springer, Berlin, 2008), pp. 179–198View ArticleGoogle Scholar
- BT Morris, C Tran, G Scora, MM Trivedi, MJ Barth, Real-time video-based traffic measurement and visualization system for energy/emissions. IEEE Trans. Intell. Transp. Syst. 13(4), 1667–1678 (2012)View ArticleGoogle Scholar
- V KR, LM Patnaik, Moving vehicle identification using background registration technique for traffic surveillance, in Proceedings of the International MultiConference of Engineers and Computer Scientists, vol. 1, 2008Google Scholar
- SCS Cheung, C Kamath, Robust background subtraction with foreground validation for urban traffic video. EURASIP J. Adv. Signal Process. 2005(14), 726261 (2005)MATHGoogle Scholar
- S Liu, H Yi, LT Chia, D Rajan, S Chan, Semantic analysis of basketball video using motion information, in Pacific-Rim Conference on Multimedia (Springer, Berlin, 2004), pp. 65–72Google Scholar
- S. Walton, M. Chen & D. Ebert, LiveLayer: Real-time Traffic Video Visualisation on Geographical MapsGoogle Scholar
- BK Horn, BG Schunck, Determining optical flow. Artif. Intell. 17(1-3), 185–203 (1981)View ArticleGoogle Scholar
- S Aslani, H Mahdavi-Nasab, Optical flow based moving object detection and tracking for traffic surveillance. Int. J. Electrical Electron. Commun. Energy Sci. Eng. 7(9), 789–793 (2013)Google Scholar
- CP Kappe, L Schütz, S Gunther, L Hufnagel, S Lemke, H Leitte, Reconstruction and Visualization of Coordinated 3D Cell Migration Based on Optical Flow. IEEE Trans. Vis. Comput. Graph. 22(1), 995–1004 (2016)View ArticleGoogle Scholar
- BT Morris, MM Trivedi, A survey of vision-based trajectory learning and analysis for surveillance. IEEE Trans. Circuits Syst. Video Technol. 18(8), 1114–1127 (2008)View ArticleGoogle Scholar
- D. Beymer, P. McLauchlan, B. Coifman & J. Malik, in Computer Vision and Pattern Recognition, 1997. Proceedings., 1997 IEEE Computer Society Conference on. A real-time computer vision system for measuring traffic parameters, (IEEE, 1997), p. 495-501Google Scholar
- J Faraway Julian, P Matthew Reed, W Jing, Modelling three‐dimensional trajectories by using Bézier curves with application to hand motion. J. R. Stat. Soc. Ser. C. Appl. Stat. 56.5(07), 571–585 (2007)View ArticleGoogle Scholar
- B Morris, MM Trivedi, Contextual activity visualization from long-term video observations, in University of California Transportation Center, 2010Google Scholar
- C. T. Lu, A. P. Boedihardjo & J. Zheng, in 22nd International Conference on Data Engineering (ICDE'06). Aitvs: Advanced interactive traffic visualization system, (IEEE, 2006), p. 167-167Google Scholar
- CY Hsieh, YS Wang, Traffic situation visualization based on video composition. Comput. Graph. 54, 1–7 (2016)View ArticleGoogle Scholar
- G Medioni, I Cohen, F Brémond, S Hongeng, R Nevatia, Event detection and analysis from video streams. IEEE Trans. Pattern Anal. Mach. Intell. 23(8), 873–889 (2001)View ArticleGoogle Scholar
- H. Denman, N. …, Video visualization, in Proc. IEEE Visualization 2003, Seattle, pp. 409–416, 2003Google Scholar