Glyph-based video visualization on Google Map for surveillance in smart cities
© The Author(s). 2017
Received: 30 May 2016
Accepted: 2 March 2017
Published: 12 April 2017
Video visualization (VV) is considered an essential part of multimedia visual analytics. The enormous volume of content produced by video cameras poses many challenges that can be addressed with the help of data analytics, and VV is therefore gaining importance. The rapid advancement of digital technologies has resulted in an explosion of video data, which stimulates the need to create computer graphics and visualizations from videos. In particular, in the paradigm of smart cities, video surveillance is a widely deployed technology that generates a huge amount of footage from 24/7 monitoring. In this paper, a state-of-the-art algorithm is proposed for 3D conversion of traffic video content onto Google Maps. Time-stamped glyph-based visualization is applied effectively to outdoor surveillance videos and can be used for event-aware detection. This form of traffic visualization can potentially reduce data complexity by providing a holistic view of a large collection of videos. The efficacy of the proposed scheme is demonstrated by acquiring several unprocessed surveillance videos and testing our algorithm on them without controlling for their pertaining field conditions. Experimental results show that the proposed visualization technique produces promising results and is effective in conveying meaningful information while alleviating the need to search exhaustively through a colossal amount of video data.
Intelligent surveillance in smart cities has progressed rapidly over the last 10 years; it aims to provide situational awareness and semantic information for the proactive and predictive management of smart cities through a better understanding of environmental activity. VV denotes the joint process of analyzing a video and deriving a representative presentation of the essence of its visual content. The visualization of videos is gaining more attention [2, 3] because it addresses the data-analysis challenges arising from video camera content [4–6]. Over the past decade, researchers [9, 10] have effectively demonstrated the usefulness of VV for traffic surveillance applications [7, 8].
VV offers a spatio-temporal summary and overview of a large collection of videos, and its abstract representation of meaningful information assists users in understanding video content [9, 11]. Conversely, conventional visual representation techniques, such as time-series plots, have difficulty conveying impressions from large video collections.
In addition, there is a need to present the visual content of videos in compact forms so that users can quickly navigate through different segments of a video sequence, locate a segment of interest, and zoom in to different levels of detail [4, 12]. Viewing videos is a time-consuming process; consequently, it is desirable to develop methods for highlighting and extracting interesting features in videos. There are numerous techniques designed for data analysis in images, and a variety of statistical indicators for data processing. On the contrary, there is a lack of effective techniques for conveying complex statistical information spontaneously to a layperson such as a security officer, apart from using line graphs to portray 1D signal levels. Many researchers have studied video processing in the context of video surveillance, vehicle monitoring, and crowd monitoring. However, the main problem in automatic video processing is communicating its results to a human operator: statistical results are not easily comprehensible, whereas sequences of difference images again require sequential viewing [4, 14].
In the field of visualization, Borgo et al. carried out a comprehensive survey on video visualization. The effectiveness of VV in conveying the meaningful information enclosed in video sequences was demonstrated by Daniel et al. Andrienko et al. illustrated a visual analytical approach for visualizing large amounts of movement data: the data were aggregated and clustered for presentation on a map using color-coded arrows. Wang et al. proposed an approach for situational understanding by combining videos in a 3D environment. Romero et al. used VV to analyze human behavior by exploring activity visualization in natural settings over a period of time.
Hoummady presented a survey of the shortcomings of the sensory devices used for real-time traffic information collection [23, 24] and proposed using video cameras for traffic data collection. The proposed approach relied on a computational device primarily for the automatic recognition of vehicles, pedestrians, two-wheeled vehicles, etc.
For road traffic visualization, the most commonly used approach is to color-code the areas representing roads on a map. Ang et al. proposed an analytical approach for traffic surveillance from multiple cameras, in which features were extracted to estimate vehicle trajectories. In addition, Chen et al. proposed an approach to improve visualization using volume and flow signatures. Their study revealed that ordinary people can learn to recognize events from event signatures in a static visualization rather than having to view the contents of the entire video.
Technology providers and end-users recognize that a manual process alone is insufficient for exhaustively searching a colossal amount of video data and for meeting the need for timely screening. Vast amounts of video data render manual video analysis ineffective, while current automatic video analytics techniques still suffer from inadequate performance. To alleviate these issues, we project outdoor surveillance camera activity onto Google Maps, making it easier to obtain a holistic and summarized view of the videos.
In this paper, a novel VV technique is proposed and tested on numerous traffic surveillance videos to derive appropriate visual representations for assisting the decision-making process. One can observe the pattern and level of recorded activity from the visualization, as it conveys much more spatial information than statistical indicators. Semantic information is extracted from many traffic surveillance videos and linked to Google Maps for 3D association. A glyph is individually recognizable and offers multifield visualization [29, 30]; a well-developed glyph-based visualization technique is therefore proposed that enables effective and efficient visual communication and information encoding.
The method of Morris et al. visualizes traffic information using only a color-coded scheme, which cannot represent whether a vehicle changes lane. The proposed visualization method conveys the traffic information as well as vehicle lane-change information. The proposed glyph-based visualization supports the predictive management of smart cities and a better understanding of environmental activity.
The paper is structured as follows: Section 2 introduces the proposed methodology, Section 3 presents the experimental results, Section 4 highlights important discussion of the results, and the conclusion is drawn in Section 5.
2 Proposed methodology
The top-level diagram depicts the flow of the proposed system. A real-time video visualization method is proposed in which visual information is mapped onto Google Maps so that it can be visualized efficiently and effectively; the obtained information is helpful for building intelligent systems for smart cities. The output of each step is given as input to the succeeding step. The problem addressed by the proposed system is the visualization of an enormous amount of visual traffic information at the same time. Motion tracking is performed to obtain semantic information from the surveillance videos, and the individual tracking of each vehicle gives its position in video-space coordinates. A 3D conversion is then performed to visualize this semantic information on Google Maps. Time-stamped individual vehicle information is mapped onto the map, and glyph visualization enables effective encoding of the semantic information.
The speed of a vehicle is calculated in pixels per second by dividing the distance traveled by the vehicle by the time taken. The distance is calculated as the sum of the Euclidean distances between the vehicle's centroid values in successive frames, and the time is derived from the frame rate of the video. The metric size of a pixel is calculated by dividing the standard distance between central lane marks by their distance in pixels, as shown in Figs. 4 and 5.
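Under these definitions, the speed and pixel-scale computations can be sketched as below; the function names, the 10-frame toy track, and the 2 m lane-mark spacing are illustrative assumptions rather than values from the paper:

```python
import numpy as np

def pixel_speed(centroids, fps):
    """Vehicle speed in pixels/second from per-frame centroids.

    `centroids` is an (N, 2) sequence of [x, y] centroid values for one
    tracked vehicle over N successive frames; `fps` is the frame rate.
    """
    centroids = np.asarray(centroids, dtype=float)
    # Distance travelled: Euclidean step between successive centroids
    steps = np.linalg.norm(np.diff(centroids, axis=0), axis=1)
    elapsed = (len(centroids) - 1) / fps  # time spanned by the track
    return steps.sum() / elapsed

def metres_per_pixel(lane_mark_gap_px, lane_mark_gap_m=2.0):
    """Scale factor from the known spacing of central lane marks
    (the 2 m default is an assumed value, not taken from the paper)."""
    return lane_mark_gap_m / lane_mark_gap_px

# Example: a vehicle moving 5 px per frame in a 30 fps video
track = [(5.0 * i, 100.0) for i in range(10)]
print(pixel_speed(track, fps=30))  # 150.0 px/s
```

Multiplying the pixel speed by `metres_per_pixel(...)` converts it to metres per second for display on the map.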
As can be seen, the level of congestion increases with crash frequency. As traffic congestion develops, the mean speed of vehicles decreases, since congestion constrains users to drive at limited or slow speeds. As shown in the results, the speed of vehicles in smooth traffic changes suddenly as the number of vehicles increases. Figure 7 corresponds to a motorway traffic video containing 914 frames at a frame rate of 30 fps. From the figure, it is clearly seen that the traffic flow is normal at the start, but the mean speed changes between frames 220 and 270, where the number of vehicles also increases.
Object tracking is the part of the proposed system that collects temporal and spatial information about the objects under consideration from the video sequence. Semantic information, such as the trajectories of detected objects, is obtained from motion tracking and used as input for the 3D computation and mapping that display the results on Google Maps. Because the coordinates of video space and Google space differ, a 3D mapping is performed between the two spaces. A time-stamped glyph is generated to represent the semantic information in both video and Google space.
2.1 Motion tracking and semantic event display
The importance of motion tracking in videos is undeniable, since it is beneficial in many applications. Semantic video analysis [35, 36] is used to extract only the important information, notably speed, vehicle type, trajectory, and lane changes, from the video. This information is extracted automatically and represented in terms of high-level descriptors for indexing, searching, and retrieving the video content. Tracking involves maintaining the appearance, velocity, and position of each observable object over the frame sequence. Object detection is performed by linking each object to the most similar segment in consecutive frames of the video.
The velocity of a pixel (x, y) is taken as the average over its neighborhood of pixels. Optical flow is estimated from the generated motion vectors by storing the flow vectors as complex numbers; once the optical flow has been constructed, the magnitudes of the flow vectors are obtained.
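Storing the two flow components as one complex-valued array makes the magnitude and direction immediate, as the following sketch shows; the 2 × 2 flow field is a toy example, and a real system would obtain (u, v) from a dense estimator such as Horn–Schunck [38]:

```python
import numpy as np

u = np.array([[3.0, 0.0], [1.0, 0.0]])  # horizontal flow component
v = np.array([[4.0, 0.0], [1.0, 0.0]])  # vertical flow component

flow = u + 1j * v            # flow vectors stored as complex numbers
magnitude = np.abs(flow)     # per-pixel motion strength
direction = np.angle(flow)   # per-pixel motion direction in radians

print(magnitude[0, 0])       # 5.0 for the (3, 4) vector
```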
However, trajectories are not of the same length even when traveling along the same route, because objects move at different speeds. The purpose of using motion vectors to represent information is to maintain the strong relationship between motion information and semantic events. Different events are identified in the video by analyzing the motion features.
As an object is tracked over successive frames, the result is a sequence of inferred tracking states, represented as f_1, f_2, …, f_T, where f_t denotes the object's velocity [v_x, v_y], position [x, y], and direction [a_x, a_y] at time t, extracted by tracking the object.
2.2 3D conversion from video to map
The location of a single vehicle in each frame of the video sequence is represented by the “plus” signs in Fig. 11, obtained by computing the homography matrix that relates video-space and map-space coordinates.
The solution for h is obtained as the eigenvector corresponding to the smallest eigenvalue of A^T A. After H has been computed, the video corner points are projected onto the corresponding coordinate points on Google Maps. Each pixel position within the video frame is projected onto the map using H, and its resulting latitude and longitude coordinates are stored. The inverse of H is also calculated for mapping map-space coordinates back into video space.
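The eigenvector solution and the projection step can be sketched as follows; the four video-corner/map correspondences are invented for illustration, and real usage would substitute surveyed latitude/longitude anchor points:

```python
import numpy as np

def homography_dlt(src, dst):
    """Direct linear transform: H maps video-space points `src` onto
    map-space points `dst` (at least four correspondences). As in the
    text, h is the eigenvector of A^T A with the smallest eigenvalue."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    A = np.asarray(rows, dtype=float)
    w, V = np.linalg.eigh(A.T @ A)
    h = V[:, np.argmin(w)]  # eigenvector of the smallest eigenvalue
    return h.reshape(3, 3)

def project(H, x, y):
    """Project one video pixel through H and divide out the scale."""
    p = H @ np.array([x, y, 1.0])
    return p[0] / p[2], p[1] / p[2]

# Four video corners and assumed map coordinates for them
src = [(0, 0), (640, 0), (640, 480), (0, 480)]
dst = [(10, 50), (20, 50), (20, 40), (10, 40)]
H = homography_dlt(src, dst)
print(project(H, 320, 240))  # centre pixel lands near (15.0, 45.0)
```

The reverse mapping from map space back to video space uses `np.linalg.inv(H)` with the same perspective division.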
2.3 Time-stamped semantic glyph representation
Glyph-based visualization is a common form of visual design in which a collection of visual objects is used to represent a data set. In this approach, the glyph technique is used to visualize motion vectors, which are overlaid on every frame of the video stream. Our major concern is collecting the visual information that appears in all video frames until the object leaves the field of view. A time-stamped glyph is generated to represent the speed, the vehicle type (with different colors), and event information, e.g., lane-change information.
2.4 Bezier fitting for glyph generation
A Bézier curve is used to smooth the chaotic vehicle trajectories obtained from motion tracking; as each car moves at a different speed, the length of the trajectories varies. In its simplest (linear) form, B(t) = (1 − t)P_0 + tP_1, 0 ≤ t ≤ 1, the curve is equivalent to linear interpolation between two control points.
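One plausible realisation of this smoothing step is a least-squares Bézier fit with chord-length parameterisation; the paper does not specify its exact fitting procedure, so the details below are assumptions:

```python
import numpy as np
from math import comb

def bezier_eval(ctrl, t):
    """Evaluate a Bézier curve in Bernstein form at parameters `t`."""
    ctrl = np.asarray(ctrl, dtype=float)
    n = len(ctrl) - 1
    t = np.asarray(t, dtype=float)[:, None]
    basis = [comb(n, i) * (1 - t) ** (n - i) * t ** i for i in range(n + 1)]
    return sum(b * p for b, p in zip(basis, ctrl))

def fit_bezier(points, degree=3):
    """Least-squares cubic Bézier fit to a noisy trajectory, using
    chord-length parameterisation for the sample parameters."""
    points = np.asarray(points, dtype=float)
    d = np.r_[0, np.cumsum(np.linalg.norm(np.diff(points, axis=0), axis=1))]
    t = (d / d[-1])[:, None]
    B = np.hstack([comb(degree, i) * (1 - t) ** (degree - i) * t ** i
                   for i in range(degree + 1)])
    ctrl, *_ = np.linalg.lstsq(B, points, rcond=None)
    return ctrl

# Smooth a jittery, roughly straight track
track = [(float(i), i + (-1) ** i * 0.2) for i in range(10)]
smooth = bezier_eval(fit_bezier(track), np.linspace(0, 1, 10))
```

With two control points, the Bernstein form reduces to B(t) = (1 − t)P_0 + tP_1, i.e. plain linear interpolation.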
2.5 Association between Google Map and video visualization
To properly visualize the analysis results on Google Maps, the output must be aligned correctly with the map coordinates [31, 43]. The camera image is rectified automatically and mapped onto Google Maps. We detect the activity in each frame of the surveillance video; the position of each moving object on the ground plane is learned through trajectory learning, and the corresponding points are mapped onto Google Maps. In this way, the simultaneous observation of many surveillance cameras is improved by projecting outdoor camera activity onto Google Maps.
2.6 Holistic view of video using Google Map
3 Experimental results and discussion
Different frames of the video are captured to display the motion tracking results. Experimental results are shown by applying our algorithm to videos of different resolutions and frame rates. Traffic videos were captured at different locations surrounding Northumbria University, Newcastle upon Tyne, UK. The visualization comprises stacked, temporally spaced intervals of key video frames, with color-filled glyphs representing the semantic information that summarizes the motion flow.
3.1 Event-aware visualization using semantic glyph
As shown in Fig. 22, the car changes lane in the final frames, which is represented by a red glyph.
4 Conclusion and future work
For a user to interpret data in a real-time system, the data must be visualized to offer intuitive information from which patterns and trends can be obtained. However, automatically inferring statistics from incoming data is computationally expensive. Walton et al. therefore projected live traffic videos onto maps to display traffic information; however, displaying several traffic videos simultaneously was difficult due to the heavy transmission load, and human intelligence was still needed to infer semantic details from the videos in their visual mapping strategy. Recently, Cheng presented a system for traffic situation visualization that interprets statistical vehicle-detector data and composes videos from a database: traffic flow was estimated from the videos, and a mapping was built between the videos and the vehicle-detector data. When visualizing traffic situations, such systems cannot simulate all kinds of dynamics and kinematics, because driving behavior differs across regions and is unknown.
VV is concerned with the visual representation of an input video to reveal important events and features; it is intended to assist intelligent reasoning whilst alleviating the burden of viewing videos. A novel glyph-based visualization approach has been developed that can be used effectively for surveillance video. A visual activity analysis based on motion tracking is performed for monitoring live traffic on highways. The proposed solution has been tested on multiple video resolutions and frame rates for the visualization of traffic flows. Experimental results show that the algorithm is credible enough to be deployed in field conditions and enables better utilization of existing video-based traffic management systems.
This research is supported by the Higher Education Commission of Pakistan, research grant no.: 1-8/HEC/HRD/2015/3719 to the Computer Engineering Department at National University of Sciences and Technology Pakistan and to the Computer Department at the Northumbria University, Newcastle Upon Tyne, UK.
Funding for this research is provided by the National University of Sciences and Technology Pakistan.
The rest of the authors are my supervisors at NUST University Pakistan and Northumbria University, Newcastle upon Tyne, UK. The work is part of a split PhD program between NUST and Northumbria University, Newcastle upon Tyne, UK. All members have contributed significantly to the technique or methods used, to the research concept, to the data collection, to the experiment design, and to the critical revision of the article. The work started after the unanimous approval of the topic by the supervisors from both universities. All authors read and approved the final manuscript.
The authors declare that they have no competing interests.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
- HM Dee, SA Velastin, How close are we to solving the problem of automated visual surveillance. Mach. Vis. Appl. 19(5-6), 329–343 (2008)View ArticleGoogle Scholar
- MM Yeung, BL Yeo, Video visualization for compact presentation and fast browsing of pictorial content. IEEE Trans. Circuits Syst. Video Technol. 7(5), 771–785 (1997)View ArticleGoogle Scholar
- G Andrienko, N Andrienko, A visual analytics approach to exploration of large amounts of movement data, in International Conference on Advances in Visual Information Systems (Springer, Berlin, 2008), pp. 1–4Google Scholar
- G. Daniel, M. Chen, in Proceedings of the 14th IEEE Visualization 2003 (VIS'03). Video visualization, (IEEE Computer Society, 2003), p. 54Google Scholar
- A. Cavallaro & T. Ebrahimi, in IEEE International Symposium on Circuits and Systems. Change detection based on color edges, (IEEE; 1999, 2001), No. 2, pp. 141-144Google Scholar
- R. T. Collins, A. J. Lipton, T. Kanade, H. Fujiyoshi, D. Duggins, Y. Tsin, … & L. Wixson, in A system for video surveillance and monitoring. (Technical Report CMU-RI-TR-00-12, Robotics Institute, Carnegie Mellon University, 2000), p. 1-6Google Scholar
- P. Dollár, V. Rabaud, G. Cottrell & S. Belongie, in 2005 IEEE International Workshop on Visual Surveillance and Performance Evaluation of Tracking and Surveillance. Behavior recognition via sparse spatio-temporal features, (IEEE, 2005), p. 65-72Google Scholar
- A. Girgensohn, D. Kimber, J. Vaughan, T. Yang, F. Shipman, T. Turner, … & T. Dunnigan, in Proceedings of the 15th ACM international conference on Multimedia. DOTS: support for effective video surveillance, (ACM, 2007), p. 423-432Google Scholar
- B. Duffy, J. Thiyagalingam, S. Walton, D. J. Smith, A. Trefethen, J. C. Kirkman-Brown, … & M. Chen, Glyph-based video visualization for semen analysis. IEEE Trans. Vis. Comput. Graph. 21(8), 980-993 (2015).Google Scholar
- P Kumar, S Ranganath, H Weimin, K Sengupta, Framework for real-time behavior interpretation from traffic video. IEEE Trans. Intell. Transp. Syst. 6(1), 43–53 (2005)View ArticleGoogle Scholar
- M Höferlin, K Kurzhals, B Höferlin, G Heidemann, D Weiskopf, Evaluation of fast-forward video visualization. IEEE Trans. Vis. Comput. Graph. 18(12), 2095–2103 (2012)View ArticleGoogle Scholar
- C. Xu, J. Liu & B. Kuipers, in Computer and Robot Vision (CRV), 2011 Canadian Conference on. Motion segmentation by learning homography matrices from motor signals, (IEEE, 2011), p. 316-323Google Scholar
- C. Yan, Y. Zhang, F. Dai & L. Li, in Data Compression Conference (DCC), 2013. Highly parallel framework for HEVC motion estimation on many-core platform, (IEEE, 2013), p. 63-72Google Scholar
- C Yan, Y Zhang, F Dai, J Zhang, L Li, Q Dai, Efficient parallel HEVC intra-prediction on many-core processor. Electron. Lett. 50(11), 805–806 (2014)View ArticleGoogle Scholar
- CC Loy, Activity understanding and unusual event detection in surveillance videos (Doctoral dissertation), 2010Google Scholar
- C Yan, Y Zhang, F Dai, X Wang, L Li, Q Dai, Parallel deblocking filter for HEVC on many-core processor. Electron. Lett. 50(5), 367–368 (2014)View ArticleGoogle Scholar
- DB Goldgof, D Sapper, J Candamo, M Shreve, Evaluation of Smart Video for Transit Event Detection, 2009. No. Report No. 2117-7807-00Google Scholar
- C Yan, Y Zhang, J Xu, F Dai, L Li, Q Dai, F Wu, A highly parallel framework for HEVC coding unit partitioning tree decision on many-core processors. IEEE Signal Process Lett. 21(5), 573–576 (2014)View ArticleGoogle Scholar
- R Borgo, M Chen, B Daubney, E Grundy, G Heidemann, B Höferlin, X Xie, A survey on video-based graphics and video visualization, in Eurographics (STARs), 2011, pp. 1–23Google Scholar
- M. Chen, R. Botchen, R. Hashim, D. Weiskopf, T. Ertl & I. Thornton, Visual signatures in video visualization. IEEE Trans. Vis. Comput. Graph. 12(5), (2006)Google Scholar
- Y Wang, DM Krum, EM Coelho, DA Bowman, Contextualized videos: Combining videos with environment models to support situational understanding. IEEE Trans. Vis. Comput. Graph. 13(6), 1568–1575 (2007)View ArticleGoogle Scholar
- M Romero, J Summet, J Stasko, G Abowd, Viz-A-Vis: Toward visualizing video through computer vision. IEEE Trans. Vis. Comput. Graph. 14(6), 1261–1268 (2008)View ArticleGoogle Scholar
- B Hoummady, U.S. Patent No. 6,366,219 (Patent and Trademark Office, Washington, DC, 2002)Google Scholar
- C Yan, Y Zhang, J Xu, F Dai, J Zhang, Q Dai, F Wu, Efficient parallel framework for HEVC motion estimation on many-core processors. IEEE Trans. Circuits Syst. Video Technol. 24(12), 2077–2089 (2014)View ArticleGoogle Scholar
- S. Shekhar, C. T. Lu, R. Liu & C. Zhou, in Intelligent Transportation Systems, 2002. Proceedings. The IEEE 5th International Conference on. CubeView: a system for traffic data visualization, (IEEE, 2002), p. 674-678Google Scholar
- D. Ang, Y. Shen & P. Duraisamy, in Proceedings of the Second International Workshop on Computational Transportation Science. Video analytics for multi-camera traffic surveillance, (ACM, 2009), p. 25-30Google Scholar
- F. Jiang, Y. Wu & A. K. Katsaggelos, in 2007 IEEE International Conference on Image Processing. Abnormal event detection from surveillance video by dynamic hierarchical clustering, vol. 5 (IEEE, 2007), p. V-145Google Scholar
- RD Dony, JW Mateer, JA Robinson, Techniques for automated reverse storyboarding. IEE Proc. Vis. Image Signal Process. 152(4), 425–436 (2005)View ArticleGoogle Scholar
- J. Fuchs, F. Fischer, F. Mansmann, E. Bertini, & P. Isenberg, in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. Evaluation of alternative glyph designs for time series data in a small multiple setting, (ACM, 2013), p. 3237-3246)Google Scholar
- R. Borgo, J. Kehrer, D. H. Chung, E. Maguire, R. S. Laramee, H. Hauser, … & M. Chen, in Eurographics (STARs). Glyph-based Visualization: Foundations, Design Guidelines, Techniques and Applications, (2013), p. 39-63Google Scholar
- MO Ward, Multivariate data glyphs: Principles and practice, in Handbook of data visualization (Springer, Berlin, 2008), pp. 179–198View ArticleGoogle Scholar
- BT Morris, C Tran, G Scora, MM Trivedi, MJ Barth, Real-time video-based traffic measurement and visualization system for energy/emissions. IEEE Trans. Intell. Transp. Syst. 13(4), 1667–1678 (2012)View ArticleGoogle Scholar
- V KR, LM Patnaik, Moving vehicle identification using background registration technique for traffic surveillance, in Proceedings of the International MultiConference of Engineers and Computer Scientists, vol. 1, 2008Google Scholar
- SCS Cheung, C Kamath, Robust background subtraction with foreground validation for urban traffic video. EURASIP J. Adv. Signal Process. 2005(14), 726261 (2005)MATHGoogle Scholar
- S Liu, H Yi, LT Chia, D Rajan, S Chan, Semantic analysis of basketball video using motion information, in Pacific-Rim Conference on Multimedia (Springer, Berlin, 2004), pp. 65–72Google Scholar
- S. Walton, M. Chen & D. Ebert, LiveLayer: Real-time Traffic Video Visualisation on Geographical MapsGoogle Scholar
- BK Horn, BG Schunck, Determining optical flow. Artif. Intell. 17(1-3), 185–203 (1981)View ArticleGoogle Scholar
- S Aslani, H Mahdavi-Nasab, Optical flow based moving object detection and tracking for traffic surveillance. Int. J. Electrical Electron. Commun. Energy Sci. Eng. 7(9), 789–793 (2013)Google Scholar
- CP Kappe, L Schütz, S Gunther, L Hufnagel, S Lemke, H Leitte, Reconstruction and Visualization of Coordinated 3D Cell Migration Based on Optical Flow. IEEE Trans. Vis. Comput. Graph. 22(1), 995–1004 (2016)View ArticleGoogle Scholar
- BT Morris, MM Trivedi, A survey of vision-based trajectory learning and analysis for surveillance. IEEE Trans. Circuits Syst. Video Technol. 18(8), 1114–1127 (2008)View ArticleGoogle Scholar
- D. Beymer, P. McLauchlan, B. Coifman & J. Malik, in Computer Vision and Pattern Recognition, 1997. Proceedings., 1997 IEEE Computer Society Conference on. A real-time computer vision system for measuring traffic parameters, (IEEE, 1997), p. 495-501Google Scholar
- J Faraway Julian, P Matthew Reed, W Jing, Modelling three‐dimensional trajectories by using Bézier curves with application to hand motion. J. R. Stat. Soc. Ser. C. Appl. Stat. 56.5(07), 571–585 (2007)View ArticleGoogle Scholar
- B Morris, MM Trivedi, Contextual activity visualization from long-term video observations, in University of California Transportation Center, 2010Google Scholar
- C. T. Lu, A. P. Boedihardjo & J. Zheng, in 22nd International Conference on Data Engineering (ICDE'06). Aitvs: Advanced interactive traffic visualization system, (IEEE, 2006), p. 167-167Google Scholar
- CY Hsieh, YS Wang, Traffic situation visualization based on video composition. Comput. Graph. 54, 1–7 (2016)View ArticleGoogle Scholar
- G Medioni, I Cohen, F Brémond, S Hongeng, R Nevatia, Event detection and analysis from video streams. IEEE Trans. Pattern Anal. Mach. Intell. 23(8), 873–889 (2001)View ArticleGoogle Scholar
- H. Denman, N. …, Video visualization, in Proc. IEEE Visualization 2003, Seattle, pp. 409–416, 2003Google Scholar