An abnormal situation is defined as one that deviates from ordinary behavior and surroundings when judged from a personal, statistical, socio-cultural, or professional viewpoint. Abnormal situations are classified into intrusion, arson, loitering, fall, and assault. Various conventional algorithms exist for detecting abnormal situations, depending on the type of detection, and conventional intelligent surveillance systems run on either computers or small devices.
Abnormal situation detection algorithm
The earliest face recognition method for detecting intruders applies principal component analysis (PCA) to face images and recognizes faces using the resulting eigenfaces [6]. Other methods include reconstructing face images from their discrete cosine transform (DCT) coefficients to improve PCA performance [7]; combining PCA with linear discriminant analysis (LDA) [8]; and applying PCA and LDA to face images captured at distances of 1 to 5 m [9]. Another method extracts features with local binary patterns (LBP) and classifies them with a convolutional neural network (CNN) [10]. A further method recognizes faces with restricted Boltzmann machines (RBM), a deep learning model, using images with varying facial expressions, lighting, and angles [11].
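The LBP feature extraction step mentioned above can be illustrated with a minimal sketch. It assumes the basic 3x3 operator on a grayscale image stored as a list of rows; the neighbour ordering and the function names `lbp_code` and `lbp_histogram` are illustrative choices, not details taken from [10].

```python
def lbp_code(patch):
    """Return the 8-bit LBP code of the centre pixel of a 3x3 patch."""
    center = patch[1][1]
    # Neighbours visited clockwise, starting at the top-left corner
    # (the ordering is an assumption; any fixed ordering works).
    offsets = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    code = 0
    for bit, (r, c) in enumerate(offsets):
        # A neighbour at least as bright as the centre sets its bit.
        if patch[r][c] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(image):
    """Build a 256-bin histogram of LBP codes over all interior pixels."""
    hist = [0] * 256
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            patch = [row[c - 1:c + 2] for row in image[r - 1:r + 2]]
            hist[lbp_code(patch)] += 1
    return hist
```

The resulting histogram is the kind of feature vector that could then be passed to a classifier; the classifier itself is outside the scope of this sketch.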
Fire detection methods are divided into sensor-based methods and image-based methods. Sensor-based methods include one that feeds the values from temperature, smoke, and CO2 sensors into fuzzy logic as parameters [12], and one that uses a device combining 8 sensors, including AMS MOX sensors, PID sensors, and NDIR CO2 sensors [13].
Color-based fire detection methods include one that detects fire colors in the RGB color space and then uses additional information such as fire spread [14], one that calculates the standard deviations of colors in various fire environments [15], and one that uses features of the HSI color space together with optical flow [16]. Another method detects fires with a support vector machine (SVM) using features of the HSI color space, the 2-dimensional discrete wavelet transform (DWT), pixel ratios, and optical flow [17].
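As a rough illustration of color-based detection in the RGB color space, the sketch below marks a pixel as fire-colored when red dominates green and green dominates blue. This rule and the `red_min` threshold are assumptions chosen for illustration; they are not the specific criteria used in [14] to [17].

```python
def is_fire_colored(r, g, b, red_min=190):
    """Heuristic check: bright red channel with R > G > B (assumed rule)."""
    return r >= red_min and r > g > b


def fire_pixel_ratio(pixels, red_min=190):
    """Fraction of (r, g, b) pixels that pass the fire-color check."""
    if not pixels:
        return 0.0
    hits = sum(1 for (r, g, b) in pixels if is_fire_colored(r, g, b, red_min))
    return hits / len(pixels)
```

A color test of this kind produces many false positives on its own (for example, on red clothing or sunsets), which is why the surveyed methods combine it with motion, spread, or texture information.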
Loitering identification methods include measuring how long an object stays in the image [1, 18, 19], and dividing the input image into n by m blocks and measuring the time spent in each block [20, 21]. Another method measures object angles, exploiting the characteristic that a loitering person changes direction more often than other people [22]. Methods that combine 2 conditions to determine loitering include measuring the object time and the block time [23], measuring the object time and using angles [24], and using angles and the object movement distance [25].
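A two-condition check that combines object time with direction changes might look like the following sketch. The track format (a list of `(t_seconds, x, y)` samples), the 90-degree turn threshold, and the `min_dwell_s` and `min_changes` defaults are all hypothetical values chosen for illustration, not parameters from the cited works.

```python
import math


def dwell_time(track):
    """Seconds between the first and last sample of a (t, x, y) track."""
    return track[-1][0] - track[0][0] if len(track) >= 2 else 0.0


def direction_changes(track, min_turn_deg=90.0):
    """Count heading changes of at least min_turn_deg between steps."""
    headings = []
    for (_, x0, y0), (_, x1, y1) in zip(track, track[1:]):
        dx, dy = x1 - x0, y1 - y0
        if dx or dy:  # ignore steps with no movement
            headings.append(math.degrees(math.atan2(dy, dx)))
    changes = 0
    for h0, h1 in zip(headings, headings[1:]):
        turn = abs(h1 - h0) % 360.0
        turn = min(turn, 360.0 - turn)  # smallest angle between headings
        if turn >= min_turn_deg:
            changes += 1
    return changes


def is_loitering(track, min_dwell_s=60.0, min_changes=3):
    """Both conditions must hold: long dwell time and frequent turns."""
    return (dwell_time(track) >= min_dwell_s
            and direction_changes(track) >= min_changes)
```

Requiring both conditions reduces false alarms from people who simply wait in place or who walk an irregular path briefly.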
Fall detection methods are divided into sensor-based methods and image-based methods. Sensor-based methods include attaching 3-axis accelerometers to the body and using the sensor values and acceleration values [26], and combining accelerometers with pressure sensors [27]. A further method uses the accelerometer of a mobile phone: normal acceleration patterns are saved as Activities of Daily Living (ADL), and incoming data is compared with the stored ADL in real time using the nearest neighbor rule (NNR) [28].
Image-based methods include fitting a circle that changes with object motion and using the circle changes together with vertical and horizontal histogram features to determine a fall with an SVM [29]; placing a bounding box around a detected object and using the changing acceleration of the bounding box [30]; and using the aspect ratio, effective area ratio, feature points, axis angles, and contour ratios of the object of interest to determine a fall with an SVM [31].
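The idea of comparing incoming accelerometer data with stored ADL patterns can be sketched as a nearest-neighbor distance test. The feature representation (a short tuple of summary values per window), the Euclidean metric, and the `max_distance` threshold are assumptions for illustration; [28] may define these differently.

```python
import math


def magnitude(ax, ay, az):
    """Overall acceleration magnitude from the three axis readings."""
    return math.sqrt(ax * ax + ay * ay + az * az)


def euclidean(a, b):
    """Euclidean distance between two equal-length feature tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_adl_distance(sample, adl_patterns):
    """Distance from the sample to its nearest stored ADL pattern."""
    return min(euclidean(sample, p) for p in adl_patterns)


def is_fall(sample, adl_patterns, max_distance=0.5):
    """Flag a fall when no stored ADL pattern is close to the sample."""
    return nearest_adl_distance(sample, adl_patterns) > max_distance
```

For example, with hypothetical ADL patterns of (mean, peak) magnitude in g such as `[(1.0, 1.3), (1.0, 1.1), (1.1, 1.6)]`, a window summarized as `(1.4, 3.2)` lies far from every stored pattern and would be flagged.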
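A simplified version of the bounding-box idea is sketched below: a fall candidate is raised when the box changes from upright (taller than wide) to lying (clearly wider than tall) within a few frames. The `lying_ratio` and `max_frames` values are hypothetical, and this rule replaces the SVM classifiers of [29] and [31] with a plain threshold for brevity.

```python
def aspect_ratio(box):
    """Width-to-height ratio of an (x, y, w, h) bounding box."""
    _, _, w, h = box
    if h <= 0:
        raise ValueError("bounding box height must be positive")
    return w / h


def fall_candidate(boxes, lying_ratio=1.2, max_frames=15):
    """True if the box flips from upright to lying within max_frames."""
    last_upright = None
    for i, box in enumerate(boxes):
        ratio = aspect_ratio(box)
        if ratio < 1.0:
            last_upright = i  # most recent frame with an upright box
        elif ratio >= lying_ratio and last_upright is not None:
            if i - last_upright <= max_frames:
                return True
    return False
```

The frame limit distinguishes an abrupt fall from someone lying down slowly; a practical system would also use the additional features listed above.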
Intelligent video surveillance system
Intelligent video surveillance systems, which issue an alert when an abnormal situation occurs in video, are divided into computer-based systems and small-device-based systems. Computer-based intelligent video surveillance systems perform various kinds of detection and have been studied. However, they entail installation and maintenance costs, high power consumption, and the risk of personal information leaks, and are thus not ideal for use in real environments.
One example of a conventional computer-based intelligent video surveillance system is an object tracking system using multiple cameras. Video captured by digital signal processor (DSP) based IP cameras is encoded with the Audio Video coding Standard (AVS) and sent over the IP network by means of the Real-Time Streaming Protocol (RTSP). Processing is distributed across the IP network, which uses GPUs to reduce processing time [32].
A second example is a fall detection system based on body contours. Computers analyze camera video in real time to detect falls and send the information over the network to a hospital server, where it is saved and monitored. The system applies a Gaussian mixture model (GMM) to the input images to detect objects, and determines falls from the aspect ratio and tilt angle of the human body contour, features that require little computation [33]. In a computer-based intelligent video surveillance system, the video from each camera is sent to a central server, which performs several kinds of detection in combination, for example, intruder detection, fire detection, loitering detection, and fall detection.
One example of a conventional intelligent surveillance system based on small devices is an intruder detection system using Raspberry Pi and Arduino. The system detects motion in the input images with MOG2 and identifies human bodies by their size. It then detects faces with Haar-like features and identifies intruders through fisherfaces-based face recognition. When an intruder is identified, the user is alerted by e-mail and can view the video remotely through a web interface [34].
A second example is an intruder detection system using Raspberry Pi. When motion is sensed in the input video, the system saves the video to a cloud server for later examination. It detects objects using differential images and uses Haar-like features to identify human bodies. When an object is identified as a human body, the system uses a GSM module to send a message notifying the user of the abnormal situation [35].
Last, an unmanned aerial vehicle (UAV) fire detection system uses a QuaRC-based single Gumstix. The system determines fires from fire color information, obtained in the Lab color space, and motion information, obtained from optical flow. It uses sliding mode control (SMC) and a linear quadratic regulator (LQR) to reduce calculation time and prevent chattering [36].
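Object detection with differential images can be sketched as a per-pixel comparison of two consecutive grayscale frames. The frame format (lists of rows of 0 to 255 values) and the `threshold` and `min_pixels` defaults are assumptions for illustration, not settings reported in [35].

```python
def diff_mask(prev, curr, threshold=25):
    """Binary mask: 1 where the absolute frame difference exceeds threshold."""
    return [
        [1 if abs(c - p) > threshold else 0 for p, c in zip(prev_row, curr_row)]
        for prev_row, curr_row in zip(prev, curr)
    ]


def motion_detected(prev, curr, threshold=25, min_pixels=3):
    """Report motion when enough pixels changed between the two frames."""
    mask = diff_mask(prev, curr, threshold)
    return sum(map(sum, mask)) >= min_pixels
```

The pixel-count condition filters out isolated sensor noise; the changed region would then be handed to a classifier, such as one based on Haar-like features, to decide whether it is a human body.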
Optimization
Several studies reduce processing time through optimization when using small devices with lower specifications than a computer. First, one study reduces processing time in low-specification mobile environments by changing how the front-end browser loads content. Data is classified by type, such as text and images, and the text layout, which has a relatively low load, is displayed first to improve perceived speed; the rendering is then reconstructed after the screen is displayed. In addition, when rendering takes a long time, the image size is reduced or the image quality is lowered to shorten processing time [37].
T.W. Lee conducted research to reduce the boot time of embedded modules and speed up application processing. The techniques include software suspend, which relies on the principle that the memory state and registers change when a program moves from the operating state to the suspend state; a lightweight root file system with improved decompression efficiency; and the JFFS2 file system, which offers high compression efficiency and low process usage. The embedded module used in the experiment is the XP-100 model manufactured by Huins [38].
J.W. Kang analyzed the structural features of the mobile enterprise application platform (MEAP) and the factors that slow it down, and then studied how to reduce the processing time of mobile applications using front-end optimization techniques. Adding an Expires or a Cache-Control header allows resources to be requested from a local location rather than from the server. Components are compressed with gzip, and HTTP requests are minimized by merging scripts that had been split into separate files [39].
Finally, C. C. Paglinawan performed optimization on Raspberry Pi for the real-time operation of a vehicle speed calculation system. The vehicle detection time was shortened by implementing the GMM for vehicle detection and the Kalman filter (KF) for vehicle tracking with OpenCV. In addition, sparse random projection (SRP) using scikit-learn reduced processing time by compressing images, projecting high-dimensional video frames into a low-dimensional subspace [40].
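To make the caching technique concrete, the sketch below builds the two standard HTTP response headers that let a client reuse a local copy instead of contacting the server. It assumes the header described in the text is the HTTP Expires header; the function name `caching_headers` and the `public` directive are illustrative choices, not details from [39].

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def caching_headers(max_age_s, now=None):
    """Return Cache-Control and Expires headers for a cacheable resource."""
    now = now or datetime.now(timezone.utc)
    expires = now + timedelta(seconds=max_age_s)
    return {
        # Cache-Control expresses the lifetime in seconds.
        "Cache-Control": f"public, max-age={max_age_s}",
        # Expires expresses the same deadline as an HTTP date in GMT.
        "Expires": format_datetime(expires, usegmt=True),
    }
```

When both headers are present, clients that understand Cache-Control use its `max-age` value and ignore Expires, so the two are usually set to agree.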
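The projection of a high-dimensional frame into a lower-dimensional space can be sketched in plain Python as follows. This is an illustration of the general sparse random projection construction, not the scikit-learn implementation used in [40]; the default density of 1/sqrt(n_features) and the function names are assumptions.

```python
import math
import random


def sparse_projection_matrix(n_features, n_components, density=None, seed=0):
    """Random matrix whose entries are mostly zero, else +/- a fixed scale."""
    if density is None:
        density = 1.0 / math.sqrt(n_features)  # assumed default density
    rng = random.Random(seed)
    scale = math.sqrt(1.0 / density) / math.sqrt(n_components)
    matrix = []
    for _ in range(n_components):
        row = []
        for _ in range(n_features):
            u = rng.random()
            if u < density / 2:
                row.append(-scale)
            elif u < density:
                row.append(scale)
            else:
                row.append(0.0)
        matrix.append(row)
    return matrix


def project(frame, matrix):
    """Project a flattened frame onto the rows of the projection matrix."""
    return [sum(w * x for w, x in zip(row, frame) if w) for row in matrix]
```

Because most entries are zero, each output value depends on only a small fraction of the pixels, which is what makes the projection cheap on a low-specification device.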
Intelligent video surveillance systems based on small devices use camera video to perform a single kind of detection, for example, intruder detection, fire detection, or motion detection. Further study is needed on small-device-based systems that perform integrated detection, in order to address the personal information leaks and high power consumption of computer-based systems and to operate efficiently in real environments. In addition, optimization is required for intelligent video surveillance systems to operate in real time on low-specification devices.