Motion Mining & Detection in Video Streams
February 27, 2012 Leave a comment
Motion Mining & Detection in Video Streams
Ali Tarhini, Fatima Hamdan, Hazem Hajj
Department of Electrical and Computer Engineering
American University of Beirut
Abstract–Motion detection is the process of finding and extracting motion in continuous media. There are many approaches for motion detection in continuous video streams. All of them are based on comparing the current video frame with one from the previous frames or with something known as the “background”. This method is useful in video compression when estimating changes in the frame is the only requirement, not the whole frame. Although this algorithm is simple, it suffers from disadvantages. If the object is moving smoothly, only small changes from frame to frame is detected. Another problem is when the object is moving slowly, the algorithms will not give any useful result at all. Another approach is to compares the current frame to the first frame in the video sequence rather than comparing with the previous one. But, this approach fails in practice because it will always have motion detected on the same region since the initial frame is static. In this paper, we propose a new method which overcomes the problems of previous methods by creating a moving frame over a time interval and comparing against this frame. Our tests show that the proposed algorithm outperforms the previous ones with speed and accuracy.
Keywords: Motion Mining, Video Stream, Detection.
Motion detection is the process of finding and extracting motion in continuous media. There are many approaches for motion detection in continuous video streams. All of them are based on comparing the current video frame with one from the previous frames or with something known as the “background”. One of the most common approaches is to compare the current frame with the previous one. This method is useful in video compression when estimating changes in the frame is the only requirement, not the whole frame. Although this algorithm is straightforward in motion mining, it suffers from disadvantages. If the object is moving smoothly, only small changes from frame to frame is detected. So it’s impossible to get the whole moving object. Another problem is when the object is moving slowly, the algorithms will not give any useful result at all. There is another approach that compares the current frame the first frame in the video sequence rather than comparing with the previous one. So, if there were no objects in the initial frame, comparison of the current frame with the first one will give us the whole moving object independently of its motion speed. But, this approach fails in practice because if there was, for example, a car on the first frame, but then it is gone then this method will always have motion detected on the same place where the car was. Although one workaround over this problem is to renew the initial frame at regular intervals, but still it will not yield good results in the cases where the probability of guaranteeing that the first frame will contain only static background. But there can be an inverse situation. For example, if we add a picture on the wall in the room we will get motion detected continuously until the initial frame is renewed.
The most efficient motion mining algorithms are based on building a “background”, also known as the “scene” and comparing each current frame with that scene. There are many approaches to build the scene, but most of them are too complex and require significant computation time which consumes excessive processing power and introduce latency which is not accepted in real time systems. In this paper, we introduce a new approach for motion mining using the scene building method while improving performance over the current algorithms.
BACKGROUND & PREVIOUS WORK
One of the widely known algorithms in motion detection is the Mask Motion Object Detection(MMOD) algorithm. This algorithm works in steps that involve background subtraction, frame difference, frame difference mask and background difference mask are generated and utilized to detect moving objects. First, a threshold value is calculated under the assumption that camera noise obeys Gaussian distribution and background change is caused mainly by camera noise. The frame difference is found by subtracting the current frame from the previous frame. The background difference is found by subtracting the current frame from the background frame. Then the frame difference mask and the background difference mask are found by comparing the obtained frame difference and background difference with the threshold. N(x, y, t) denotes the total frame number that pixel (x,y) may belong to background region. If N(x, y, t) >= fthreshold which shows that the probability which pixel belong to background region is high, then the background difference mask is used to detect moving objects otherwise the frame difference mask is used. To detect moving objects, the current frame is anded with the chosen mask.
Motion Region Estimation Technique(MRET) is another algorithm which also uses image subtraction for motion detection. It starts by processing the output image obtained from image subtraction after thresholding and noise removal. Thresholding determine the areas of output image (from subtraction) consists of pixels whose values lie within the threshold value range. Threshold value also indicates the sensitivity of motion to
detect. After thresholding, the image may still contain a small amount of noise. Noise is removed by using median filtering method. Median filtering is more effective than convolution when the goal is to simultaneously reduce noise and preserve edges. The gray level of each pixel is replaced by the median of the gray levels in a neighborhood of that pixel. Motion region information can be obtained by using AND and OR double difference image IO during different time t and frame n from video frames.
Given a background frame representing the initial frame, a current frame representing the latest frame captured from the video source. As a preprocessing approach, both the background frame and the current frame are converted to grayscale. These grayscale images will be used throughout the rest of the algorithm. The algorithm then proceeds to compare the current frame with the background frame. But this would give the results we discussed earlier that suffer from many problems. The tweak here is to “move” the background frame in time at regular intervals to approach the current frame, but not reach it because if the two frames match, the difference becomes zero and therefore motion will not be determined. In order to approach the current frame from the background frame, the color value of the pixels in the background frame is changed by one level per frame.
The algorithm takes two parameters as input, the background frame and the current frame then it works as follows:
Create a grayscale version of the background frame
Create a grayscale version of the current frame
Subtract the current frame from the background frame to yield the difference frame
Render the difference on screen
Check whether X frame count have passed(X=threshold)
If step5 evaluates to true then merge background frame with the current frame. Otherwise go back to step 3 and repeat.
The goal of this experiment was to study the performance of the proposed algorithm. A camera is setup in a room and its video streams is fed into 3 motion detection algorithms. The first one uses the background for comparison, the 2nd one used the previous frame for comparison and the 3rd one is our proposed algorithm that uses a moving background for comparison with the current frame.
Figure1: Comparison with background
Figure2: Comparison with previous frame
Figure3: Comparison with moving background
In figure1, the comparison with the background frame detects the entire person in the picture. This looks good for the first sight but it’s obvious that the approach is not feasible in practice because the background happened to be static, were it dynamic(i.e clock on the wall)…
In figure2, the person was walking slowly and with minimum movement. As we can see, the detected region of motion appears to be cluttered and disconnected.
In figure3, the person was also moving slowly and with minimum movement. We can see the great improvement in the detected region as all the body was detected as one object. The reason for the improvement from the algorithm used in figure2 is that the comparison was done with a combination of previous frames in order to approximate the state of the background frame a little while ago. This approach is better because if the comparison was with the previous frame, only a small difference will be detected. This shows that our algorithm outperforms the previous algorithms.
In this paper, a motion detection algorithm was implemented by applying a new approach to improve the accuracy of existing motion detection algorithms. The implementation this technique increased the efficiency of the detected region when the object is subject to slow and/or smooths movement.