Motion Detection with OpenCV and Python

Motion detection is obviously useful in security cameras, but I first became interested in deploying it on a camera I built using a Raspberry Pi 3 Model B with an attached telephoto lens. Its purpose was to detect vehicles passing by and take a photo with enough clarity and magnification to read each license plate. To accomplish this I used and modified the pyimagesearch code found at:

Pyimagesearch

It uses image processing from OpenCV and is presented as lightweight enough to run on the RPi3. I quickly found that while it works well for an indoor scene that is largely static, it had to be modified to work for outdoor scenes with changing lighting conditions, moving shadows, trees blowing in the wind, and all sorts of other changes that one wants to ignore. So I came up with the code presented here, which works well for detecting passing vehicles while ignoring things like someone walking by.

General Approach

Stepping Through the Code

While Loop

Pi Camera

Frame Capture of Vehicle

Depiction of Operation of the Code

Logged Output of Pi Camera

Final Comments

General Approach

The pyimagesearch code essentially uses image subtraction and contours to both detect motion and track the object. Its sensitivity adjustment mainly consists of ignoring contours that fall below a minimum area threshold. The reference frame that is subtracted from subsequent frames remains unchanged.

I quickly discovered a few things. First, upon starting the Python script, the camera sensor needs time to stabilize before frames are used. Second, the reference frame (called firstFrame in the code) must be updated often and repeatedly; I found that an update on every 20th loop was a good compromise. Third, besides using the area of a contour to decide upon an object, the number of points in the contour can also be used to some effect, as can varying the degree of blur.

I also added a few other things. Information is printed during each loop, and each loop is timed in order to understand how well the RPi3 handles the processing load. Images are saved in one of two ways: to fulfill the original purpose, full HD images are saved as PNG files so license plates can be read; alternatively, reduced-size images are saved with a box drawn around the object and a time stamp on the image.
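As a minimal sketch of the two save modes (the helper name and flag here are illustrative, not part of the script below):

# illustrative sketch of the two save modes described above
import cv2

SAVE_FULL_RES = True  # True: full HD PNG for plate reading; False: smaller annotated image

def save_detection(frame, frame1, image_index):
    # "frame" is the full-resolution capture, "frame1" the resized,
    # annotated copy; PNG is lossless, so plate detail is preserved
    target = frame if SAVE_FULL_RES else frame1
    cv2.imwrite("images/frame%d.png" % image_index, target)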

Stepping Through the Code

If not already installed, run pip install imutils. The following are the preliminaries prior to the while loop.

# import the necessary packages
from imutils.video import VideoStream
import argparse
import datetime
import imutils
import time
import cv2

# construct the argument parser and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-v", "--video", help="path to the video file")
# default changed to 1000 from 500
ap.add_argument("-a", "--min-area", type=int, default=1000, help="minimum area size")
args = vars(ap.parse_args())

# if the video argument is None, then we are reading from the Pi camera
if args.get("video", None) is None:
    vs = VideoStream(src=0, usePiCamera=True, resolution=(1920, 1080), framerate=30).start()
    time.sleep(2.0)  # give the camera sensor time to stabilize

# otherwise, we are reading from a video file
else:
    vs = cv2.VideoCapture(args["video"])

# initialize the first frame in the video stream
firstFrame = None

# we will count frames
frame_cnt = 0
image_index = 1
num_saved_image = 0
save_flag = 0

The option to input a video file instead of using the camera is useful when debugging. Note the particular syntax for VideoStream when using a Pi camera. A sleep time of 2 seconds was enough to stabilize the camera sensor. Finally, some counters are initialized that are used later.
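Assuming the script is saved as, say, motion_detect.py (the file name here is arbitrary), it can be run either against the live camera or against a recorded clip:

# live Pi camera (the default when --video is omitted)
python motion_detect.py --min-area 1000

# a pre-recorded clip, handy for debugging
python motion_detect.py --video test_clip.mp4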

While Loop

The rest of the code consists of the while loop. It starts out reading a frame, rotating the image if needed, resizing it to fewer pixels, converting it to grayscale, and then blurring it. This is where one can fine-tune things by either increasing or decreasing the blur. If this is the very first time through the while loop, then this frame "gray" is copied to "firstFrame" and the while loop starts over.

# loop over the frames of the video
while True:
    # measure loop time
    start_loop_time = time.clock()

    # grab the current frame and initialize the occupied/unoccupied text
    frame = vs.read()
    # VideoCapture returns a (grabbed, frame) tuple; VideoStream returns the frame itself
    frame = frame if args.get("video", None) is None else frame[1]

    # if the frame could not be grabbed, then we have reached the end
    # of the video
    if frame is None:
        break

    # rotate the image if needed for the camera orientation
    frame = imutils.rotate(frame, 180)
    text = "Unoccupied"

    # resize the frame, convert it to grayscale, and blur it
    # the original frame is kept for possible saving to a file
    frame1 = imutils.resize(frame, width=500)
    gray = cv2.cvtColor(frame1, cv2.COLOR_BGR2GRAY)
    # blur kernel dimensions must be odd numbers - even numbers throw an error
    gray = cv2.GaussianBlur(gray, (21, 21), 0)
    # gray = cv2.GaussianBlur(gray, (41, 41), 0)  # a heavier blur, if needed

    # the background frame (firstFrame) is renewed every so often
    # this solves a lot of the problems and makes this program work
    frame_cnt = frame_cnt + 1

    # if the first frame is None, initialize it
    if firstFrame is None:
        firstFrame = gray
        continue

At this point some serious image processing begins in order to determine whether the scene has changed. The captured scene has been processed into frame "gray", and the background scene "firstFrame" is now subtracted from it, producing a grayscale frame labelled "frameDelta". Because cv2.absdiff takes an absolute difference, "frameDelta" will be nearly black, carrying only faint gray noise, if no new object has come into the scene relative to "firstFrame". To make further processing easier we want that noise forced to solid black and genuine changes forced to white, so this frame is thresholded, producing frame "thresh1". Note that one may need to adjust the threshold constant to get this right. Finally, "thresh1" is dilated to render the final frame "thresh".

    # compute the absolute difference between the current frame and
    # first frame
    frameDelta = cv2.absdiff(firstFrame, gray)
    # thresh = cv2.threshold(frameDelta, 25, 255, cv2.THRESH_BINARY)[1]
    thresh1 = cv2.threshold(frameDelta, 50, 255, cv2.THRESH_BINARY)[1]

    # dilate the thresholded image to fill in holes, then find contours
    # on thresholded image
    thresh = cv2.dilate(thresh1, None, iterations=2)
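To see what the threshold constant does in isolation, one can experiment on a small synthetic delta image; a quick sketch (the values are illustrative):

import numpy as np
import cv2

# fake "frameDelta": mostly dark noise plus one genuinely changed region
delta = np.zeros((6, 6), dtype=np.uint8)
delta[1:3, 1:3] = 30     # sensor noise or a faint shadow - below the 50 cutoff
delta[3:5, 3:5] = 120    # a genuine change - above the cutoff

# pixels above 50 become 255 (white), everything else 0 (black)
mask = cv2.threshold(delta, 50, 255, cv2.THRESH_BINARY)[1]
# dilation then grows the surviving white block, filling small holes
grown = cv2.dilate(mask, None, iterations=2)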

We are now ready to find contours. If there have been changes in the scene relative to the most recent reference background, the "thresh" frame will contain regions of white on the black, unchanged portions of the scene. The contours found are really a list of Point arrays, each Point array bounding a region of white. In the following, num_arrays is the number of Point arrays found (the number of white regions).

    cnts = cv2.findContours(thresh.copy(), cv2.RETR_EXTERNAL,
        cv2.CHAIN_APPROX_SIMPLE)
    # note: at this point cnts is still the raw return tuple, so this
    # prints its length, not the contour count (hence the constant 2
    # in the logged output below)
    print "number of contours", len(cnts)
    cnts = cnts[0] if imutils.is_cv2() else cnts[1]
    num_arrays = len(cnts)
    print "number of numpy arrays in first contour", num_arrays
    if num_arrays == 0:
        num_points = 0  # so that the firstFrame update still works
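The structure of what findContours returns can be seen on a toy image; a minimal sketch (the version-proof unpacking mirrors what the imutils.is_cv2() line above does):

import numpy as np
import cv2

canvas = np.zeros((100, 100), dtype=np.uint8)
cv2.rectangle(canvas, (10, 10), (40, 40), 255, -1)   # one white region
cv2.circle(canvas, (70, 70), 15, 255, -1)            # a second one

res = cv2.findContours(canvas, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = res[0] if len(res) == 2 else res[1]   # contours list, any OpenCV version
for c in cnts:
    # each contour is an array of boundary points enclosing an area
    print len(c), cv2.contourArea(c)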

If num_arrays is nonzero, the following loop will be entered. Otherwise, there are no contours for the for loop to iterate over and nothing further to process.

    # loop over the contours
    for c in cnts:
        num_points = len(c)
        print "number of points in numpy array", num_points
        # if the contour is too small, ignore it
        if cv2.contourArea(c) < args["min_area"]:
            continue

        # compute the bounding box for the contour, draw it on the frame,
        # and update the text
        (x, y, w, h) = cv2.boundingRect(c)
        cv2.rectangle(frame1, (x, y), (x + w, y + h), (0, 255, 0), 2)
        text = "Occupied"
        print "number of points in numpy array greater than min_area", len(c)
        # the following condition filters out someone walking by but allows the cars;
        # the start-up frames are also ignored
        if len(c) > 130 and image_index > 0:
            # throttle saving here if needed: % 1 saves every qualifying
            # frame, % 3 would save only every third
            if image_index % 1 == 0:
                num_saved_image += 1
                print "saving image number: " + str(num_saved_image)
                # cv2.imwrite("images/frame%d.png" % image_index, frame)
                save_flag = 1
        image_index += 1

Inside the for loop each Point array is evaluated. The number of points in the array is obtained, and the area bounded by the respective Point array is compared against the value of "min_area". If too small, the loop continues to the next Point array. Point arrays passing this test are then used to draw a bounding rectangle around the contour in the original resized frame ("frame1"). Then, depending upon one's choice, either this frame or the original full-sized frame ("frame") is saved, but only if it passes one more test. In this application I wanted to screen out people walking by, and requiring the number of points in the Point array to exceed 130 succeeded at this. Finally, there is some code to restrict the saving of such frames to every frame, every second frame, every third frame, and so on, if practice shows too many frames are being saved. For example, in my application, when a car appears I only care to save one image of the event. If the frame always has more than one contour large enough to save, taking the mod of the running image index as shown can be used to limit the saving of images.
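For reference, the throttling idea with an explicit rate (the code above uses % 1, i.e., every qualifying frame) would look like this; SAVE_EVERY is an illustrative name:

SAVE_EVERY = 3   # 1 = every qualifying frame, 3 = every third, and so on
if len(c) > 130 and image_index % SAVE_EVERY == 0:
    save_flag = 1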

Next, the resized "frame1" has the status text and a time stamp added to it. If one wants to save this annotated frame, it is done here. Note that the image index is always used as part of the file name.

    # draw the text and timestamp on the frame
    cv2.putText(frame1, "Car Status: {}".format(text), (10, 20),
        cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 0, 255), 2)
    cv2.putText(frame1, datetime.datetime.now().strftime("%A %d %B %Y %I:%M:%S%p"),
        (10, frame1.shape[0] - 10), cv2.FONT_HERSHEY_SIMPLEX, 0.35, (0, 0, 0), 1)
    if save_flag == 1:
        cv2.imwrite("images/frame%d.png" % image_index, frame1)
        save_flag = 0

Here we show all of the processed frames on the screen.

        cv2.namedWindow("Frame Delta")
        cv2.moveWindow("Frame Delta", 600, 500)
        cv2.namedWindow("Thresh")
        cv2.moveWindow("Thresh", 600, 50)
        cv2.namedWindow("firstframe")
        cv2.moveWindow("firstframe", 0,500)
        cv2.namedWindow("Thresh_no_dilate")
        cv2.moveWindow("Thresh_no_dilate", 1200, 50)
        cv2.namedWindow("gray")
        cv2.moveWindow("gray", 1200, 500)

	# show the frame and record if the user presses a key
	cv2.imshow("Security Feed", frame1)
	cv2.imshow("Thresh", thresh)
	cv2.imshow("Frame Delta", frameDelta)
        cv2.imshow("firstframe", firstFrame)
        cv2.imshow("Thresh_no_dilate", thresh1)
        cv2.imshow("gray", gray)
	key = cv2.waitKey(1) & 0xFF
 
	# if the `q` key is pressed, break from the lop
	if key == ord("q"):
		break
        if key == ord("n"):
                firstFrame = gray
                frame_cnt = 0
                print "firstFrame updated"

Now we arrive at the end of the while loop, where the most important feature has been added: the background frame ("firstFrame") is updated every so often. Every 20 frames was found sufficient for this application. Care is taken to update it only when the current frame essentially contains no contours.

    # after 20 frames we want to update firstFrame, but not if it
    # contains a vehicle or a portion of a vehicle. we set it to the
    # present "gray" image unless we think that image still has a
    # vehicle in it, i.e. we update only if num_arrays <= 1 and
    # num_points <= 20
    if frame_cnt >= 20:
        if num_arrays <= 1 and num_points <= 20:
            firstFrame = gray
            frame_cnt = 0
            print "firstFrame updated"

    # end of loop
    end_loop_time = time.clock()
    print "outside loop time is " + str(end_loop_time - start_loop_time) + " seconds  " + "frame_cnt= " + str(frame_cnt)

# cleanup the camera and close any open windows
vs.stop() if args.get("video", None) is None else vs.release()
cv2.destroyAllWindows()

Pi Camera

Here is the particular camera assembly with telephoto lens used in this effort. With such a heavy lens the assembly needs to be sturdy.

PiCamera

Frame Capture of Vehicle

Below is an example of a full HD capture of a vehicle where the license plate is readable. Besides the use of a telephoto lens, the exposure time must be set short and the file must be saved as a lossless PNG.

Vehicle Detect
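If one controls the camera directly with the picamera library (the script above instead goes through imutils' VideoStream), a short fixed exposure can be set roughly like this; the numbers are only a starting point, not the settings used for the capture above:

from picamera import PiCamera
import time

camera = PiCamera(resolution=(1920, 1080))
camera.iso = 800                # raise the gain so the short exposure stays bright
time.sleep(2)                   # let the auto-gain settle before locking
camera.shutter_speed = 2000     # microseconds; short enough to freeze a passing car
camera.exposure_mode = 'off'    # lock the exposure settings
camera.capture('plate.png')     # PNG is lossless, preserving plate detail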

Depiction of Operation of the Code

This short video shows the code in operation. The script was run on a Jetson TX1 with a CSI camera, since it runs so much faster and smoother there than on the Raspberry Pi.

Logged Output of Pi Camera

In the code shown above the while loop time is clocked. It is quite long, typically more than 0.5 seconds, when running on the Raspberry Pi 3, Model B. In the portion of the output shown below there are a few things worth pointing out besides the loop time.

  1. When the number of numpy arrays is zero, the loop continues to the next image.
  2. When two numpy arrays are found, the number of points in each array is checked and the array with 83 points passes the min_area test but the image is not saved since it does not meet the 130 point threshold.
  3. Later a frame is found with 7 numpy arrays (7 different white regions in the scene) and one of these has 585 points. This image is then saved.
  4. Lastly, the frame count (“frame_cnt”) rolls over after 19 when a new “firstFrame” is saved.
number of contours 2
number of numpy arrays in first contour 0
outside loop time is 0.588429 seconds  frame_cnt= 14
number of contours 2
number of numpy arrays in first contour 0
outside loop time is 0.597271 seconds  frame_cnt= 15
number of contours 2
number of numpy arrays in first contour 2
number of points in numpy array 13
number of points in numpy array 83
number of points in numpy array greater than min_area 83
outside loop time is 0.708429 seconds  frame_cnt= 16
number of contours 2
number of numpy arrays in first contour 7
number of points in numpy array 9
number of points in numpy array 15
number of points in numpy array 15
number of points in numpy array 28
number of points in numpy array 8
number of points in numpy array 64
number of points in numpy array greater than min_area 64
number of points in numpy array 585
number of points in numpy array greater than min_area 585
saving image number: 2
outside loop time is 0.559276 seconds  frame_cnt= 17
number of contours 2
number of numpy arrays in first contour 3
number of points in numpy array 110
number of points in numpy array greater than min_area 110
number of points in numpy array 28
number of points in numpy array 8
outside loop time is 0.560198 seconds  frame_cnt= 18
number of contours 2
number of numpy arrays in first contour 0
outside loop time is 0.57313 seconds  frame_cnt= 19
number of contours 2
number of numpy arrays in first contour 0
firstFrame updated
outside loop time is 0.521956 seconds  frame_cnt= 0
number of contours 2
number of numpy arrays in first contour 0
outside loop time is 0.585473 seconds  frame_cnt= 1
number of contours 2
number of numpy arrays in first contour 0
outside loop time is 0.533369 seconds  frame_cnt= 2
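As an aside, time.clock() reports CPU time on Linux rather than elapsed wall time; a wall-clock measurement of the loop body would look like this (a sketch, not the logging used above):

import time

start_loop_time = time.time()   # wall-clock seconds
# ... body of the while loop ...
end_loop_time = time.time()
print "loop time %.3f seconds" % (end_loop_time - start_loop_time)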

Final Comments

The code presented here worked well for the purpose for which it was designed. However, running it on the Raspberry Pi 3, Model B gave a rather slow loop time; the Pi is asked to do a lot of heavy processing in an interpreted language, Python. The video shows how much better the code runs on a far more powerful machine (i.e., the Jetson TX1): the loop time for the same Python script is more than 10 times shorter, an impressive 0.04 seconds. Subsequently, I recast this code as a C++ program, compiled it on the Jetson TX1, and now use that to record motion events from my security cameras.
