For our ECE 5725 Design with Embedded Operating Systems final project, we created an autonomous object tracking turret. The turret locates blue objects in real time and tracks them autonomously with two degrees of freedom (rotation and tilt). It can also be remotely controlled to emit a laser beam and to fire a rubber band at the target.
Inspired by the surface-to-air missile in the movie Olympus Has Fallen and excited by the growing number of useful computer vision applications, we wanted to create a project that integrates both elements. This fascination led us to build the autonomous object tracking turret. At a high level, a Raspberry Pi serves as the brain of the turret. The Pi processes images captured by the camera module to detect the target object, then outputs feedback control signals that move the servos to follow it. In addition, when the Pi receives keyboard commands, it can fire a laser beam and/or a rubber band at the target object.
The hardware of our turret consists of a Raspberry Pi, a pan-tilt kit, a camera module, a laser, and a rubber band firing mechanism. We started by building the pan-tilt kit, a neatly packaged kit that gave us the motion we wanted for the turret. It included all the screws, the plastic housing, and two micro servos (with feedback) that can each move 180 degrees. Once it was assembled, we began testing and controlling the two motors, one for rotation and one for tilt. Interestingly, despite the popularity of these micro servos, we could not find a reliable datasheet that specified their operating frequency; only the pulse widths were provided. We therefore started testing them at 50 Hz, the same frequency as the Parallax servo used in class, and it worked for these servos as well. The datasheet specified a 1.5 ms pulse as the center position, 2 ms as all the way left, and 1 ms as all the way right. At 50 Hz (a 20 ms period), these pulse widths correspond to duty cycles of 5% to 10%, with 7.5% at center (duty cycle = pulse width / period). However, when we tested the motors, these datasheet duty cycles gave us less than the specified 180 degrees. After further experimentation, we found that the full 180 degrees of motion required a duty cycle range from 3.5% (full right) to 13% (full left). Because the plastic housing of the pan-tilt kit physically limits the tilt motor to about 130 degrees of motion, we set the duty cycle range of the tilt servo to 4% to 11%.
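The duty cycle arithmetic above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, the mapping from angle to duty cycle is assumed to be linear, and the convention that 0 degrees is full right and 180 degrees is full left is an assumption made for the example.

```python
PWM_FREQ_HZ = 50.0  # servo PWM frequency reported in the text


def pulse_to_duty(pulse_ms, freq_hz=PWM_FREQ_HZ):
    """Convert a pulse width in milliseconds to a duty cycle in percent."""
    period_ms = 1000.0 / freq_hz  # 20 ms at 50 Hz
    return 100.0 * pulse_ms / period_ms


def angle_to_duty(angle_deg, min_duty=3.5, max_duty=13.0, max_angle=180.0):
    """Map an angle onto the measured duty cycle range.

    Assumptions: the response is linear, 0 degrees is full right (3.5 %)
    and 180 degrees is full left (13 %).
    """
    angle_deg = max(0.0, min(max_angle, angle_deg))  # clamp to the valid range
    return min_duty + (max_duty - min_duty) * angle_deg / max_angle


if __name__ == "__main__":
    # Datasheet pulse widths: full right, center, full left
    for pulse in (1.0, 1.5, 2.0):
        print(pulse, "ms ->", pulse_to_duty(pulse), "%")
    print("90 degrees ->", angle_to_duty(90), "%")
```

For the tilt servo, the same helper could be called with min_duty=4.0 and max_duty=11.0, although the angle that range spans (about 130 degrees) would have to be measured on the actual kit.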
Once the pan-tilt servos were moving to the desired angles, we integrated the servo control with the camera to allow autonomous tracking of an object. As seen in the image above, the camera module is attached directly to the pan-tilt kit so that both the rotation and tilt movements change the direction the camera points, letting us view a full half of the room. We secured the camera with rubber bands because they were strong enough to hold it in place while still allowing minor adjustments when needed. In addition to the camera, we mounted a laser pointer on one side of the turret; its control line lets software switch the laser on and off, though its range was a mere 5 feet. A distinguishing feature of our project was a DIY rubber band shooting mechanism. To weaponize the turret, we hot glued a third, continuous rotation servo to the back of the turret and a grated chopstick to one side. On command, the servo rotates and releases the rubber band at the current rotation and tilt angle. The Parallax continuous rotation servo that operates the firing mechanism ran at 1000 Hz, with a 1.5% duty cycle giving full clockwise rotation.
The software design of our system is divided into three main parts: the object recognition algorithm, the PID control algorithm, and the multiprocessing algorithm.
Object Recognition Algorithm
Our object recognition algorithm detects the largest blue object in the camera’s view and extracts its center position. The OpenCV library makes this straightforward. We first installed OpenCV on our RPi. There are several ways to do so; we installed it directly from the RPi Ubuntu application repository by entering the command “sudo apt-get install libopencv-dev python-opencv” at the command prompt. Although the repository only offers OpenCV 2.4.9 (the latest version is 3.4), this saved us several hours of installation time and got OpenCV running within minutes, and version 2.4.9 provided all the functions we needed. Our object recognition algorithm can be broken down into the following steps.
- cv2.VideoCapture(0) is first used to create a video capture object that returns a picture array whenever videoCap.read() is called. At the default 640×480 image resolution, the array has shape 480x640x3 (rows, columns, channels); the depth of 3 holds the three color channels (blue, green, red), which OpenCV stores in BGR order.
- cv2.cvtColor is used to convert the picture array from the camera’s red/green/blue color space (BGR channel order in OpenCV) to the HSV (hue, saturation, value) color space, because a color is easier to detect in HSV.
- cv2.inRange checks each pixel of the picture array against the blue range and creates a mask of the pixels that fall inside it. Before feeding the mask to the time-intensive steps that follow, we sum its values and compare the sum with a threshold. If the sum exceeds the threshold, we treat it as a valid blue object detection and move on. If it falls below the threshold, we discard the frame as noise and skip the remaining steps to avoid unnecessary heavy computation.
- cv2.morphologyEx is called twice to apply the open and close operations to the mask, which removes much of the background noise, separates connected objects, and solidifies objects with hazy boundaries.
- cv2.findContours is used to extract the boundaries of the objects.
- Once we have the boundary of an object, cv2.moments is used to extract the spatial moments, from which the center position of the object can be computed. Since we track only one object at a time, we loop through all the contours returned by cv2.findContours to find the largest one and pass only that contour to cv2.moments for center extraction.
PID Control Algorithm
The center position extracted by the object recognition algorithm is our target position. The turret’s current position is the center of the camera frame, (320, 240), which is half of the 640×480 image resolution in each dimension. The difference between the target position and the current position serves as the feedback error used to drive the servos and align the camera with the target, thereby achieving tracking. A simple way to implement tracking is to move the camera toward the target at a constant, slow rate. For example, if the turret detects the object in the top right corner relative to its current position, it moves the servos incrementally to the right and incrementally upward. With successive small, constant movements, the turret eventually reaches the target. However, this method is inefficient because the servos move at the same rate regardless of how close or far the turret is from the target. To reach the target faster, we implemented a PID control algorithm instead.
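As a rough sketch of the constant-rate baseline described above, the error can be computed per axis and turned into a fixed-size step. The step size, the deadband, and the sign convention (a positive error producing a positive adjustment) are assumptions for illustration; the real sign depends on how the servos are mounted.

```python
FRAME_CENTER = (320, 240)  # center of the 640x480 frame, as in the text


def pixel_error(target, center=FRAME_CENTER):
    """Return the (x, y) offset of the target from the frame center, in pixels."""
    return target[0] - center[0], target[1] - center[1]


def constant_step(error, step=0.05, deadband=10):
    """Return a fixed-size adjustment toward the target.

    `step` (duty cycle percent per loop) and `deadband` (pixels) are
    hypothetical values. The magnitude never changes with the error,
    which is exactly why this approach is slow for distant targets.
    """
    if abs(error) <= deadband:
        return 0.0
    return step if error > 0 else -step


if __name__ == "__main__":
    err_x, err_y = pixel_error((500, 100))
    print(err_x, err_y)
    print(constant_step(err_x), constant_step(err_y))
```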
PID stands for proportional, integral, and derivative. Each part of the controller helps the motors move to the desired location quickly and with minimal oscillation. Following some resources recommended by Professor Skovira, we started with only the proportional part. The proportional part, arguably the most intuitive, makes the size of each adjustment proportional to the error. In other words, the farther the target is from the center of the frame, the faster the motors adjust; the closer the target is to the center, the slower they adjust. We controlled the “strength,” or speed, of the motors by changing the size of the increment each motor moves at each time step (each loop iteration). While proportional control is effective, using it alone caused some oscillation during tracking, so, again following the same resource outlined below, we added the derivative part to our system. The derivative part tempers the change in velocity so that the system overshoots less on each iteration, damping the oscillations more quickly. The derivative term is calculated as follows:
- Take the current error (the x-y distance from the center of the frame to the target) and subtract the previous error, measured at the last time step, from it.
- Divide the result by the time elapsed between the two captures; that is your derivative term.
As the math shows, the derivative term takes the rate of change of the error into account when determining how fast the motors should adjust. If the motors are closing in on the target quickly, the adjustment is reduced to eliminate the oscillations caused by proportional control. Once we added the derivative part, our system tracked the target quickly with minimal oscillation and error, so we did not need an integral part. The size of each motor adjustment is modeled by the following equation: control speed = Kp * proportional + Kd * derivative. The constants Kp and Kd are chosen by the user and depend on the system. How we chose our constants is explained in the testing section below.
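The proportional-derivative update described above can be sketched as a small class. This is an illustrative sketch, not our exact code: the class name is hypothetical, one controller instance per axis is an assumption, and the gains in the usage example are placeholders whose proper values depend on the units of the error and of the output.

```python
class PDController:
    """Proportional-derivative controller for one axis (pan or tilt)."""

    def __init__(self, kp, kd):
        self.kp = kp
        self.kd = kd
        self.prev_error = None  # no history before the first update
        self.prev_time = None

    def update(self, error, now):
        """Return Kp * error + Kd * d(error)/dt for the latest measurement.

        `error` is the offset from the frame center and `now` is the
        capture time in seconds.
        """
        derivative = 0.0
        if self.prev_error is not None:
            dt = now - self.prev_time
            if dt > 0:  # guard against dividing by zero on repeated timestamps
                derivative = (error - self.prev_error) / dt
        self.prev_error = error
        self.prev_time = now
        return self.kp * error + self.kd * derivative


if __name__ == "__main__":
    pan = PDController(kp=3.0, kd=0.005)  # placeholder gains
    print(pan.update(100, now=0.0))  # first call: proportional term only
    print(pan.update(60, now=0.5))   # shrinking error gives a negative derivative
```

Because the derivative is negative while the error shrinks, it subtracts from the proportional term and reduces the adjustment as the camera approaches the target, which is the damping effect described above.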
Multiprocessing Algorithm
The last part of our software design is the multiprocessing algorithm. Our object recognition algorithm takes about 120 ms to execute, which means it displays video at about 8 frames per second. Since the human eye perceives a noticeable delay in video below 15 frames per second, our system initially showed significant delays in both motor control and the on-screen display. To minimize the processing delay and display video with no perceivable lag, we had to use all four cores of the Raspberry Pi for multiprocessing. With the help of Professor Skovira, we came up with an algorithm that assigns one process as the master process and three other processes as worker processes. The master process performs the first three steps of the object recognition algorithm described above: it grabs a frame and checks whether the frame contains a blue object before sending it to a queue, from which one of the three worker processes performs the computation-intensive open/close operations and contour extraction. The three worker processes take turns grabbing frames from the queue and can operate in parallel, which effectively decreases the overall computation time. Once processing is done for a frame, the master process takes the contours sent back by the worker process, displays them on the screen, and uses them for PID servo control. Because the master process grabs and displays frames independently, the video stream from the camera shows no visible delay. Each of the three worker processes can process a frame in 150 ms, and since three frames are now processed at the same time, our average processing time becomes around 50 ms per frame. The video below shows an example of a frame going through the processing algorithm with one worker process.
The video below shows an example of a frame going through the processing algorithm with three worker processes. Compared with the video above, the blue contour is drawn more smoothly with more worker processes.
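The master/worker arrangement described in this section can be sketched with Python's multiprocessing module. This is a simplified illustration under several assumptions: frames are stand-in lists of mask values rather than images, heavy_stage is a hypothetical placeholder for the open/close and contour steps, and the "fork" start method is requested explicitly, which assumes a Linux system such as the one running on the Pi.

```python
import multiprocessing as mp


def heavy_stage(frame_id, mask):
    """Placeholder for the open/close and contour extraction steps.

    Here it only counts the set pixels so the example stays self-contained.
    """
    return frame_id, sum(1 for value in mask if value)


def worker(task_q, result_q):
    """Pull frames from the task queue until the None sentinel arrives."""
    for frame_id, mask in iter(task_q.get, None):
        result_q.put(heavy_stage(frame_id, mask))


def run_master(masks, n_workers=3, min_sum=1):
    """Distribute frames that pass the cheap pre-check to the workers."""
    ctx = mp.get_context("fork")  # assumption: Linux, as on the Raspberry Pi
    task_q, result_q = ctx.Queue(), ctx.Queue()
    workers = [ctx.Process(target=worker, args=(task_q, result_q))
               for _ in range(n_workers)]
    for w in workers:
        w.start()

    sent = 0
    for frame_id, mask in enumerate(masks):
        if sum(mask) >= min_sum:  # master-side check for "enough blue"
            task_q.put((frame_id, mask))
            sent += 1
    for _ in workers:
        task_q.put(None)  # one shutdown sentinel per worker

    # Collect results before joining so no worker blocks on a full queue.
    results = [result_q.get(timeout=10) for _ in range(sent)]
    for w in workers:
        w.join()
    return sorted(results)  # workers may finish out of order


if __name__ == "__main__":
    print(run_master([[0, 255, 255], [0, 0, 0], [255, 0, 0]]))
```

In a live system the master would keep grabbing and displaying frames while polling the result queue, instead of waiting for a fixed batch as this sketch does.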
Testing, Design Issues, and Discussion
Our turret was built through an incremental design and testing procedure. First, we worked on getting the pan-tilt kit and servo control up and running so the turret could move through the range of motion we wanted. Initially, we used the RPi.GPIO library to generate software PWM signals to control the servos. We quickly realized that software-generated PWM signals were noisy and unstable in the RPi’s multiprocess environment. The noisy signals did not control the servos stably, and the servos jittered a lot even when they were supposed to stand still. After consulting with Professor Skovira, we changed our approach and controlled the servos using hardware PWM with the pigpio library instead. To use hardware PWM, we first have to launch the pigpio daemon with the “sudo pigpiod” command before running our main code. More specifically, we entered the command “taskset 0x1 sudo pigpiod” so that the daemon was bound to the first core of the RPi. The hardware PWM signals (generated on pins 12 and 13) were much cleaner and significantly reduced the servo jitter. Once we resolved the jitter, we worked out the control signals and wrapped them in modular functions that move the turret to a desired angle.
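A modular servo helper of the kind mentioned above might look like the following sketch. The function names are hypothetical. It relies on pigpio's hardware_PWM call, which takes the duty cycle as an integer from 0 to 1,000,000, and it assumes that pins 12 and 13 refer to Broadcom GPIO numbers, which is the numbering pigpio uses.

```python
def duty_percent_to_pigpio(duty_percent):
    """Convert a duty cycle in percent to pigpio's 0..1_000_000 scale."""
    if not 0.0 <= duty_percent <= 100.0:
        raise ValueError("duty cycle must be between 0 and 100 percent")
    return int(round(duty_percent * 10_000))


def set_servo_duty(pi, gpio, duty_percent, freq_hz=50):
    """Drive one servo with hardware PWM.

    `pi` is a connected pigpio.pi() handle; the pigpio daemon must already
    be running (for example via "sudo pigpiod").
    """
    pi.hardware_PWM(gpio, freq_hz, duty_percent_to_pigpio(duty_percent))


# Example use on the Pi (not run here because it needs the daemon and hardware):
#   import pigpio
#   pi = pigpio.pi()
#   set_servo_duty(pi, 12, 7.5)   # assumed pan servo pin, datasheet center value
#   set_servo_duty(pi, 13, 7.5)   # assumed tilt servo pin
```

Passing the pigpio handle in as an argument keeps the helper easy to test without hardware, since any object with a hardware_PWM method can stand in for it.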
In parallel with the pan-tilt kit and servo control, we also worked on object detection with the OpenCV library. After installing OpenCV, we tried to run sample code we found online to test it, but OpenCV reported an error and the sample code would not run. The error read: “OpenCV Error: Assertion failed (scn == 3 || scn == 4) in cvtColor, file /build/opencv-U1UwfN/opencv-22.214.171.124+dfsq1/modules/imgproc/src/color.cpp, line 3737”. We initially thought the error occurred because our OpenCV version was old and that we might have to compile and update to the latest version. After a quick Google search, however, we found that the error can be fixed easily by entering “sudo modprobe bcm2835-v4l2” on the command line. This command loads the V4L2 (Video4Linux2) driver for the BCM2835, the Broadcom chip used in the RPi B+. We also added the module to /etc/modules so that it is loaded automatically at boot time. Once OpenCV was running, blue object detection went smoothly thanks to the strong community support, the abundant sample tutorials, and the intuitive OpenCV library, which abstracts many details into single function calls. We were able to extract the center of a blue object in about 10 lines of code. We tested our detection algorithm under different lighting conditions and found that it worked best when the room was not overly bright, since brightness caused the camera to refocus automatically. During testing, we discovered and fixed several software bugs, such as one that crashed the code by dividing by 0 and one that stopped object detection because of incorrect timing calculations.
Once the pan-tilt control and the object detection were working, we merged the two and used the object detection output as feedback to control the pan-tilt servos with our PID control algorithm. To tune the PID controller, we started by setting the integral and derivative gains to zero. We settled on a proportional gain that responded fast enough with a little overshoot, which we found to be 3. We then increased the derivative gain until the overshoot oscillation was minimized, which happened at around 0.005. We did not tune the integral term because the proportional and derivative terms were good enough and our target moved around over time.
Our last step was to add the laser and rubber band firing features to the turret. This step was trivial on the software side. We simply used two GPIO pins: one is toggled whenever the “v” key is pressed to switch the laser, and the other sends a servo signal that rotates the servo for a second and fires the rubber band. On the hardware side, we realized that the rubber band firing mechanism had some design flaws. For example, the chopstick we used to hold the rubber band was quite long. When the turret looked downward, the chopstick would touch the ground and lift the turret, so we added a wooden base to raise the turret and resolve this issue. We also realized in hindsight that we should have mounted the rubber band servo (the servo that controls the firing mechanism) on top of the tilt servo instead of the left-right rotation servo. As currently mounted, on top of the left-right rotation servo, it can interfere with the movement of the tilt servo when the tilt servo points too far upward. Lastly, we originally planned to load four rubber bands on the turret so that it could fire four times. We later realized this was not feasible with our current design because the pan-tilt servos could not generate enough torque to hold the pan-tilt kit in place. In fact, the turret could not even hold two rubber bands, since two rubber bands generate enough force to move the tilt servo. We could technically do it, but only at the expense of losing one degree of freedom in the up-down direction.
Our autonomous object tracking turret successfully achieved the three core objectives we set forth at the brainstorming stage of our final project:
- Our turret can successfully locate the largest blue object in view in real time with around 50 ms average processing latency (about 20 fps on average). This was achieved with the multiprocessing library and four processes (one master process and three worker processes). The master process grabs and displays video frames independently and also sends frames to the three worker processes over a queue so that they can perform the time-intensive open/close operations and contour extraction.
- Our turret can successfully extract the blue object’s position as the feedback input to our PID control algorithm, which then controls the pan-tilt servos to track the blue object and keep it aligned with the center of our camera. Our PID control algorithm was tuned to minimize tracking time and to track an object with minimal oscillation.
- Our turret can successfully be remotely controlled with a keyboard to emit a laser beam and to fire a rubber band at the target.
Despite the success of our project, there is plenty of future work that could further enhance our turret. Even after distributing the image processing with the multiprocessing library, we still experienced some latency, which led to slight delays in object tracking, because we could not guarantee that each core was dedicated to processing one frame. Further research and experimentation, either continuing with the multiprocessing library or using FIFOs, could make our program truly run in parallel, which would give a 3x speedup. Another issue we ran into was the lack of servo torque when the rubber band was attached. When the rubber band was loaded, its pull limited the turret’s downward tilt movement. An easy fix would be to use more powerful servos with feedback. A more robust solution would be to attach the firing motor to the tilt side of the pan-tilt kit as well. With this arrangement, moving the servos would put no extra stress on the motors or the rubber band, allowing the servos to move freely in both directions. Obviously, more advanced weaponry could be added to our project, but maybe we’ll leave that to Lockheed Martin. Lastly, the image processing could be improved. Despite our success in extracting a blue object from the frame, more sophisticated and robust object detection could further enhance our targeting abilities. Instead of simply shooting a blue object, we could develop facial recognition and tracking software to track down and fire at Professor Skovira.
Overall, our final project was a huge success. In five weeks, we were able to brainstorm, design, and build a miniature version of a surface-to-air missile system that autonomously locates a blue object in real time and takes it down with a rubber band. In working on this project, we not only applied many skills learned in previous labs and in lecture, but also gained numerous insights from the difficulties we faced along the way. From the lack of torque in the servos, to the need to distribute image processing tasks among the different cores, to implementing a PID controller, we learned invaluable lessons, including not letting the operating system decide what to do and how to do it, especially in real-time systems. We hope this project will serve as a good baseline for future groups looking to develop an autonomous computer vision system featuring complicated hardware. Both Xitang and Francis feel extremely accomplished and hope to have this project featured on Hackaday and/or other publications.