Contact Information:
University: Texas A&M University
Team Members: Jennifer Atkinson, Daniel Bogdanoff, Brian Civello, Dave Oyler, Bethany Smith
VISION
ECEN 404 – Senior Design Project
Texas A&M University, College Station, TX
I. INTRODUCTION
A. Project Objective
VISION provides a cost-effective alternative guidance system for the visually impaired. VISION does this by providing the user with information about their local environment that can be used to safely choose an unobstructed path to their destination. This allows blind users to navigate in uncertain environments without the use of sight.
B. Proposed System
The VISION system alerts a blind user of potential obstacles in their path. A stereo camera system, attached to a pair of glasses, sends images to a processing board. The images are then analyzed by software that detects nearby obstacles, such as benches, tables, and pedestrians. An array of vibration motors housed in a vest worn by the user vibrates in pulses depending upon the location and proximity of obstacles. The positioning of objects is indicated by the location of vibrations within the array. For example, an object on a user's left side will result in vibrations on the left side of the array. The vibrations also provide information about the objects' distances from the user. This is accomplished by varying each motor's vibration duty cycle so that the vibration pulse width increases as an object's distance decreases. For example, a close object will produce vibration with a higher duty cycle, while a distant object will produce a pulsing vibration with a lower duty cycle. This combination of alerts provides the user with enough information about the environment to choose an unobstructed path to their destination. Fig. 1 shows a high-level block diagram of the VISION system.
Fig. 1: System Block Diagram
II. FINAL SYSTEM DESCRIPTION AND PERFORMANCE
The final VISION system allows a blind user to detect obstacles in their path and select an alternative route. A stereo camera system, mounted onto a pair of glasses, takes images simultaneously. These images are sent to a backpack containing the National Instruments SB-RIO 9602 board, where the images are processed. The RIO board runs a program designed in LabView, which uses selected image processing algorithms to detect the location, distance, and elevation of obstacles in the user’s path. Vibration motors on a vest then alert the user of the location and distance of the detected obstacles. The entire system is powered by a battery pack, which is stored in the backpack. The VISION system can be seen in Fig. 2.
Fig. 2: Final System
A. Camera System:
The camera system consists of two LinkSprite cameras that transmit JPEG images over a serial UART connection. The cameras are housed in a box designed to keep each camera's angle and direction consistent with the other camera. The box is securely mounted onto a pair of glasses. Wires from the cameras connect to an interfacing PCB, which powers the cameras and relays the camera system's images to the RIO board. The LinkSprite cameras were selected because their small size allows them to be mounted onto glasses, and because the JPEG images they produce are already compressed into a relatively manageable file size. The cameras are able to send a new set of pictures to the NI RIO board every two seconds, and they have a 55-degree viewing angle. A two-second delay between images is undesirable for the VISION system, but it is a consequence of budget constraints. Many cameras are available that produce higher-resolution images with less delay, and they are reasonably priced for a commercial version of a product such as VISION; however, for this demonstration of the VISION concept, the available funds were not adequate for more expensive cameras. A sample set of images from the stereo camera system is shown in Fig. 3. Both images received from the camera system show the same landscape, but from slightly offset positions.
Fig. 3: Stereo Camera Images
B. UART Transmitter and Receiver:
The NI RIO board receives images using a receiver designed in LabView. To receive images from the cameras, the board communicates using UART protocol. In UART, each system has a data transmission line dedicated to transmitting signals, and a data transmission line dedicated to receiving signals. The transmitter of one system is then connected to the receiver of another. A start bit and stop bit indicate the beginning and end of a sequence, so that the systems do not have to use the same clock (as long as they are set to the same frequency), making the system asynchronous. Because the NI RIO board does not have a serial port, one has been emulated using the FPGA GPIO ports available on the board. This is implemented purely in LabView. The transmitter converts an integer representing an 8 bit sequence into binary data that is padded with start and stop bits. The data is then sent over the transmission line, one bit at a time, at a specific time interval matching the baud rate of the system. The system was designed with the baud rate as an adjustable parameter so that it can be used in any system at a user-specified baud rate. At the receiver, falling-edge signal detection and UART specified delays are used to read incoming data. The start and stop bits are removed from the bit stream, and the remaining data bytes are sent to a first-in first-out (FIFO) buffer that is read by the waiting processor when the processor is available. This accounts for speed differences between the FPGA and the processor, and it prevents the FPGA from overwriting data when the processor is delayed.
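As an illustration of the framing described above, the following Python sketch shows how a byte is padded with a start bit and a stop bit and shifted out one bit per baud-rate interval. It is only a conceptual model; the actual transmitter and receiver are LabView FPGA code, and the function names and default baud rate here are illustrative, not taken from the project.

# Illustrative sketch of UART framing; the real implementation is LabView FPGA code.

def frame_byte(value):
    """Pad an 8-bit value with a start bit (0) and stop bit (1), LSB first."""
    data_bits = [(value >> i) & 1 for i in range(8)]   # LSB-first data bits
    return [0] + data_bits + [1]                        # start bit + data + stop bit

def transmit(byte_stream, baud_rate=115200):
    """Yield (bit, duration) pairs; each bit is held for one bit period."""
    bit_period = 1.0 / baud_rate                        # seconds per bit
    for value in byte_stream:
        for bit in frame_byte(value):
            yield bit, bit_period

# Example: frame the byte 0x55 (alternating bits) for inspection.
print(frame_byte(0x55))   # [0, 1, 0, 1, 0, 1, 0, 1, 0, 1]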
C. Image Processing:
Once the images are received from the cameras, the RIO board begins to process the images. The operating principle of the VISION system is that an obstacle’s distance from a stereo camera system can be determined by simply finding the horizontal shift of an object between stereo images. This requires that the cameras have a fixed separation and common plane of view. In other words, if two images are taken at the same time by cameras that are slightly offset, the same object will be in a slightly different location in each image. A closer object will have a greater offset than an object in the background. The offset can be quantified by finding the number of pixels that the object has shifted, referred to as the pixel shift. The relationship between pixel shift and distance is given in Fig. 4.
Fig. 4: Pixel Shift vs Obstacle Distance
Fig. 4 shows that the pixel shift for a given distance becomes less pronounced as the object’s distance increases. The VISION system is designed to measure distances up to ten feet because beyond this point there is no discernible pixel shift with the resolution of the cameras that were chosen. Cameras with greater resolution could be used to provide greater range, but this would also come with a penalty of increased processing time.
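The shape of the curve in Fig. 4 follows the standard pinhole-stereo relation, in which distance is inversely proportional to pixel shift. The Python sketch below illustrates that relation; the baseline value is a hypothetical assumption, and the actual VISION calibration was determined empirically rather than from this formula.

import math

# Standard pinhole-stereo relation (illustrative only; not the empirical calibration
# used by VISION). The baseline value below is an assumed placeholder.
BASELINE_FT = 0.2            # assumed horizontal separation between the cameras (feet)
FOV_DEG = 55.0               # horizontal field of view of the cameras
H_RES = 320                  # horizontal resolution in pixels

# Focal length in pixel units derived from the field of view.
FOCAL_PX = (H_RES / 2) / math.tan(math.radians(FOV_DEG / 2))

def distance_from_shift(pixel_shift):
    """Approximate obstacle distance (feet) for a given horizontal pixel shift."""
    if pixel_shift <= 0:
        return float("inf")              # no measurable shift -> treat as out of range
    return FOCAL_PX * BASELINE_FT / pixel_shift

# The shift shrinks quickly with distance, which is why range is capped near 10 ft.
for shift in (40, 20, 10, 6):
    print(shift, round(distance_from_shift(shift), 1))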
1) Zone Splitting: In order to provide information about multiple objects and their locations, different parts of the image must be processed separately. First, each image is split into twelve zones. Fig. 5 shows how the system divides each image; a brief sketch of this split is given after the figure. Once the images are split into zones, each zone is run through a matching algorithm to determine its pixel shift. Multiple methods were tested for determining pixel shift, and the final design uses the method that best balances performance and speed.
Fig. 5: Zone Split Image
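The following Python sketch shows one way the zone split could be performed. A 4-by-3 grid of 80-by-80 zones is assumed here (consistent with the 6,399-addition figure quoted later for a single zone); the actual split is performed inside the LabView program, and the exact grid layout is an assumption.

import numpy as np

ZONE_COLS, ZONE_ROWS = 4, 3   # assumed 4x3 grid giving twelve zones

def split_into_zones(image):
    """Return a list of 12 sub-images from a 240x320 grayscale image."""
    h, w = image.shape
    zh, zw = h // ZONE_ROWS, w // ZONE_COLS
    return [image[r * zh:(r + 1) * zh, c * zw:(c + 1) * zw]
            for r in range(ZONE_ROWS) for c in range(ZONE_COLS)]

zones = split_into_zones(np.zeros((240, 320), dtype=np.uint8))
print(len(zones), zones[0].shape)   # 12 zones of 80x80 pixels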
2) Pixel Shift Calculation: The most basic method of determining pixel shift is to simplify each image so that it consists only of edges, and so that only the highest pixel intensity value, 255, and the lowest intensity value, 0, are allowed. These processes are known as edge detection and thresholding, respectively. The two images are then combined by adding each pixel to its counterpart in the other image. Because the addition saturates at the maximum intensity of 255, two overlapping high-intensity pixels merge into a single high-intensity pixel, so the total sum of the pixels in the combined image depends on the number of overlapping pixels. When the images are combined with the correct pixel shift, a large number of the pixels overlap and the sum of the pixels in the combined image drops drastically. This can be seen in Fig. 6.
Fig. 6: Edge Detected, Thresholded Images
While this method is conceptually simple, it is computationally inefficient, and it requires a large number of image additions. The VISION system utilizes cameras with a resolution of 320 pixels in the horizontal direction. This requires image additions for 320 possible pixel shifts in each of the 12 zones for a total of 3840 image additions. The image addition function in LabView is also a time-intensive process. When this method was implemented on the NI RIO board, it required 5.1 seconds to process one set of images.
The second processing method tested decreased the processing time by removing all computations based on image data and instead transforming each edge-detected, thresholded image into an array of binary values. Combining the two images then amounts to nothing more than combining two arrays with a logical OR function, which is much faster than image addition. This method also allows for increased access to individual pixels, which reduces the number of required calculations: in some cases zero padding is required on one of the two images, and a section of the array remains constant during shifting. These constant sections are removed, and only the changing part of the image is summed. Overall, this removes 50% of the array elements, a decrease of 12,800 elements per sum for each of the 320 possible shifts in each of the twelve zones, or 49,152,000 individual element additions in total. However, finding the sum of the elements in the combined array still requires 6,399 additions. Calculations must be made for 320 possible shifts in each of the 12 sub-image zones, so overall this method requires 24,572,160 individual sums. This method is faster than the first, but when it was implemented on the NI RIO board it still required 3 seconds to process one set of images.
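The Python sketch below illustrates the binary-array idea: each zone is edge-detected and thresholded to a 0/1 array, the two arrays are combined with a logical OR for each candidate shift, and the shift with the smallest element sum (greatest overlap) is taken as the pixel shift. The edge detector, threshold value, and function names are illustrative assumptions; the real implementation uses LabView image-processing functions.

import numpy as np

def edge_threshold(gray):
    """Crude horizontal-gradient edge detector followed by thresholding to {0, 1}."""
    grad = np.abs(np.diff(gray.astype(np.int16), axis=1))
    return (grad > 40).astype(np.uint8)          # 1 where an edge is present

def best_pixel_shift(left_zone, right_zone, max_shift=320):
    """Find the horizontal shift that maximizes edge overlap (minimum OR count)."""
    left = edge_threshold(left_zone)
    right = edge_threshold(right_zone)
    best_shift, best_count = 0, np.inf
    for shift in range(min(max_shift, left.shape[1])):
        # Compare only the columns that still overlap after shifting,
        # mirroring the removal of the constant (zero-padded) sections.
        overlap = np.bitwise_or(left[:, shift:], right[:, :right.shape[1] - shift])
        count = int(overlap.sum())
        if count < best_count:
            best_shift, best_count = shift, count
    return best_shift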
3) Pixel Shift: Pattern Matching: LabView contains pattern matching functionality in its image processing toolbox. It was designed for use by factories to ensure that their manufactured items meet tolerance specifications. Pattern matching requires a reference image, usually of an ideal product, and compares the images of fabricated items to the image of the ideal product. The VISION system modifies and implements pattern matching in a novel way: to detect the pixel shift between two stereo images.
LabView’s pattern matching function finds edges, corners, and other points of discontinuity in a reference image, and searches a second image for a matching pattern. In VISION, each zone in an image is used as a template, and the pattern matching function locates possible matches in the other image. Once an object’s location is known in both images, its pixel shift can be computed with a simple subtraction of coordinate values.
Fig. 7 shows the output of the pattern matching function. The image on the right is from the camera on the right side, and the zone highlighted by the rectangle was used as a template. The image on the left is from the left camera, and the three rectangles were returned as possible matches in the right image.
Fig. 7: Pattern Matching Image Zone 5
The pattern matching function assigns a score (0-1,000) to each possible match based upon how well the detected pattern matches the reference. In order to identify the best choice out of the selection of possible matches, the VISION system takes into account both the score of the possible match and its vertical offset. Since VISION's stereo cameras are only offset horizontally, there should be very little change in the vertical location of an object between the two images. The formula used for selecting the optimal match is given in (1), where m is the match score, yd is the detected vertical position, ye is the expected vertical position, ay is a normalizing factor for the vertical position, am is a normalizing factor for the match score, ky is the weight given to the vertical position parameter, and km is the weight given to the match score parameter. The difference between the detected and expected vertical positions is subtracted from 240, the total vertical resolution of the cameras, so that smaller offsets produce larger scores. The candidate with the highest overall score is chosen as the best match. Through testing, it was determined that the optimal weighting parameters were km = 18 and ky = 11.
Overall Score = [(240 - |yd - ye|) * ay] * ky + (m * am) * km    (1)
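The following Python sketch applies (1) to a list of candidate matches. The weights km = 18 and ky = 11 are the values found in testing; the normalizing factors ay and am are assumed here to map both terms onto a 0-1 range, since their exact values are not stated.

K_M, K_Y = 18, 11        # weights determined through testing
A_Y = 1.0 / 240.0        # assumed: normalizes the vertical term (0-240) to 0-1
A_M = 1.0 / 1000.0       # assumed: normalizes the match score (0-1,000) to 0-1

def overall_score(match_score, y_detected, y_expected):
    """Overall score from (1): a high match score and a small vertical offset both help."""
    vertical_term = (240 - abs(y_detected - y_expected)) * A_Y
    return vertical_term * K_Y + (match_score * A_M) * K_M

def best_match(candidates, y_expected):
    """candidates: list of (match_score, y_detected) pairs; returns the best candidate."""
    return max(candidates, key=lambda c: overall_score(c[0], c[1], y_expected))

# A strong match with a large vertical offset loses to a slightly weaker match
# found at the expected height (compare with the false match in Fig. 8).
print(best_match([(950, 15), (870, 122)], y_expected=120))   # -> (870, 122)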
Fig. 8 shows the output of the pattern matching function for a zone where one of the possible matches was returned with a large vertical offset. As shown by the line pointing from the top corner, VISION’s algorithm ignored this false match and chose the correct match.
Fig. 8: Pattern Matching Image Zone 7
Since pattern matching uses only the most important points within the reference image, it is the fastest of the three methods discussed. When it was implemented on the NI RIO board, it required only 2.6 seconds to process one set of images. However, this latency is still undesirable in the dynamic environments encountered by VISION users. The latency problem could be solved by utilizing a multi-core processor or graphics processing unit that would allow the image processing algorithm to be threaded. This concept was tested by networking the NI RIO board and VISION system with a dual-core laptop computer, which was able to process all twelve zones in only 0.5 seconds. This is fast enough to allow a user to move at normal walking speed. While processing is the bottleneck from a system-level point of view, in the budget-constrained implementation the camera latency was actually the limiting factor.
4) Distance and Duty Cycle: Once processing has been completed, the distance output for each zone is sent to the FPGA, which converts the distance into a vibration duty cycle. The distance itself is derived from the obstacle's pixel shift, and the duty cycle is computed using (2).
Duty Cycle = (10 - Distance) * 10    (2)
Formula (2) assumes that all distances are rounded down to the nearest foot; for example, an object at a distance of three feet, nine inches is treated by the algorithm as being at three feet. This was done for both safety and simplicity.
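A small Python sketch of the mapping in (2), including the round-down behavior, is given below. The clamping to the 0-10 ft working range is an assumption added for completeness; the function name is illustrative.

import math

MAX_RANGE_FT = 10

def duty_cycle_percent(distance_ft):
    """Vibration duty cycle (%) for an obstacle at distance_ft feet, per (2)."""
    d = math.floor(distance_ft)             # round down to the nearest foot
    d = min(max(d, 0), MAX_RANGE_FT)        # assumed clamp to the 0-10 ft working range
    return (MAX_RANGE_FT - d) * 10

# An obstacle at 3 ft 9 in is treated as 3 ft away -> 70% duty cycle.
print(duty_cycle_percent(3.75))   # 70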
D. Motor Switching Circuit
The vibration drive signal generated on the NI RIO is sent out of a GPIO pin. The output pins can only supply 3 mA at 3.3 V, but the vibration motors require 80 mA. To supply sufficient power, each GPIO signal instead drives a switching circuit that powers the motors from the NI RIO's 5 V rail. The circuit schematic can be seen in Fig. 9.
Fig. 9: Motor Switching Schematic
LEDs visually indicate which motors are vibrating for demonstration purposes. These LEDs could also be used in a training version of the VISION system for use with a therapist, but they would not be included in a commercial version of the system. The power provided to the motors is passed through a simple LC filter to smooth fluctuations from the motors. The vibration motors are rated for operation at 3.3 V, so pulse-width modulation at 30 kHz with a 66% duty cycle is applied to reduce the average voltage (0.66 x 5 V ≈ 3.3 V) without using a resistive voltage divider.
E. PCB
A printed circuit board was created to hold the motor switching circuitry and interfaces for the NI RIO board, vibration motors, LEDs, and cameras. This circuit board can be seen in Fig. 10.
Fig. 10: PCB
F. Electronics Enclosure
An enclosure was created to house the NI RIO, interfacing PCB, battery monitoring circuit, low-power indicator circuit, and cooling fan. The enclosure also contains an external power connector, an on/off switch, and panel-mounted connectors that allow for easier connection of peripherals.
G. Vest Design
The final design of the vest was chosen to be highly customizable. The vest has four rows where the wires and motors can fit, and the user selects three of these four based on his or her own height, weight, sensitivity, and general comfort. Each row has a webbing strap that goes around the back and may be adjusted in the same way as a backpack strap. The number and placement of the straps were determined experimentally: originally there were only three straps, placed evenly along the length of the vest, but it was later determined that attaching a strap to the end of each row was the most effective way to ensure maximum sensitivity. The straps were initially going to be tied around the user, but parachute clips were chosen instead to make the vest easier to take on and off. Lastly, flaps were attached to the vest to effectively hold the wires in place. The final vest can be seen in Fig. 11.
Fig. 11: Vest
H. Power
Because the cameras and motors are powered from the NI RIO board, their power draw must be scaled by the supply factors listed in Table 1 [1]. These factors can be found in the NI RIO board manual. The board itself uses up to 6 W internally, so the total power consumption is approximately 14.7 W.
Item    | Voltage | Current | Number Used | NI RIO Factor | Total Power
Cameras | 5 V     | 90 mA   | 2           | 1.11          | 1 W
Motors  | 5 V     | 80 mA   | 12          | 1.11          | 5.3 W
GPIO    | 3.3 V   | 3 mA    | 20          | 1.18          | 0.24 W
Fan     | 12 V    | 0.18 A  | 1           | N/A           | 2.16 W
Table 1: Power Consumption of Components
The system is powered by a 25.9 V battery pack rated for 2600 mAh. The interface hardware for the motors is supplied by a 5 V rail of the RIO board, which also supplies power to the cameras. If all of the motors are running at their highest intensity, the battery will last for approximately 4.5 hours. If half of the motors are running at full intensity, or all of the motors are running at half intensity, the system can last up to approximately 5.5 hours.
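As a back-of-the-envelope check of these figures, the short Python calculation below estimates runtime from the pack energy and the full-load power in Table 1; it ignores converter losses and battery derating, so it is only a rough consistency check.

PACK_VOLTAGE = 25.9      # V
PACK_CAPACITY = 2.6      # Ah
FULL_LOAD_POWER = 14.7   # W, Table 1 loads plus the board's internal 6 W

pack_energy_wh = PACK_VOLTAGE * PACK_CAPACITY          # ~67.3 Wh
runtime_hours = pack_energy_wh / FULL_LOAD_POWER       # ~4.6 h at full intensity
print(round(pack_energy_wh, 1), round(runtime_hours, 1))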
The battery includes a monitor with 5 LEDs that indicate the voltage level of the battery. An optical battery indicator is of no use for a blind user, so an audible low-battery alert has been constructed for the VISION system. A photoresistor is attached over one of the LEDs, and is connected to a circuit that alerts the RIO Board of a low battery. The board then sends a 700Hz, 3.3V signal through a speaker, alerting the user that the battery has approximately 20% left and needs to be recharged.
To trigger the low-battery alert, the voltage across the photoresistor is checked by the FPGA. This voltage must stay between 0 V and 3.3 V, while the circuit is powered from 5 V. When the LED is off, the photoresistor's resistance is typically over 50 kΩ but can reach well over 100 kΩ; when the LED is on, it drops to around 1.4 kΩ. By placing the photoresistor in parallel with a 32 kΩ resistance and in series with an 18 kΩ resistance, the proper voltages are achieved. When the light is off, the 32 kΩ resistance dominates the parallel combination, which raises the sensed voltage enough to trigger the low-battery alert while keeping it at an acceptable, non-damaging level. When the light is on, the photoresistor's resistance dominates the 32 kΩ resistance, and the voltage stays low. The board powers and checks this circuit every five minutes in order to save power. If the battery is detected to be low, the alert may be dismissed by pressing a button attached to the shoulder of the backpack; the button is connected to one of the NI RIO board's GPIOs. The alarm will be re-triggered every five minutes until the battery is charged to an acceptable level or the system is turned off.
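The ideal-resistor calculation below checks the divider values quoted above; the photoresistor values used are the typical figures from the text, and component tolerances are ignored.

V_SUPPLY = 5.0
R_SERIES = 18e3          # fixed series resistance
R_PARALLEL = 32e3        # fixed resistance in parallel with the photoresistor

def sense_voltage(r_photo):
    """Voltage seen by the FPGA across the photoresistor / 32 kOhm pair."""
    r_pair = (r_photo * R_PARALLEL) / (r_photo + R_PARALLEL)
    return V_SUPPLY * r_pair / (r_pair + R_SERIES)

print(round(sense_voltage(1.4e3), 2))    # LED on:  ~0.35 V -> battery OK
print(round(sense_voltage(50e3), 2))     # LED off: ~2.60 V -> low-battery alert
print(round(sense_voltage(1e6), 2))      # LED off, very dark: still below 3.3 V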
III. CONCLUSION
There are no major differences in the overall system between the original concept and the final product. However, the methodology used to process images changed significantly: instead of relying mainly on edge detection, which required more processing time and was less effective at identifying objects, a pattern matching scheme was implemented. This scheme yields much better results and takes less processing time.
The system has the ability to take pictures with two cameras and receive the images through the FPGA on the NI RIO board. This was accomplished by emulating a UART connection in the LabView software. The processor on the board is able to implement a LabView image processing program, developed by our team, which consistently detects objects and determines their distance from the user. The entire system is wearable and can be powered by a battery for several hours without charging. The final system accomplishes the original goals of the project.
GLOSSARY
GPIO - General Purpose Input/Output - pins on the NI Single-Board RIO that can be used to provide inputs to the processor/FPGA or to send outputs from the processor/FPGA
Parallax - a difference in the apparent location of an object when viewed from slightly different locations. This allows the distance to the object to be calculated.
NI RIO – National Instruments reconfigurable input/output board
Sensory Substitution- the use of one sensory system to input information into the parts of the brain that would otherwise process information from a different sensory system. For example, when using our product a person’s visual cortex will utilize tactile information from vibration motors to allow a user to “see” through their sense of touch.
Stereo Cameras - Two cameras with a known separation that can be utilized for parallax calculation; synonymous with binocular cameras.
VISION - Vertically Integrated Sight Independent Orientation and Navigation - our product which will serve as an alternative to the white cane and seeing-eye dog to aid visually impaired people as they walk
UART - Universal Asynchronous Receiver/Transmitter. A hardware device that is used for serial communications with computer peripherals
Acknowledgements
Special thanks to Eric Dean of National Instruments, who is responsible for the donation of the NI RIO board.
Works Cited
[1] User Guide: NI sbRIO-9601/9602 and NI sbRIO-9602XT Single-Board RIO OEM Devices, National Instruments, Austin, TX, 2010, p. 19. [Online]. Available: http://www.ni.com/pdf/manuals/374991c.pdf