There has been some interest in the new stereo vision library included in the LabVIEW Vision Development Module 2012.
I was (and still am) interested in testing the performance of the library, as are several other people on the NI forums who are having trouble obtaining a proper depth image from a stereo vision system.
In this post I will present the results I have obtained so far and also attach a small VI that can be used to calculate the parameters of a stereo setup. Based on these theoretical calculations it is possible to obtain the disparity for a certain depth range and also calculate the depth accuracy and field of view.
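For reference, the kind of calculation the VI performs can be sketched with the standard pinhole-stereo formulas. This is a generic sketch, not the attached VI; the function names and the default one-pixel disparity error are my own illustrative choices.

```python
import math

# Pinhole-stereo parameter sketch (all lengths in mm, pixel pitch p
# in mm/px). The basic relations are:
#   disparity d = f * B / (Z * p)        [px]
#   depth error dZ ~ Z^2 * p * ed / (f * B)
# with focal length f, baseline B, depth Z and disparity error ed.

def disparity_px(depth_mm, focal_mm, baseline_mm, pixel_mm):
    """Expected disparity (px) of a point at the given depth."""
    return focal_mm * baseline_mm / (depth_mm * pixel_mm)

def depth_error_mm(depth_mm, focal_mm, baseline_mm, pixel_mm,
                   disp_err_px=1.0):
    """Depth uncertainty (mm) for a given disparity matching error."""
    return depth_mm ** 2 * pixel_mm * disp_err_px / (focal_mm * baseline_mm)

def horizontal_fov_deg(sensor_width_mm, focal_mm):
    """Horizontal field of view (degrees) of a single camera."""
    return 2.0 * math.degrees(math.atan(sensor_width_mm / (2.0 * focal_mm)))
```

For example, with a 35 mm lens, a 100 mm baseline and 5 µm pixels, a point at 1 m gives a disparity of 700 px, and a one-pixel matching error corresponds to roughly 1.4 mm of depth uncertainty. The quadratic growth of `depth_error_mm` with depth is why accuracy degrades so quickly for distant objects.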
First off, I tried to test the stereo library with the setup shown in Figure 1.
Figure 1. Setup for the initial test of the stereo library.
Two DSLR cameras were used, with the focal lengths of both lenses set to 35 mm. The cameras were different (different sensors, different lenses), and the setup was really bad (the cameras were attached to the table with tape). Also note that NI Vision prefers a horizontal baseline, i.e. a system where the horizontal distance between the two cameras exceeds the vertical distance (NI Vision Manual). Ideally, the vertical distance between the sensors should be zero and the horizontal distance equal to the baseline. I selected the disparity range according to the parameter calculation (see the attachment). I was not expecting great results, and sure enough, I didn't get them.
So, I changed my stereo setup. I bought two USB webcameras and rigidly mounted them as shown in Figure 2. The system was further mounted on a stand.
Figure 2. Stereo setup with two webcameras with baseline distance of 100 mm.
I followed the same procedure, calibrating each camera for lens distortion and the stereo pair for the translation vector and rotation matrix, and then performed some measurements. The resulting reconstructed 3D shape is shown in Figure 3 (with texture overlaid).
Figure 3. 3D reconstruction with overlaid texture.
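For readers outside LabVIEW, the reconstruction step that produces a point cloud like the one in Figure 3 can be sketched in a few lines. This is the generic rectified-stereo back-projection, not the NI Vision implementation; the parameter values in the test below are hypothetical.

```python
import numpy as np

# After calibration and rectification, both cameras share a focal
# length f (in pixels), a principal point (cx, cy) and a purely
# horizontal baseline B (mm), so a rectified pixel (u, v) with
# disparity d (px) back-projects to camera coordinates:
#   Z = f * B / d
#   X = (u - cx) * Z / f
#   Y = (v - cy) * Z / f

def reproject_point(u, v, d, f_px, cx, cy, baseline_mm):
    """Back-project one rectified pixel + disparity to a 3D point (mm)."""
    z = f_px * baseline_mm / d
    x = (u - cx) * z / f_px
    y = (v - cy) * z / f_px
    return np.array([x, y, z])
```

Applying this to every valid pixel of the disparity image yields the textured point cloud; pixels where matching failed (the white wall) simply have no valid disparity and drop out of the cloud.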
The background had no texture (a white wall), so I expected that the depth image of the background would not be calculated properly, but the foreground (the book) had enough texture. To be honest, I was disappointed, as I really expected better results. To enhance the depth resolution, the object needs to be closer or the focal length needs to be increased.
But before I tried anything else, I took the example that came with the stereo library and calculated the 3D point cloud (similar to Figure 3). The results are shown in Figure 4.
Figure 4. The 3D reconstruction with texture for the supplied LV stereo example.
To be honest again, I was expecting better results. I am confident that this setup was made by experts with professional equipment, not some USB webcameras. Considering this, I am asking myself the following question: is it even reasonable to expect good results from webcameras? Looking at the paper titled "Projected Texture Stereo" (see the attached .pdf file), it would also be interesting to try a randomly projected pattern. This should help the algorithm match features between the two images and consequently make the depth image better and more accurate.
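As a starting point for anyone who wants to try the projected-pattern idea, here is a small sketch that generates a random dot pattern to project onto the scene. The resolution and dot density are arbitrary assumptions of mine, not values from the paper; the point is simply that the speckle gives the block matcher unique neighbourhoods to correlate, even on a white wall.

```python
import numpy as np

# Hypothetical random dot pattern for projection: 800x600 image,
# roughly 5 % of the pixels lit. Seeded so the pattern is repeatable
# between calibration and measurement runs.
rng = np.random.default_rng(seed=0)
pattern = np.where(rng.random((600, 800)) < 0.05, 255, 0).astype(np.uint8)
```

The array can be saved as an image and shown full-screen through any projector; denser or coarser dots may work better depending on the camera resolution and the block-matching window size.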
I really hope that someone tries this before I do (because I do not know when I will get the time to do this) and shares the results.
If anyone has something to add, comment or anything else, please do so. Maybe we can all learn something more about the stereo vision concept and measurements.
Be creative.