jamosp2
Visitor
Registered: ‎08-10-2018

Stereo Vision in Vitis Semi-Global Block Matching SGBM

I'm getting some poor quality outputs from the Vitis Vision Library's SGBM.

I am running on a custom board based on a Trenz TE0808 SOM, which is an UltraScale+ ZU9EG device. I'm using Ubuntu 18.04, Vitis 2019.2.

I have built the sgbm example from the Vitis Vision Library i.e.:

~/.Xilinx/Vitis/2019.2/vitis_libraries/vision/L1/examples/sgbm

Using the provided example "left.png" and "right.png" dataset, the output seems reasonable:

jamosp2_0-1610039673477.png

When I use my own images, captured with my own ZED Stereo Camera (the same camera Xilinx used to capture the example images), the results look unusable, particularly at the higher resolution options, i.e. 1080p or 2K.

I took a sample image using my ZED Camera, using the camera's tools to save a rectified image at 720p, 1080p and 2K resolutions. I also exported the depth map calculated on the laptop GPU by the ZED software, which you can see as ZED Truth below. The result for 720p seems reasonable, but then as the resolution is increased, there is a lot of noise in the foreground and background of the output image. This is still using the default settings of the demo i.e. NUM_DISPARITIES = 64, p1 = 20 and p2 = 40.
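One factor worth checking here: for a fixed scene and camera, disparity measured in pixels grows in proportion to image width, so a search range that covers the scene at 720p may be too short at 1080p or 2K. The sketch below scales the default of 64 by the width ratio. The per-eye widths in `ZED_WIDTHS` and the rounding to a multiple of 16 are assumptions made for illustration, not values taken from the Vitis kernel; check the kernel's own constraints on NUM_DISPARITIES before re-synthesising.

```python
import math

# Assumed per-eye image widths for the ZED capture modes mentioned above.
ZED_WIDTHS = {"720p": 1280, "1080p": 1920, "2K": 2208}


def scaled_num_disparities(base_disparities, base_width, new_width, multiple=16):
    """Scale a disparity search range with image width.

    The result is rounded up to the next multiple of `multiple`
    (16 is an assumed granularity, not a documented kernel requirement).
    """
    raw = base_disparities * new_width / base_width
    return int(math.ceil(raw / multiple) * multiple)


if __name__ == "__main__":
    for mode, width in ZED_WIDTHS.items():
        # 64 is the demo default that gave a reasonable result at 720p.
        print(mode, scaled_num_disparities(64, ZED_WIDTHS["720p"], width))
```

Under these assumptions the 720p range of 64 corresponds to roughly 96 at 1080p and 112 at 2K, which would be consistent with the default working at 720p only.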

jamosp2_1-1610041363535.png

I notice that the example images provided in the Vitis example folder have a pronounced rectification effect, with the image warped at the top, left and bottom boundaries. I do not get anything like this level of image warping when I either extract rectified images from the ZED software, or use the ZED example Python OpenCV to grab images from my camera and save rectified images. Is there something different that needs to be done for the hardware kernel that I am not aware of? But if that is the case, I don't know why the 720p example that I captured works well.

I also tried one of the Middlebury datasets and noticed that as the baseline increases, the result eventually becomes overwhelmed with noise. The Middlebury "Art" dataset provides 5 views, spaced progressively further apart. Note that I did increase the number of disparities and re-synthesise before I ran this dataset.
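The baseline effect follows from the standard rectified-stereo relation d = f * B / Z, where f is the focal length in pixels, B the baseline and Z the depth: doubling the baseline doubles the disparity of every point. A minimal sketch for checking whether the nearest object still fits inside the search range is shown here. The focal length, depth and baselines are hypothetical illustration values, not the Middlebury calibration.

```python
def disparity_px(focal_px, baseline_m, depth_m):
    """Disparity in pixels for a point at depth_m (rectified pinhole stereo)."""
    return focal_px * baseline_m / depth_m


def exceeds_range(focal_px, baseline_m, nearest_depth_m, num_disparities):
    """True if the nearest object needs a disparity outside [0, num_disparities)."""
    return disparity_px(focal_px, baseline_m, nearest_depth_m) >= num_disparities


if __name__ == "__main__":
    FOCAL_PX = 700.0        # hypothetical focal length in pixels
    NEAREST_M = 1.0         # hypothetical nearest object distance in metres
    NUM_DISPARITIES = 64    # search range under test
    for baseline_m in (0.04, 0.08, 0.12, 0.16, 0.20):  # hypothetical baselines
        d = disparity_px(FOCAL_PX, baseline_m, NEAREST_M)
        flag = exceeds_range(FOCAL_PX, baseline_m, NEAREST_M, NUM_DISPARITIES)
        print(f"baseline={baseline_m:.2f} m  disparity={d:.1f} px  out_of_range={flag}")
```

If the true disparity of foreground objects falls outside the search range, the matcher cannot find the correct match for those pixels, which is one plausible source of noise at wider baselines.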

Art_Comparison_of_View_Results.png

I've been going round in circles on this for a long time. Has anyone had any experience with this kernel?

0 Kudos
1 Reply
akashsun
Xilinx Employee
Registered: ‎07-31-2018

Hi jamosp2, since you are using higher resolution images, you should increase the NUM_DISPARITIES value accordingly. Have you tried doubling it, to say 128, for the 1080p image?

As far as rectification goes, all we require is that the left and right images satisfy the epipolar constraint. The warping of the images is an effect of the algorithm used to perform the rectification.

The noise is something we can overcome by filtering the output disparity values, for example with a median blur (less efficient but better performing) or a speckle filter (highly efficient in terms of refinement of the image, but slower in terms of performance). That being said, view 5 of the Middlebury dataset looks fishy and might need some debugging to find the reason for the extra noise. A heuristic approach to choosing the parameters p1 and p2 based on the environment is another option that might give you a better result.
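As a rough illustration of the median filtering suggested above, here is a minimal pure-Python 3x3 median filter applied to a small disparity map stored as a list of rows. It is a sketch of the idea only, not the Vitis Vision implementation. The clamped border handling is an assumption, and on real images a library routine such as OpenCV's `cv2.medianBlur` would be the practical choice.

```python
from statistics import median


def median_filter_3x3(disp):
    """Return a new disparity map with a 3x3 median applied to every pixel.

    `disp` is a list of equal-length rows. Coordinates outside the image
    are clamped to the nearest edge pixel (an assumed border policy).
    """
    h = len(disp)
    w = len(disp[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [
                disp[min(max(y + dy, 0), h - 1)][min(max(x + dx, 0), w - 1)]
                for dy in (-1, 0, 1)
                for dx in (-1, 0, 1)
            ]
            out[y][x] = median(window)
    return out


if __name__ == "__main__":
    # A flat surface at disparity 10 with one isolated outlier.
    noisy = [
        [10, 10, 10, 10],
        [10, 63, 10, 10],
        [10, 10, 10, 10],
    ]
    for row in median_filter_3x3(noisy):
        print(row)
```

An isolated outlier is removed because it can never be the middle value of its 3x3 neighbourhood. Larger connected patches of bad disparities survive a small median window, which is where a speckle filter becomes more useful.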
