In this paper, we propose the octave deep plane-sweeping network (OctDPSNet), a novel learning-based plane-sweeping stereo method that drastically reduces the required GPU memory and computation time while achieving state-of-the-art depth estimation accuracy. Inspired by octave convolution, we divide image features into high and low spatial frequency features and generate a cost volume from each using our proposed plane-sweeping module. To reduce spatial redundancy, the resolution of the cost volume built from the low spatial frequency features is set to half that of the volume built from the high spatial frequency features, which reduces memory consumption and computational cost. After refinement, the two cost volumes are integrated into a final cost volume through our proposed pixel-wise “squeeze-and-excitation” based attention mechanism, and the depth maps are estimated from the final cost volume. We evaluate the proposed model on five datasets: SUN3D, RGB-D SLAM, MVS, Scenes11, and ETH3D. Our model outperforms previous methods on all five datasets while drastically reducing memory consumption and computational cost.
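
To make the memory argument concrete, the following minimal sketch counts the elements of a single dense cost volume and of a high/low-frequency pair in which the low-frequency volume has half the spatial resolution. It is an illustration under stated assumptions, not the authors' implementation: the channel split ratio `alpha`, the choice to halve only height and width (leaving the number of depth planes unchanged), the function names, and the example sizes are all hypothetical.

```python
def cost_volume_elements(channels, depth_planes, height, width):
    """Number of elements in a dense C x D x H x W cost volume."""
    return channels * depth_planes * height * width


def octave_split_elements(channels, depth_planes, height, width, alpha):
    """Element count for a high/low-frequency pair of cost volumes.

    Assumptions (not taken from the paper): a fraction `alpha` of the
    channels goes to the low-frequency branch, and only the spatial
    dimensions (height, width) of that branch are halved.
    """
    low_channels = int(alpha * channels)
    high_channels = channels - low_channels
    high = cost_volume_elements(high_channels, depth_planes, height, width)
    low = cost_volume_elements(low_channels, depth_planes, height // 2, width // 2)
    return high + low


# Hypothetical sizes chosen only for illustration.
C, D, H, W = 64, 64, 64, 80
full = cost_volume_elements(C, D, H, W)
split = octave_split_elements(C, D, H, W, alpha=0.5)
ratio = split / full
print(f"single volume: {full} elements")
print(f"octave split:  {split} elements ({ratio:.3f} of the single volume)")
```

Under these assumptions, halving the height and width of the low-frequency volume cuts its per-channel size to one quarter, so the saving grows with the share of channels assigned to the low-frequency branch.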

Published in: IEEE Access (Volume: 7)

Page(s): 150306 – 150317

Date of Publication: 14 October 2019 

Electronic ISSN: 2169-3536

INSPEC Accession Number: 19087399

DOI: 10.1109/ACCESS.2019.2947195

Publisher: IEEE

Source code: https://github.com/matsuren/octDPSNet