In self-supervised monocular depth estimation, the benchmark network is usually selected by comparing previously published networks and picking the best performer among them. When the available resources change and the network needs to be scaled, it is difficult to adjust it quickly unless the selected network belongs to a series of networks of different sizes. In this paper, we investigate whether the networks generated by a Neural Architecture Search method based on search parameter scaling are robust in self-supervised monocular depth estimation, and which of the pose estimation network and the depth estimation network yields the greater improvement in depth estimation accuracy. The final experiments show that the generated series performs well on the KITTI dataset, with the best-performing network, EfficientNet-B3, outperforming all previous self-supervised networks.
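As a rough illustration of what a scalable network series means, the sketch below applies EfficientNet-style compound scaling, in which depth, width, and input resolution grow together under a single coefficient. It assumes the series in the abstract follows the EfficientNet family; the function name `compound_scale`, the base values, and the use of an integer `phi` are illustrative assumptions, and the published B1 to B7 variants use rounded, hand-set multipliers rather than the exact values computed here.

```python
# Base multipliers reported for EfficientNet compound scaling:
# depth grows by ALPHA, width by BETA, resolution by GAMMA per unit of phi.
ALPHA, BETA, GAMMA = 1.2, 1.1, 1.15


def compound_scale(phi, base_depth=1.0, base_width=1.0, base_resolution=224):
    """Return (depth multiplier, width multiplier, input resolution) for phi.

    `phi` is a hypothetical compound coefficient; phi = 0 gives the base network.
    """
    depth = base_depth * ALPHA ** phi
    width = base_width * BETA ** phi
    resolution = int(round(base_resolution * GAMMA ** phi))
    return depth, width, resolution


if __name__ == "__main__":
    # Print the scaled settings for a few illustrative coefficients.
    for phi in range(4):
        d, w, r = compound_scale(phi)
        print(f"phi={phi}: depth x{d:.2f}, width x{w:.2f}, resolution {r}")
```

Because one coefficient controls all three dimensions, a change in compute budget maps to a change in `phi` rather than to a new manual network search.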
Multi-View Stereo (MVS) is a technique that reconstructs the three-dimensional structure of objects or scenes from 2D images and camera parameters. PatchMatch-based MVS methods are widely used today. However, these algorithms have two weaknesses. First, cross-correlation-based similarity measures such as NCC become ineffective on regions that are texture-less (e.g., a white wall) or have stochastic textures (e.g., grass). Second, because of noise, occlusion, and similar factors, the reconstruction may become stuck in an optimum that drifts heavily from the ground truth. To tackle these issues, we present an MVS pipeline called Adaptive Pixelwise Inference Multi-View Stereo (API-MVS). First, we propose a strategy that adaptively infers the matching window size of each pixel, giving the reconstructed results on texture-less or stochastic-texture regions a better trade-off between accuracy and completeness. Second, we propose a new cost function that integrates the matching cost values computed from different neighboring images; experiments confirm that this cost function brings locally optimized results closer to the ground truth. We tested our algorithm on the ETH3D benchmarks. The results show the effectiveness of our method, which is comparable to state-of-the-art methods.
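To make the first weakness concrete, the following minimal sketch computes normalized cross-correlation (NCC) between two patches given as flat lists of intensities. It is a generic textbook NCC, not the API-MVS implementation; the function name `ncc`, the `eps` threshold, and the convention of returning 0.0 for constant patches are assumptions made for illustration.

```python
import math


def ncc(patch_a, patch_b, eps=1e-8):
    """Normalized cross-correlation of two equally sized patches (flat lists)."""
    if len(patch_a) != len(patch_b) or not patch_a:
        raise ValueError("patches must be non-empty and of equal size")
    n = len(patch_a)
    mean_a = sum(patch_a) / n
    mean_b = sum(patch_b) / n
    # Subtract the mean so the score is invariant to brightness offsets.
    dev_a = [a - mean_a for a in patch_a]
    dev_b = [b - mean_b for b in patch_b]
    numerator = sum(x * y for x, y in zip(dev_a, dev_b))
    denominator = math.sqrt(sum(x * x for x in dev_a) * sum(y * y for y in dev_b))
    if denominator < eps:
        # A constant (texture-less) patch has zero variance, so NCC is
        # undefined; returning 0.0 is a convention chosen for this sketch.
        return 0.0
    return numerator / denominator


if __name__ == "__main__":
    textured = [10, 50, 20, 80]
    flat_wall = [128, 128, 128, 128]
    print(ncc(textured, [2 * v + 5 for v in textured]))  # gain/offset change
    print(ncc(flat_wall, flat_wall))                     # texture-less patch
```

On a textured patch the score stays high under a gain and offset change, while a texture-less patch gives no usable signal at any candidate depth. This is the situation in which a larger matching window, one that reaches some texture, can help.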