In recent years, 3D reconstruction technology, especially for mapping entire cities, has made great strides and has become crucial for detailed urban mapping and observation. However, accurately capturing relatively small structures such as individual buildings from aerial images remains challenging, and traditional methods struggle to balance coverage of the overall city structure with fine building detail. Neural Radiance Fields (NeRF) offer a way to synthesize detailed novel views of a scene from a set of posed camera images, but they are inefficient for areas as large as cities. To address this, we developed PatchNeRF, which improves NeRF by focusing computation on specific regions of interest, yielding more detailed results more quickly. PatchNeRF can iteratively refine specific parts of a city model, such as individual buildings, making it a significant step toward detailed and efficient 3D city mapping.
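As a rough illustration of the patch-focused idea, the sketch below restricts NeRF-style ray sampling to a region of interest and queries a tiny radiance-field MLP. The camera model, the region-of-interest box, and the network size are illustrative assumptions; this is not the PatchNeRF architecture itself.

```python
# Minimal sketch: sample rays only inside a region of interest (ROI) and query a
# small NeRF-style MLP. All shapes and parameters below are illustrative.
import torch
import torch.nn as nn

class TinyRadianceField(nn.Module):
    def __init__(self, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(3, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 4),            # RGB + density
        )

    def forward(self, xyz):
        out = self.net(xyz)
        return torch.sigmoid(out[..., :3]), torch.relu(out[..., 3:])

def rays_for_patch(H, W, focal, roi, n=1024):
    """Sample pixel rays only inside an ROI given as (x0, y0, x1, y1)."""
    x0, y0, x1, y1 = roi
    xs = torch.randint(x0, x1, (n,)).float()
    ys = torch.randint(y0, y1, (n,)).float()
    dirs = torch.stack([(xs - W / 2) / focal, (ys - H / 2) / focal, -torch.ones(n)], -1)
    origins = torch.zeros_like(dirs)         # camera at the origin for simplicity
    return origins, dirs

model = TinyRadianceField()
origins, dirs = rays_for_patch(H=1024, W=1024, focal=1200.0, roi=(400, 400, 600, 600))
t = torch.linspace(2.0, 6.0, 32)                                   # samples along each ray
points = origins[:, None, :] + dirs[:, None, :] * t[None, :, None]  # (n_rays, n_samples, 3)
rgb, sigma = model(points)
print(rgb.shape, sigma.shape)
```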
This study evaluates seven prominent SIFT implementations for feature detection in Wide Area Motion Imagery (WAMI): Lowe's archived code, VLFeat, OpenCV, SIFT anatomy, CudaSIFT, SiftGPU, and PopSift. We use spatio-temporal patch animations, termed ThumbTracks, to assess each method's performance in terms of jitter, wandering, and track switches. Additionally, we analyze the clustering of SIFT descriptors using t-distributed stochastic neighbor embedding (t-SNE). Our results reveal significant variations in the performance of different SIFT variants, with implications for their suitability in various WAMI applications. We provide recommendations for selecting the most appropriate SIFT implementation based on feature stability, computational efficiency, and accuracy requirements.
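The descriptor-clustering analysis can be sketched as follows, using the OpenCV SIFT implementation (one of the seven evaluated) and scikit-learn's t-SNE. The frame path and t-SNE parameters are illustrative assumptions, not the study's exact configuration.

```python
# Minimal sketch: embed OpenCV SIFT descriptors with t-SNE for a qualitative
# view of descriptor clustering on a WAMI frame (hypothetical file path).
import cv2
import numpy as np
from sklearn.manifold import TSNE
import matplotlib.pyplot as plt

img = cv2.imread("wami_frame.png", cv2.IMREAD_GRAYSCALE)   # hypothetical WAMI frame
sift = cv2.SIFT_create(nfeatures=2000)
keypoints, descriptors = sift.detectAndCompute(img, None)

# Project the 128-D descriptors to 2-D; parameters are illustrative defaults.
embedding = TSNE(n_components=2, perplexity=30, init="pca",
                 random_state=0).fit_transform(descriptors.astype(np.float32))

plt.scatter(embedding[:, 0], embedding[:, 1], s=2)
plt.title("t-SNE of OpenCV SIFT descriptors (illustrative)")
plt.show()
```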
Shadows in aerial images can hinder the performance of various vision tasks, including object detection and tracking. Shadow detection networks perform poorly on mid-altitude wide area motion imagery (WAMI) because comparable data are scarce during training. Collecting aerial WAMI data is challenging, and the variety of weather conditions that can be captured is limited. Moreover, obtaining accurate ground-truth shadow masks for these images is difficult: manual annotation is infeasible at scale, and automatic techniques suffer from inaccuracies. We leverage the advanced rendering capabilities of Unreal Engine to produce city-scale synthetic aerial images, along with precise ground-truth shadow masks, under diverse weather and lighting conditions. We then train and evaluate an existing shadow detection network with our synthetic data to improve its performance on real WAMI datasets.
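A minimal fine-tuning sketch under stated assumptions: Unreal-rendered RGB frames and binary shadow masks are stored as paired PNGs at a fixed resolution, and a torchvision FCN stands in for the shadow detector. The folder layout and the backbone are illustrative; the abstract does not name the specific network used.

```python
# Fine-tune a stand-in segmentation model on synthetic frame/shadow-mask pairs.
# Dataset layout ("images/", "shadow_masks/") and backbone are assumptions.
from pathlib import Path
import numpy as np
import torch
import torch.nn as nn
from torch.utils.data import Dataset, DataLoader
from torchvision.models.segmentation import fcn_resnet50
from PIL import Image

class SyntheticShadowDataset(Dataset):
    def __init__(self, root):
        self.images = sorted(Path(root, "images").glob("*.png"))        # hypothetical layout
        self.masks = sorted(Path(root, "shadow_masks").glob("*.png"))

    def __len__(self):
        return len(self.images)

    def __getitem__(self, i):
        img = np.asarray(Image.open(self.images[i]).convert("RGB"), np.float32) / 255.0
        mask = np.asarray(Image.open(self.masks[i]).convert("L"), np.float32) / 255.0
        return torch.from_numpy(img).permute(2, 0, 1), torch.from_numpy(mask)[None]

model = fcn_resnet50(weights=None, num_classes=1)        # stand-in shadow detector
loader = DataLoader(SyntheticShadowDataset("synthetic_wami"), batch_size=4, shuffle=True)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
criterion = nn.BCEWithLogitsLoss()

model.train()
for frames, masks in loader:                             # one epoch over synthetic data
    optimizer.zero_grad()
    logits = model(frames)["out"]                        # (B, 1, H, W) shadow logits
    loss = criterion(logits, masks)
    loss.backward()
    optimizer.step()
```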
We explore an approach for vision-based, GPS-denied navigation of drones. We find SuperPoint/SuperGlue feature correspondences between two coplanar images: the drone image projected onto the ground and a satellite view of the flight area. The drone image is projected onto the ground using non-GPS data available to the drone, namely the compass and the barometer. Features on the projected drone image are then mapped back to the drone camera plane, while features on the satellite image are lifted into 3D using a digital elevation map. The correspondences are used to estimate the drone's position, and the resulting coordinate estimates are evaluated against the drone's GPS metadata.
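The final pose-estimation step can be illustrated with a hedged sketch: assuming matched 2-D drone-image keypoints and their corresponding 3-D ground positions (satellite pixel plus DEM height) are already available, e.g. from a SuperPoint/SuperGlue matcher, a PnP solve recovers the camera position. The intrinsics and input files below are hypothetical placeholders.

```python
# Estimate the drone position from 2-D/3-D correspondences with RANSAC PnP.
# Input arrays and camera intrinsics are illustrative assumptions.
import cv2
import numpy as np

pts_2d = np.load("drone_keypoints.npy")     # (N, 2) pixel coords in the drone frame (hypothetical)
pts_3d = np.load("ground_points_enu.npy")   # (N, 3) ENU metres from satellite image + DEM

K = np.array([[1400.0, 0.0, 960.0],         # assumed pinhole intrinsics
              [0.0, 1400.0, 540.0],
              [0.0, 0.0, 1.0]])

ok, rvec, tvec, inliers = cv2.solvePnPRansac(
    pts_3d.astype(np.float32), pts_2d.astype(np.float32), K, None,
    reprojectionError=3.0)

# Camera (drone) position in the ground frame: C = -R^T t.
R, _ = cv2.Rodrigues(rvec)
camera_position = (-R.T @ tvec).ravel()
print("estimated drone position (ENU, m):", camera_position)
```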
The demand for multi-dimensional reconstruction of cultural heritage images, archeological artifacts, and heritage sites has grown significantly, driven by factors such as climate change and the need to preserve endangered sites. Traditional approaches often rely on accurate and precise metadata, which can be difficult to obtain and prone to errors. In this paper, we present an improved version of EpiX, an open, cross-platform, effective, and extensible GUI annotation tool for large photogrammetric imagery analysis. The enhanced EpiX features a more user-friendly interface, faster processing times, and an additional triangulation method that can be used alone or in conjunction with the existing epipolar method. The updated tool maintains its focus on exploiting the geometric properties of epipolar lines while offering increased versatility and efficiency. We demonstrate the applicability of EpiX Enhanced for various research purposes, including ground truth collection and 3D distance measurement in high-resolution, high-throughput wide-area format video, also known as wide-area motion imagery (WAMI).
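A 3D distance measurement of the kind EpiX supports can be sketched with standard two-view triangulation. The projection matrices and pixel coordinates below are illustrative placeholders, not EpiX's internal implementation.

```python
# Two-view triangulation sketch for a 3-D distance measurement.
# Intrinsics, baseline, and annotated pixel coordinates are illustrative.
import cv2
import numpy as np

K = np.array([[1500.0, 0.0, 960.0],
              [0.0, 1500.0, 540.0],
              [0.0, 0.0, 1.0]])
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])                    # view 1 at the origin
P2 = K @ np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])    # view 2, 1 m baseline

# The same two points (e.g. endpoints of a building edge) annotated in both views.
pts_view1 = np.array([[512.0, 300.0], [640.0, 310.0]]).T             # 2xN pixel coordinates
pts_view2 = np.array([[498.0, 305.0], [628.0, 318.0]]).T

homog = cv2.triangulatePoints(P1, P2, pts_view1, pts_view2)          # 4xN homogeneous points
xyz = (homog[:3] / homog[3]).T                                       # Nx3 world coordinates

print("3-D distance:", np.linalg.norm(xyz[0] - xyz[1]), "world units")
```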
Fast, efficient, and robust algorithms are needed for real-time visual tracking that can also run smoothly on airborne embedded systems. The flux tensor can provide motion-based cues for visual tracking. Before any object motion detection is applied to a raw image sequence captured from a moving platform, the motion induced by the camera must first be stabilized. Estimating the homography matrix between frames from feature points is a simple registration method that can be used for this stabilization, but a good estimate requires most of the feature points to lie on the same plane in the images; when the scene contains complex structures, estimating a good homography becomes very challenging. In this work, we propose a robust video stabilization algorithm that allows flux-tensor motion detection to efficiently identify moving objects. Our experiments show satisfactory results on raw videos where other methods fail.
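The baseline registration step described above can be sketched as follows, assuming two consecutive frames and ORB features as a stand-in detector; the paper's own feature choice and robustness refinements are not reproduced here.

```python
# Feature-based frame registration for stabilization, prior to motion detection.
# Frame paths and the ORB detector are illustrative assumptions.
import cv2
import numpy as np

prev = cv2.imread("frame_000.png", cv2.IMREAD_GRAYSCALE)   # hypothetical consecutive frames
curr = cv2.imread("frame_001.png", cv2.IMREAD_GRAYSCALE)

orb = cv2.ORB_create(2000)
kp1, des1 = orb.detectAndCompute(prev, None)
kp2, des2 = orb.detectAndCompute(curr, None)

matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = matcher.match(des1, des2)

src = np.float32([kp1[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
dst = np.float32([kp2[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)

# RANSAC keeps correspondences consistent with a single dominant plane.
H, mask = cv2.findHomography(dst, src, cv2.RANSAC, 3.0)

# Warp the current frame into the previous frame's coordinates; the residual
# difference is what a flux-tensor motion detector would operate on.
stabilized = cv2.warpPerspective(curr, H, (prev.shape[1], prev.shape[0]))
motion_residual = cv2.absdiff(stabilized, prev)
```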