Super-resolution is crucial in computer vision and digital image processing, aiming to enhance the resolution and visual quality of low-quality images. This paper focuses on correcting the distortion introduced by fisheye lenses and improving image resolution for better detail representation. Specifically, we propose an evaluation approach that benchmarks three state-of-the-art models from different architectural families: Real-ESRGAN (convolutional), SwinIR (transformer-based), and SR3 (diffusion-based). We evaluate their performance on super-resolution and distortion-correction tasks using metrics such as PSNR and SSIM. To facilitate this evaluation, we create and release a new dataset of lunar surface images with fisheye distortion applied. Our experiments demonstrate the effectiveness of each model in handling distortion and improving image resolution. The results show that the large model variants generally outperform the medium variants, and that PSNR-oriented models achieve higher PSNR and SSIM scores than their GAN-based counterparts. Additionally, we evaluate distortion correction by comparing the corrected images against the undistorted ground truth. Our findings contribute to understanding the different model categories and their performance on super-resolution and distortion-correction tasks. The proposed dataset and evaluation approach can serve as valuable resources for future research.
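As a concrete illustration of the evaluation protocol, the sketch below computes PSNR and SSIM for a single ground-truth/output image pair with scikit-image. The file names, the assumption of RGB inputs, and the choice of scikit-image itself are illustrative, not details taken from the paper.

```python
# Minimal sketch: PSNR and SSIM between a super-resolved (or distortion-corrected)
# output and its ground-truth counterpart. Paths and image format are placeholders.
from skimage import io, img_as_float
from skimage.metrics import peak_signal_noise_ratio, structural_similarity


def evaluate_pair(gt_path, pred_path):
    """Return (PSNR, SSIM) for one ground-truth / prediction image pair."""
    gt = img_as_float(io.imread(gt_path))      # values scaled to [0, 1]
    pred = img_as_float(io.imread(pred_path))
    assert gt.shape == pred.shape, "images must be aligned and equally sized"
    psnr = peak_signal_noise_ratio(gt, pred, data_range=1.0)
    # channel_axis=-1 assumes color (RGB) images; drop it for grayscale inputs.
    ssim = structural_similarity(gt, pred, data_range=1.0, channel_axis=-1)
    return psnr, ssim


# Example usage (file names are hypothetical):
# psnr, ssim = evaluate_pair("lunar_gt_0001.png", "lunar_sr_0001.png")
```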
Recent studies have shown that Convolutional Neural Networks (CNNs) achieve impressive results in crop segmentation of Satellite Image Time-Series (SITS). However, the emergence of transformer networks in various vision tasks raises the question of whether they can outperform CNNs in crop segmentation of SITS. This paper presents a revised version of the transformer-based Swin UNETR model adapted specifically for crop segmentation of SITS. The proposed model demonstrates significant advancements, achieving a validation accuracy of 96.14% and a test accuracy of 95.26% on the Munich dataset, surpassing the previous best results of 93.55% (validation) and 92.94% (test). Additionally, the model's performance on the Lombardia dataset is comparable to that of UNet3D and superior to that of FPN and DeepLabV3. The experiments in this study indicate that the model is likely to achieve comparable or superior accuracy to CNNs while requiring significantly less training time. These findings highlight the potential of transformer-based architectures for crop segmentation of SITS, opening new avenues for remote sensing applications.
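For readers who want a starting point, the following is a minimal sketch of how a Swin UNETR could be instantiated for SITS segmentation by treating the temporal axis of the time series as the third spatial dimension. It relies on MONAI's SwinUNETR implementation (constructor signature as of MONAI 1.x assumed); the tensor shapes, band count, number of crop classes, and the way the temporal axis is collapsed are assumptions for illustration, not the configuration used in the paper.

```python
# Illustrative only: a Swin UNETR applied to a SITS cube, with time as the
# third spatial axis. All shapes and class counts below are assumptions.
import torch
from monai.networks.nets import SwinUNETR  # MONAI ~1.x constructor signature assumed

T, H, W = 32, 64, 64              # time steps, height, width (each divisible by 32)
NUM_BANDS, NUM_CLASSES = 4, 18    # spectral bands per date and crop classes (assumed)

model = SwinUNETR(
    img_size=(T, H, W),           # required in older MONAI versions; deprecated/ignored in newer ones
    in_channels=NUM_BANDS,
    out_channels=NUM_CLASSES,
    feature_size=24,
    spatial_dims=3,
)

# A SITS batch is typically stored as (B, T, C, H, W); rearrange it to the
# (B, C, T, H, W) layout expected by 3-D segmentation networks.
sits = torch.randn(1, T, NUM_BANDS, H, W)
x = sits.permute(0, 2, 1, 3, 4)
logits = model(x)                          # (1, NUM_CLASSES, T, H, W)
crop_map = logits.mean(dim=2).argmax(1)    # collapse time, then per-pixel class (one simple choice)
print(crop_map.shape)                      # torch.Size([1, 64, 64])
```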