Improved Automated Building Extraction from High Resolution Remote Sensing Imagery using Time-optimized Deep Learning Techniques

Chintan Maniyar; Minakshi Kumar

doi:10.1002/essoar.10509584.1

Abstract

Deep learning techniques are increasingly used in earth science applications, from climate change modelling to feature extraction from remote sensing imagery, because they offer richer contextual and hierarchical feature representations. However, deep learning comes at the expense of extensive computational resources and long training times to achieve benchmark results. This study suggests time-optimized deep learning techniques for training deep convolutional networks for one of the most sought-after feature extraction tasks: building extraction from satellite and aerial imagery. Building extraction is one of the most important tasks in the pipeline of urban applications such as urban planning and management, disaster management and urban mapping, among other geospatial applications. Automatically extracting buildings from remotely sensed imagery has always been challenging, given the spectral similarity between buildings and non-building features as well as the complex structural diversity within an image. With the availability of high-resolution open-source satellite and UAV data, deep learning techniques have greatly improved building extraction. However, training on such high-resolution data requires significantly deeper networks, resulting in long model training and inference times. This study proposes a combination of two time-efficient methods to train a Dynamic Res-U-Net for building extraction in less time without decreasing the training parameters: 1) applying the concepts of cyclical learning rates and super-convergence, dynamically changing the learning rate during training to achieve very high accuracy in much less time; and 2) training the layers of the network in a specific order so that the last layers in particular perform better, leading to improved overall network performance in less time.
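As an illustration of the first method, a cyclical learning rate can be sketched as a triangular schedule that rises linearly from a lower bound to an upper bound and falls back again. The abstract does not state the schedule shape or its bounds, so the triangular form, the function name `triangular_lr` and all parameter values below are assumptions chosen for illustration only, not the authors' configuration.

```python
import math


def triangular_lr(step, step_size, base_lr, max_lr):
    """Triangular cyclical learning rate (illustrative sketch).

    step      -- current training iteration, starting at 0
    step_size -- iterations in half a cycle (assumed value, not from the paper)
    base_lr   -- lower learning-rate bound (assumed)
    max_lr    -- upper learning-rate bound (assumed)
    """
    # Index of the current cycle; each full cycle spans 2 * step_size iterations.
    cycle = math.floor(1 + step / (2 * step_size))
    # Distance from the peak of the current cycle, scaled to [0, 1].
    x = abs(step / step_size - 2 * cycle + 1)
    # Linear interpolation between the two bounds.
    return base_lr + (max_lr - base_lr) * max(0.0, 1.0 - x)


# Hypothetical usage: two full cycles of 200 iterations each.
schedule = [triangular_lr(s, step_size=100, base_lr=1e-4, max_lr=1e-2)
            for s in range(400)]
```

In a training loop, the value returned for the current iteration would be assigned to the optimizer before each weight update, so the learning rate changes on every batch rather than once per epoch.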
Building extraction results are evaluated using Accuracy, Dice Score, Intersection over Union (IoU) and F1-Score. Comparing these metrics for a Res-U-Net trained in the conventional way against one trained with the proposed techniques shows a clear reduction in training time: better results are achieved in fewer training epochs using the proposed time-optimized training techniques.
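For reference, the evaluation metrics named above can be computed from pixel-wise counts of true and false positives and negatives. The following is a minimal sketch assuming binary masks flattened to lists of 0 (non-building) and 1 (building); the function name `binary_mask_metrics` and the convention of returning 1.0 when both masks contain no building pixels are assumptions, not details taken from the study. For a single binary mask, the Dice Score and the F1-Score are numerically identical, so one value is reported for both.

```python
def binary_mask_metrics(pred, truth):
    """Accuracy, IoU and Dice/F1 for two flat binary masks (illustrative sketch)."""
    if len(pred) != len(truth) or not pred:
        raise ValueError("masks must be non-empty and of equal length")

    # Pixel-wise confusion counts for the building class (label 1).
    tp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 1)
    tn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 0)

    accuracy = (tp + tn) / len(pred)
    # Assumed convention: two masks with no building pixels count as a perfect match.
    union = tp + fp + fn
    iou = tp / union if union else 1.0
    dice = 2 * tp / (2 * tp + fp + fn) if union else 1.0

    return {"accuracy": accuracy, "iou": iou, "dice_f1": dice}


# Hypothetical 4-pixel example: one pixel each of TP, FP, FN and TN.
scores = binary_mask_metrics([1, 1, 0, 0], [1, 0, 1, 0])
```

Accuracy alone can be misleading when buildings cover a small fraction of the image, which is one reason overlap-based measures such as IoU and Dice are usually reported alongside it.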