Leveraging Machine Learning and Twitter Data to Identify High Hazard
Areas during Hurricane Harvey
Abstract
Timely information about flood extent is critical for
emergency managers during disaster response. Search and rescue
operations require a boundary to reduce fruitless efforts and to
prioritize critical or limited resources. However, high-resolution
aerial imagery is often unavailable or lacks the necessary geographic
extent, making it difficult to obtain real-time information about where
flooding is occurring. Volunteered geographic information (VGI) is a
subset of crowdsourcing and can be used to disseminate spatially
relevant information or request help. In this study, we present a novel
approach to map the extent of urban flooding in Harris County, TX during
Hurricane Harvey (August 25-31, 2017) and identify where people were
most likely to need immediate emergency assistance based on a subset of
crowdsourced SOS requests. Using the machine learning software Maximum
Entropy (MaxEnt), we predict the spatial extent of flooding based on
several physical and socio-economic characteristics. We compare the
results against two alternative flood datasets available after Hurricane
Harvey (i.e., Copernicus satellite imagery and fluvial flood depths
estimated by FEMA), and we validate the performance of the model using a
15% subset of the rescue requests and Houston 311 flood calls. We find
that the model predicts a much larger flooded area than either the
Copernicus or FEMA products when compared against the locations of
rescue requests, and that it performs well against both the held-out
rescue requests (AUC 0.917) and the 311 calls (AUC 0.929).
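The validation step described above can be illustrated with a minimal sketch. This is not the authors' code, and the scores below are synthetic: it simply shows how an AUC is computed when a continuous flood-suitability surface (such as MaxEnt output) is evaluated against presence points (e.g., held-out rescue requests) and background points, using the Wilcoxon rank-sum formulation of ROC AUC.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical suitability scores (0-1) sampled at two kinds of locations:
# presence points tend to score high, background points tend to score low.
presence = rng.beta(5, 2, size=200)    # e.g., held-out rescue-request sites
background = rng.beta(2, 5, size=800)  # e.g., random non-flooded sites

# AUC equals the probability that a random presence point outscores a
# random background point (ties counted as 0.5) -- the Wilcoxon statistic.
gt = (presence[:, None] > background[None, :]).mean()
eq = (presence[:, None] == background[None, :]).mean()
auc = gt + 0.5 * eq

print(f"AUC = {auc:.3f}")  # well-separated score distributions give AUC near 1
```

An AUC of 0.5 indicates no discrimination; values above roughly 0.9, like those reported in the abstract, indicate strong agreement between the predicted flood surface and the observed points.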