Leveraging Machine Learning and Twitter Data to Identify High Hazard
Areas during Hurricane Harvey
Abstract
Timely information about flood extent is critical for
emergency managers during disaster response. Search and rescue
operations require a boundary to reduce fruitless efforts and to
prioritize critical or limited resources. However, high-resolution
aerial imagery is often unavailable or lacks the necessary geographic
extent, making it difficult to obtain real-time information about where
flooding is occurring. Volunteered geographic information (VGI) is a
subset of crowdsourcing and can be used to disseminate spatially
relevant information or request help. In this study, we present a novel
approach to map the extent of urban flooding in Harris County, TX during
Hurricane Harvey (August 25-31, 2017) and identify where people were
most likely to need immediate emergency assistance based on a subset of
crowdsourced SOS requests. Using the machine learning software Maximum
Entropy (MaxEnt), we predict the spatial extent of flooding based on
several physical and socio-economic characteristics. We compare the
results against two alternative flood datasets available after Hurricane
Harvey (i.e., Copernicus satellite imagery and fluvial flood depths
estimated by FEMA), and we validate the performance of the model using a
15% subset of the rescue requests and Houston 311 flood calls. We find
that the model predicts a much larger flooded area than either the
Copernicus or FEMA products when compared against the locations of
rescue requests, and that it performs well against both the held-out
rescue requests (AUC 0.917) and the 311 calls (AUC 0.929).
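The validation step described above can be illustrated with a minimal sketch. This is not the authors' code, and the scores below are synthetic: it simply shows how an AUC is computed when a continuous flood-suitability surface (such as MaxEnt output) is evaluated against presence points (e.g., held-out rescue requests) and background points, using the Wilcoxon rank-sum formulation of ROC AUC.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical suitability scores (0-1) sampled at two kinds of locations:
# presence points tend to score high, background points tend to score low.
presence = rng.beta(5, 2, size=200)    # e.g., held-out rescue-request sites
background = rng.beta(2, 5, size=800)  # e.g., random non-flooded sites

# AUC equals the probability that a random presence point outscores a
# random background point (ties counted as 0.5) -- the Wilcoxon statistic.
gt = (presence[:, None] > background[None, :]).mean()
eq = (presence[:, None] == background[None, :]).mean()
auc = gt + 0.5 * eq

print(f"AUC = {auc:.3f}")  # well-separated score distributions give AUC near 1
```

An AUC of 0.5 indicates no discrimination; values above roughly 0.9, like those reported in the abstract, indicate strong agreement between the predicted flood surface and the observed points.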