Curating flood extent data and leveraging citizen science for
benchmarking machine learning solutions
Abstract
We present a labeled machine learning (ML) training dataset derived from
Sentinel-1 C-band synthetic aperture radar (SAR) data for flood events.
In this paper, we detail the steps to collect, pre-process, label,
curate, and catalog the training dataset. We also present benchmark ML
models developed on the dataset and describe its use in a data science
competition.