Evaluation of Text Mining Techniques Using Twitter Data for Hurricane
Disaster Resilience
Abstract
Data from social media microblogging websites such as Twitter make it
possible to collect and analyze public conversations and thereby gain
perspective on the thoughts and feelings of the general public.
Sentiment and volume analysis techniques were applied to a Twitter
dataset to understand the amount and level of sentiment associated with
disaster-related tweets, including a
topical analysis of specific terms. This study showed that a disaster
event such as a hurricane can cause strong negative sentiment in the
period directly preceding the event, but that sentiment quickly returns
to normal levels. An analysis of tweet volume over the same period
revealed that the public responds to events in near real time with
conversation on Twitter. This information can be an effective tool for
arming emergency management personnel with vital human intelligence to
inform decision-making ahead of future storms or other disaster-related
events. In addition, this study
empirically evaluated the performance of Latent Dirichlet Allocation
(LDA) topic models generated from Twitter data collected during
Hurricane Florence. The performance
evaluation experiments showed that LDA topic models struggle to
accurately reflect the true latent conversation topics present within a
medium-term, event-based dataset. Although the study successfully fit
LDA topic models, it could not produce models whose topics were
interpretable by humans as distinct groups of tightly coupled topic
words.
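
As a rough illustration of the sentiment and volume analysis summarized
above, the following minimal sketch groups tweets by hour and reports
the tweet count and mean sentiment for each hour. It is not the
pipeline used in this study: the toy lexicon, the sample tweets, and the
names score_tweet and hourly_summary are all hypothetical, and a real
analysis would presumably rely on an established sentiment lexicon or
classifier.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical toy lexicon for illustration only (not from the study).
LEXICON = {
    "safe": 1, "thankful": 1, "relief": 1,
    "scared": -1, "flooding": -1, "destroyed": -1,
}


def score_tweet(text):
    """Sum lexicon scores over the whitespace-separated words of a tweet."""
    words = text.lower().split()
    return sum(LEXICON.get(word.strip(".,!?#"), 0) for word in words)


def hourly_summary(tweets):
    """Map each hour to (tweet count, mean sentiment score).

    `tweets` is an iterable of (ISO 8601 timestamp, text) pairs.
    """
    buckets = defaultdict(list)
    for timestamp, text in tweets:
        # Truncate the timestamp to the start of its hour.
        hour = datetime.fromisoformat(timestamp).replace(
            minute=0, second=0, microsecond=0
        )
        buckets[hour].append(score_tweet(text))
    return {
        hour: (len(scores), sum(scores) / len(scores))
        for hour, scores in sorted(buckets.items())
    }


# Invented sample tweets, not drawn from the study's dataset.
sample = [
    ("2018-09-14T07:05:00", "Scared, the flooding is getting worse"),
    ("2018-09-14T07:40:00", "Roads destroyed near the coast"),
    ("2018-09-14T09:10:00", "We are safe and thankful!"),
]

summary = hourly_summary(sample)
for hour, (count, mean_score) in summary.items():
    print(hour.isoformat(), count, mean_score)
```

Plotting the per-hour count and mean score against the timeline of an
event is one simple way to see both how quickly conversation volume
responds and how long a shift in sentiment persists.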