Abstract:
Recent climate change has significantly increased the number and intensity of natural disasters around the world. These include floods, which cause extensive damage to property and, more importantly, endanger human lives. Disaster reporting has shifted from official media to members of the public, who report through social media and crowdsourcing technologies that make the reported data readily available and up to date. However, crowdsourced data (CSD) are often questioned because of issues with reliability and relevance, heterogeneity or bias, poor structure, and a lack of professionalism. As a result, disaster responders are reluctant to use such data in critical decision making. Using Natural Language Processing (NLP) and Geographic Information Retrieval (GIR) techniques, this study evaluated the quality of CSD, focusing on thematic relevance. The study examined a proof of concept for relevance assessment based on an improved set of user queries, using crowdsourced messages from the 2011 Australian floods (Ushahidi Crowd-map). The findings show that the approach was effective in generating a thematically rated list of CSD messages that post-flood disaster managers can act on with confidence. Future work will consider the thematic and geographic specificity and the semantic context of the modified queries. It will also test the approach with similar geospatial crowd-map data and examine the possibility of integrating the derived information with authoritative datasets such as Spatial Data Infrastructures (SDIs).