Publication: UNDERSTANDING CONSTRUCTION SITE SAFETY HAZARDS THROUGH OPEN DATA: TEXT MINING APPROACH
Type
Article
Date
2021-10
Publisher
researchgate.net
Abstract
Construction is well known worldwide for its very high rate of injuries and accidents. Although many
researchers have analysed the risks of this industry using various techniques, construction accidents
still require much attention in safety science. The existing literature identifies hazards related to
workers, technology, natural factors, surrounding activities and organisational factors as the primary
causes of accidents. Yet limited research has sought to ascertain the extent of these hazards from
actual reported accidents.
Therefore, the study presented in this paper devised an approach to extract sources of hazards from
publicly available injury reports using Text Mining (TM) and Natural Language Processing (NLP)
techniques. The paper presents a methodology for developing a rule-based extraction tool, with full
details of lexicon building, the devising of extraction rules and the iterative process of testing and
validation. In addition, the developed rule-based classifier was compared with, and found to
outperform, existing statistical classifiers such as Support Vector Machine (SVM), Kernel SVM,
K-nearest neighbours, Naïve Bayesian classifier and Random Forest classifier. Applying the developed
tool identified the worker factor as the highest contributor to construction site accidents, followed
by the technological factor, surrounding activities, the organisational factor and the natural factor
(1%). The tool could be used to quickly extract sources of hazards by converting widely available
unstructured digital accident data into structured attributes, enabling better data-driven safety
management.
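
The lexicon-plus-rules idea described in the abstract can be illustrated with a minimal Python sketch. This is not the tool developed in the paper: the lexicon terms, the function names (compile_rules, extract_hazard_sources, tally) and the matching strategy (whole-word, case-insensitive regular expressions) are all assumptions made for illustration. Only the five hazard categories come from the abstract.

```python
import re
from collections import Counter

# Hypothetical lexicon: these terms are illustrative assumptions,
# not the lexicon built in the paper. Only the category names
# are taken from the abstract.
LEXICON = {
    "worker": ["slipped", "lost balance", "fatigue"],
    "technology": ["crane", "scaffold", "ladder", "forklift"],
    "natural": ["wind", "rain", "lightning", "heat"],
    "surrounding activities": ["traffic", "passing vehicle"],
    "organisational": ["no training", "unsupervised", "no permit"],
}


def compile_rules(lexicon):
    """Turn each category's term list into one whole-word, case-insensitive regex."""
    rules = {}
    for category, terms in lexicon.items():
        alternation = "|".join(re.escape(term) for term in terms)
        rules[category] = re.compile(r"\b(?:" + alternation + r")\b", re.IGNORECASE)
    return rules


RULES = compile_rules(LEXICON)


def extract_hazard_sources(report):
    """Return the sorted list of categories whose rule matches the report text."""
    return sorted(cat for cat, pattern in RULES.items() if pattern.search(report))


def tally(reports):
    """Count how many reports mention each hazard category."""
    counts = Counter()
    for report in reports:
        counts.update(extract_hazard_sources(report))
    return counts


if __name__ == "__main__":
    sample_reports = [
        "Employee slipped from a ladder during heavy rain.",
        "Crane tipped in strong wind.",
    ]
    for text in sample_reports:
        print(extract_hazard_sources(text), "<-", text)
    print(tally(sample_reports))
```

Each free-text report is turned into a set of structured category labels, which can then be counted across a corpus. A tool of the kind the abstract describes would refine both the lexicon and the rules iteratively against validated reports, which this sketch does not attempt.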
Keywords
Construction, Hazards, Natural language processing, Safety, Text mining
Citation
Rupasinghe, Heshani & Panuwatwanich, Kriengsak. (2021). UNDERSTANDING CONSTRUCTION SITE SAFETY HAZARDS THROUGH OPEN DATA: TEXT MINING APPROACH. 11. 160-178. 10.11113/aej.v11.17871.
