Abstract:
The worldwide pandemic caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2)
has affected the majority of the world’s population. Thanks to the development of
COVID-19 vaccines, governments worldwide have been able to bring the pandemic
under control to some extent. However, most people are hesitant to share
their experiences on official platforms after being vaccinated. As a result, information
about vaccine-related adverse effects beyond clinical trial results has become difficult
to obtain. In contrast, many people have shared their opinions about vaccines
on social media platforms since the COVID-19 vaccination campaigns started
worldwide. This study aims to identify the public perspective on the adverse effects
of COVID-19 vaccines, based on an analysis of social media data. As an initial step,
the researchers sought to detect valid tweets containing details on adverse effects of
COVID-19 vaccines. Tweets related to COVID-19 vaccines were collected from the
Kaggle repository, yielding over 4257 tweets after data cleaning and removal of
duplicates. Collected tweets were manually labeled into two categories: tweets related
to the adverse effects of COVID-19 vaccines and tweets not related to the adverse
effects of COVID-19 vaccines. After data pre-processing, the Support Vector Machine
(SVM) algorithm and the Term Frequency-Inverse Document Frequency (TF-IDF) text
vectorization technique were used to classify the COVID-19 vaccine-related tweets. The
TF-IDF technique was used to extract features from the text that can be input into
the SVM. The best-performing SVM classifier achieved an accuracy of
80.00% on the test dataset. The recall, precision, and F1-score were 0.85, 0.41, and
0.56, respectively. Overall, this research shows that the SVM algorithm can be used to
identify information related to COVID-19 vaccines on social media in order to explore public
opinion about their adverse effects.
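
The classification step described above (TF-IDF features fed into an SVM, evaluated with precision, recall, and F1-score) might be sketched roughly as follows. This is a minimal illustration using scikit-learn, not the study's actual implementation. The example tweets and labels are invented for demonstration, and the choice of a linear kernel (`LinearSVC`), English stop-word removal, and `class_weight="balanced"` are assumptions that the abstract does not specify.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import precision_recall_fscore_support
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Hypothetical toy tweets, invented for illustration (not from the study's data).
# Label 1 = mentions an adverse effect, label 0 = does not.
train_texts = [
    "sore arm and fever after my second vaccine dose",
    "had a headache and chills the day after the jab",
    "felt tired with muscle pain following vaccination",
    "vaccination centre opens at nine tomorrow",
    "booked my vaccine appointment online today",
    "new vaccine shipment arrived in the city",
]
train_labels = [1, 1, 1, 0, 0, 0]

# TF-IDF maps each tweet to a sparse vector of weighted term counts;
# the linear SVM then learns a separating hyperplane in that space.
# The kernel, stop-word list, and class weighting are assumptions.
model = make_pipeline(
    TfidfVectorizer(lowercase=True, stop_words="english"),
    LinearSVC(class_weight="balanced"),
)
model.fit(train_texts, train_labels)

# Held-out examples (also invented) to show how the metrics are computed.
test_texts = [
    "fever and sore arm after the jab",
    "vaccine appointment booked for tomorrow",
]
test_labels = [1, 0]
predictions = model.predict(test_texts)

# Precision, recall, and F1 for the positive (adverse-effect) class.
precision, recall, f1, _ = precision_recall_fscore_support(
    test_labels, predictions, average="binary", zero_division=0
)
print(predictions, precision, recall, f1)
```

As a consistency check on the reported figures, F1 is the harmonic mean of precision and recall: 2 × 0.41 × 0.85 / (0.41 + 0.85) ≈ 0.55, which agrees with the reported 0.56 up to rounding of the inputs. The combination of high recall and low precision suggests the classifier flags most adverse-effect tweets but also produces many false positives.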