Unravelling Twitter chaos during a policy crisis

Applying Sentiment Analysis and Topic Modelling to Tweets about the Dutch Nitrogen Crisis

Abstract

In May 2019, the Dutch Council of State rejected the national approach for reducing nitrogen emissions in Dutch nature. Farmers were targeted by the policy change: all licenses for agricultural expansion were revoked, which affected their financial livelihoods. Farmers did not take this well and, using social media platforms, started organising large-scale demonstrations. Twitter was flooded with posts about the demonstrations, and pictures and videos of the events went viral. Demonstrations result from social unrest, and social unrest starts with public dissatisfaction. The Nitrogen Crisis showed that it is in the interest of decision-makers to monitor which negative feelings the public holds towards policies and to act on them before they grow into social unrest. Twitter is a social media platform that many users turn to for expressing their opinions. This study therefore examines what insights can be derived from Twitter about events, such as demonstrations, that took place during the Nitrogen Crisis. To this end, it applies two Natural Language Processing methods: sentiment analysis and topic modelling (LDA). The methods are combined to produce more insightful and interpretable results than either method could provide on its own. Two interviews were held with an expert on the Nitrogen Crisis to provide context on the crisis and to identify major events that received a lot of media attention. These events are plotted alongside, and compared with, the results of sentiment analysis and topic modelling. In doing so, the following research question is answered:
How can sentiment analysis and topic modelling be applied to Twitter data to provide insights for decision-makers retrospectively about major events during the Dutch Nitrogen Crisis?
For sentiment analysis, two Dutch sentiment analysis tools are implemented, and their output is compared with the sentiment scores that three annotators assigned to 100 tweets in order to select the better-performing tool. For topic modelling, a grid search is performed to choose the combination of timeframe and number of topics that results in the set of topic models with the highest mean topic coherence. In addition, a method is proposed for using topic models to represent changes over time in the topics discussed on Twitter. This method is used to compare, per sentiment, not only subsequent topic models that are one week apart but also topic models that are four weeks apart.
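The grid search over timeframe and number of topics can be illustrated with a minimal sketch. It assumes a hypothetical callback, here called coherence_per_model, that trains one topic model per time window for a given setting and returns one coherence score per model. The function name, the callback, and the unit of the timeframe (days) are assumptions made for illustration and are not taken from the study's pipeline.

```python
from itertools import product
from statistics import mean
from typing import Callable, Iterable, Optional, Sequence, Tuple


def select_best_setting(
    timeframes: Iterable[int],
    topic_counts: Iterable[int],
    coherence_per_model: Callable[[int, int], Sequence[float]],
) -> Tuple[int, int, float]:
    """Return (timeframe, n_topics, mean_coherence) for the best setting.

    coherence_per_model is a hypothetical callback: given a timeframe
    (assumed to be in days) and a number of topics, it trains one topic
    model per time window and returns one coherence score per model.
    """
    best: Optional[Tuple[int, int, float]] = None
    # Try every combination of timeframe and number of topics.
    for timeframe, n_topics in product(timeframes, topic_counts):
        scores = coherence_per_model(timeframe, n_topics)
        if not scores:
            continue  # no models could be trained for this setting
        candidate = (timeframe, n_topics, mean(scores))
        # Keep the setting whose set of models has the highest mean coherence.
        if best is None or candidate[2] > best[2]:
            best = candidate
    if best is None:
        raise ValueError("no setting produced any coherence scores")
    return best
```

In practice the callback would wrap the actual model training and coherence computation, so the selection logic stays independent of the topic modelling library that is used.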
This research develops a fully functioning pipeline for collecting and processing tweets, applying sentiment analysis and topic modelling and plotting the outcomes. This pipeline has been validated at various points, leading to a scientifically viable methodology.
Unexpectedly, it is not the sentiment analysis or topic modelling results that show the most obvious connection with the identified events, but the increase in the number of tweets during events. Therefore, in the absence of better tools, decision-makers are recommended to monitor pre-determined topics on Twitter and to implement a way of being notified when the volume of tweets changes significantly. Either the combination of sentiment analysis and topic modelling as implemented in this research is not advanced enough to provide useful information to decision-makers, or these methods simply cannot provide insightful results on the Dutch Nitrogen Crisis. However, an extensive body of research has applied these methods to social media data around various political events with valuable results. It is therefore recommended to perform more experiments with this approach and to further improve the quality of each research step before drawing final conclusions on the usefulness of combining sentiment analysis and topic modelling for decision-makers during policy crises. Various improvements are suggested for each step in this research to obtain more precise, insightful and interpretable results. In summary, this study and the pipeline it proposes can serve as a solid basis for further development into a process that provides ready-to-use information to decision-makers.
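The recommendation to be notified of significant changes in tweet volume could be sketched as follows. This is an illustrative assumption, not the study's implementation: it flags a day when its tweet count exceeds the mean of the preceding days by more than a chosen number of standard deviations. The function name volume_alerts, the 7-day window, and the threshold of 3 are all hypothetical choices.

```python
from statistics import mean, stdev
from typing import List, Sequence


def volume_alerts(
    daily_counts: Sequence[int],
    window: int = 7,        # assumed length of the trailing baseline, in days
    threshold: float = 3.0,  # assumed number of standard deviations
) -> List[int]:
    """Return the indices of days with an unusually high tweet count.

    A day is flagged when its count lies more than `threshold` sample
    standard deviations above the mean of the previous `window` days.
    """
    if window < 2:
        raise ValueError("window must be at least 2 to estimate a spread")
    alerts: List[int] = []
    for day in range(window, len(daily_counts)):
        history = daily_counts[day - window:day]
        baseline = mean(history)
        spread = stdev(history)
        if spread == 0:
            # Flat history: treat any increase as a notable change.
            if daily_counts[day] > baseline:
                alerts.append(day)
            continue
        if (daily_counts[day] - baseline) / spread > threshold:
            alerts.append(day)
    return alerts
```

A trailing window is used so that a spike is judged only against earlier days. A side effect is that the days directly after a spike are compared with an inflated baseline and are less likely to be flagged.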