What if fanfiction, but also coding: Investigating cultural differences in fanfiction writing and reviewing with machine learning methods

How has the portrayal of female characters in fanfiction evolved in response to the #MeToo movement and fourth-wave feminism, as analyzed with the help of NLP techniques?

Bachelor Thesis (2025)
Author(s)

I. Marinescu (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

Hayley Hung – Mentor (TU Delft - Pattern Recognition and Bioinformatics)

E. Eisemann – Graduation committee member (TU Delft - Computer Graphics and Visualisation)

Chenxu Hao – Mentor (TU Delft - Pattern Recognition and Bioinformatics)

I. Kondyurin – Mentor (TU Delft - Pattern Recognition and Bioinformatics)

Faculty
Electrical Engineering, Mathematics and Computer Science
Publication Year
2025
Language
English
Coordinates
50.0116, 4.3571
Graduation Date
07-02-2025
Awarding Institution
Delft University of Technology
Project
CSE3000 Research Project
Programme
Computer Science and Engineering
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

This paper explores how the portrayal of female characters in fanfiction evolved in response to the #MeToo movement and fourth-wave feminism. The aim is to assess whether awareness of the campaign was broad enough to visibly alter how the average author portrays women in narrative contexts. To analyze these trends, fanfiction data from Archive of Our Own (AO3) spanning 2015–2019 was parsed, and two Natural Language Processing (NLP) pipelines were developed: one based on Word2Vec and GloVe word embeddings, and one based on BERT. Bias scores, aggregated through formulas created to compare gendered associations, show stronger stereotyping of women before 2017 than after. A similar trend is found in the representation of women in fanfiction. While the BERT pipeline proved most effective at capturing contextual nuances, it is significantly limited by its reliance on binary labels and by its computational cost. These limitations indicate the need for more inclusive and sustainable methods and make the Word2Vec/GloVe models more appropriate for this task. The paper concludes with recommendations for future work, including broader representation, longer-term analysis, and enhanced detection of evolving language patterns.
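
The abstract does not state the formulas used to aggregate bias scores. As a rough illustration of how a gendered-association score can be computed from word embeddings, the sketch below takes the difference between a word's mean cosine similarity to female terms and to male terms. The function names (`cosine`, `mean_similarity`, `bias_score`), the word lists, and the two-dimensional toy vectors are all assumptions made for this example and are not taken from the thesis. In practice the vectors would come from a Word2Vec or GloVe model trained on the stories of a given year.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mean_similarity(word_vec, group_vecs):
    """Average cosine similarity of one word to a group of reference words."""
    return sum(cosine(word_vec, g) for g in group_vecs) / len(group_vecs)


def bias_score(word_vec, female_vecs, male_vecs):
    """Hypothetical score: positive means closer to the female terms,
    negative means closer to the male terms."""
    return mean_similarity(word_vec, female_vecs) - mean_similarity(word_vec, male_vecs)


# Toy 2-d vectors, invented for illustration only (not real embeddings).
toy = {
    "she": [1.0, 0.0],
    "her": [0.9, 0.1],
    "he": [0.0, 1.0],
    "him": [0.1, 0.9],
    "gentle": [0.8, 0.2],
    "brave": [0.3, 0.7],
}
female = [toy["she"], toy["her"]]
male = [toy["he"], toy["him"]]

scores = {w: bias_score(toy[w], female, male) for w in ("gentle", "brave")}
for word, score in scores.items():
    print(f"{word}: {score:+.3f}")
```

Computing such a score for the same attribute words on embeddings trained separately for each year would give one way to compare gendered associations before and after 2017. The thesis may aggregate its scores differently.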

Files

License info not available