Extracting location context from transcripts

a comparison of ELMo and TF-IDF

Abstract

Using transcripts of the TV series FRIENDS, this paper explores the problem of predicting the location in which a sentence was spoken. The approach extracts features from each sentence and trains a logistic regression model on those features. It compares ELMo and TF-IDF as feature extractors; on a binary classification task they reach accuracies of 58% and 67%, respectively. The paper also explores the effect of several data cleaning techniques on the results.
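
To make the TF-IDF side of this pipeline concrete, the following is a minimal, standard-library-only sketch of TF-IDF feature extraction followed by a binary logistic regression. It is not the code from the paper's repository. The sentences, the two location labels, the whitespace tokenizer, the smoothed IDF formula, and all function names and hyperparameters are assumptions made for illustration. The ELMo variant is not shown, since it needs a pretrained model.

```python
import math
from collections import Counter


def tokenize(sentence):
    # Assumption: lowercase whitespace tokenization, no further cleaning.
    return sentence.lower().split()


def fit_idf(docs):
    # Smoothed inverse document frequency: log((1 + n) / (1 + df)) + 1.
    n = len(docs)
    df = Counter(term for doc in docs for term in set(tokenize(doc)))
    return {term: math.log((1 + n) / (1 + count)) + 1 for term, count in df.items()}


def tfidf_vector(doc, idf, vocab):
    # Term count times IDF, then L2-normalized; unseen words are ignored.
    counts = Counter(tokenize(doc))
    vec = [counts[term] * idf[term] for term in vocab]
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logreg(X, y, lr=0.5, epochs=500):
    # Plain stochastic gradient descent on the logistic loss.
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
            err = p - yi
            w = [wj - lr * err * xj for wj, xj in zip(w, xi)]
            b -= lr * err
    return w, b


def predict(x, w, b):
    z = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1 if sigmoid(z) >= 0.5 else 0


# Made-up sentences, not taken from the transcripts.
# Hypothetical labels: 1 = coffee house, 0 = apartment.
sentences = [
    "can i get a coffee please",
    "another coffee and a muffin",
    "this couch by the counter is taken",
    "who left the fridge open",
    "the tv remote is under the bed",
    "close the apartment door",
]
labels = [1, 1, 1, 0, 0, 0]

idf = fit_idf(sentences)
vocab = sorted(idf)
X = [tfidf_vector(s, idf, vocab) for s in sentences]
w, b = train_logreg(X, labels)
train_predictions = [predict(x, w, b) for x in X]
```

A new sentence is classified by vectorizing it with the same `idf` and `vocab`, for example `predict(tfidf_vector("one more coffee", idf, vocab), w, b)`. Fitting six toy sentences says nothing about held-out accuracy; the accuracies quoted in the abstract come from the paper's own data and evaluation.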

Git repository containing the source code used in the paper - https://github.com/David-Happel/scene-location-NLP
