Extracting location context from transcripts

A comparison of ELMo and TF-IDF

Bachelor Thesis (2020)

Author(s)

D.V. Happel (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

David M. J. Tax – Mentor (TU Delft - Pattern Recognition and Bioinformatics)

M. Loog – Mentor (TU Delft - Pattern Recognition and Bioinformatics)

Tom J. Viering – Mentor (TU Delft - Pattern Recognition and Bioinformatics)

S. Makrodimitris – Mentor (TU Delft - Pattern Recognition and Bioinformatics)

Arman Naseri – Mentor (TU Delft - Pattern Recognition and Bioinformatics)

Faculty

Electrical Engineering, Mathematics and Computer Science

Keywords

Natural Language Processing, Text Classification, Word embedding, TF-IDF, ELMo

To reference this document use:

https://resolver.tudelft.nl/uuid:ad4e3624-4f39-4a64-a678-c232e3f8d7da

Publication Year

2020

Language

English

Graduation Date

22-06-2020

Awarding Institution

Delft University of Technology

Project

CSE3000 Research Project

Programme

Computer Science and Engineering

Abstract

Using transcripts of the TV series FRIENDS, this paper explores the problem of predicting the location in which a sentence was said. The research focuses on extracting features from the sentences and training a logistic regression model on those features. Specifically, it compares the performance of ELMo and TF-IDF for this feature extraction, which achieve accuracies of 58% and 67% respectively on a binary classification task. The paper also explores the effect of several data cleaning techniques on the results.

Git repository containing the source code used in the paper - https://github.com/David-Happel/scene-location-NLP
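The TF-IDF half of the pipeline described in the abstract (feature extraction followed by logistic regression) can be illustrated with a minimal, standard-library-only sketch. This is not the code from the repository above. The toy sentences, the two location labels, the smoothed IDF formula, and all function names and hyperparameters are assumptions chosen for illustration. ELMo is left out because it requires a pretrained model.

```python
import math
from collections import Counter

def tokenize(sentence):
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    words = (w.strip(".,!?;:\"'") for w in sentence.lower().split())
    return [w for w in words if w]

def fit_tfidf(sentences):
    """Learn a sorted vocabulary and smoothed IDF weights (an assumed, common variant)."""
    docs = [tokenize(s) for s in sentences]
    n_docs = len(docs)
    doc_freq = Counter(w for d in docs for w in set(d))
    vocab = sorted(doc_freq)
    idf = {w: math.log((1 + n_docs) / (1 + doc_freq[w])) + 1.0 for w in vocab}
    return vocab, idf

def transform(sentence, vocab, idf):
    """Turn one sentence into an L2-normalised TF-IDF vector; unseen words are ignored."""
    counts = Counter(tokenize(sentence))
    vec = [counts[w] * idf[w] for w in vocab]
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def score(x, weights, bias):
    """Linear score; a positive value means class 1."""
    return sum(wj * xj for wj, xj in zip(weights, x)) + bias

def train_logreg(X, y, lr=0.5, epochs=500):
    """Fit binary logistic regression with batch gradient descent (no regularisation)."""
    weights, bias = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(weights), 0.0
        for xi, yi in zip(X, y):
            p = 1.0 / (1.0 + math.exp(-score(xi, weights, bias)))
            err = p - yi  # gradient of the logistic loss w.r.t. the score
            grad_w = [g + err * xj for g, xj in zip(grad_w, xi)]
            grad_b += err
        weights = [wj - lr * g / len(X) for wj, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(X)
    return weights, bias

def predict(x, weights, bias):
    return 1 if score(x, weights, bias) > 0 else 0

# Invented toy sentences (NOT taken from the FRIENDS transcripts).
# Hypothetical labels: 1 = cafe, 0 = apartment.
sentences = [
    "Can I get another coffee please",
    "This muffin goes well with coffee",
    "Our usual couch is taken",
    "Who left the fridge open again",
    "Somebody lock the front door",
    "Put the pizza in the fridge",
]
labels = [1, 1, 1, 0, 0, 0]

vocab, idf = fit_tfidf(sentences)
X = [transform(s, vocab, idf) for s in sentences]
weights, bias = train_logreg(X, labels)

train_accuracy = sum(predict(x, weights, bias) == y for x, y in zip(X, labels)) / len(labels)
print("training accuracy:", train_accuracy)
print("unseen sentence:", predict(transform("Two coffee please", vocab, idf), weights, bias))
```

The training accuracy printed here is measured on six hand-made sentences and says nothing about the 58% and 67% figures reported in the abstract, which come from held-out transcript data. In practice a library implementation of TF-IDF and logistic regression would likely replace the hand-written functions.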

Files

Research_Paper.pdf

(pdf | 0.358 Mb)

License info not available