Weak signal detection on twitter datasets: A non-accumulated approach for non-famous events

Author: Song, B.
Contributor: Abel, F. (mentor); Houben, G.J. (mentor)
Faculty: Electrical Engineering, Mathematics and Computer Science
Department: Software and Computer Technology
Programme: Web Information Systems
Date: 2012-08-31

Abstract

With the rise of social networking platforms, a great amount of data has been generated. Extracting information about interesting events from this large data pool has become an active direction of research, and multiple tools have been built for this challenge. Detecting the signals for a target event has thus become a crucial task, and it is the starting point for many subsequent activities, such as extracting information from the targeted text. However, most event detection approaches rely on a certain volume of messages. As a result, only famous events, which produce strong signals, may be detected. Waiting for signals to accumulate until they become strong may also delay the detection of an event. Our goal is therefore an approach that can detect events based on weak signals represented by very few messages. In this thesis work, we designed, implemented and evaluated a feature-based weak signal detection approach on a Twitter dataset by applying a machine learning method to a sample use case with a manually labeled dataset. The approach achieved a correct classification rate of 0.885 and an F-measure of 0.892 by combining semantic and sentiment features with the traditional syntactic features. During this procedure, we also analyzed in detail the features we designed and showed how they work together to give a better result.
We also compared the performance with a keyword-based classifier; our approach improved the correct classification rate by more than 10 percent. The comparison also showed that the semantic and sentiment features could lift the performance given by the syntactic features. Finally, the approach itself shows the potential to be generalized.

Subject: twitter, weak signal, event detection
To reference this document use: http://resolver.tudelft.nl/uuid:d82980e2-b9e1-497c-a1f0-be1d379f081b
Embargo date: 2013-08-31
Part of collection: Student theses
Document type: master thesis
Rights: (c) 2012 Song, B.
Files: Thesis.pdf (PDF, 1.05 MB)