Analyzing location-specific error patterns in train data

Over the years, companies have been exploring the field of data science, and the Nederlandse Spoorwegen (NS) is no exception. Modern trains are equipped with sensors that measure a variety of conditions within the train, and this data is stored in the NS data warehouse. The data has proven useful for detecting problems and shortening response times, which supports two high-level goals of the NS: punctuality and reliability. However, even with the available data, visualization and detection of location-specific problems have not yet been implemented. Location-specific problems are problems that are not caused by the train itself, but by the infrastructure or by human error at a specific location. At the moment, most patterns in error codes are backed only by suspicion, since the error codes are not stored in an easily readable form. As a result, it is hard to find connections between multiple error codes. This document describes a system that was created to support the analysis of location-specific error code patterns. With this system, the NS will be better able to meet its two high-level goals and ultimately improve customer satisfaction.

For the system, a framework was built that allows the NS to further develop and extend its data analyses. Furthermore, an extensive UI was created that allows users to investigate the error code patterns found and to trace problems back to their origin. With the system, the NS is able to verify existing hypotheses and form new ones about possibly problematic locations. In this document, the problem is elaborated on, multiple solutions are presented, of which one is chosen and thoroughly motivated, the solution is described in detail and, finally, some recommendations for future expansion are given.