P.K. Krishnakumari

34 records found

Journal article (2026) - Yanyan Xu, Panchamy Krishnakumari, Neil Yorke-Smith, Serge Hoogendoorn
This article proposes an evidence-based policy recommendation framework integrating social media data and natural language processing methods, to support inclusive and efficient transport policy-making. Given that current research underscores the crucial role of both external and psychological variables in individual travel decisions, psychological features – such as beliefs, attitudes or values – are frequently used as latent variables for travel behaviour interpretation and travel choice modelling. However, user-centric policy recommendations based on dynamic psychological variables are still limited. Most studies rely on survey data, which neglects the urgent dynamic trend of user perception change and its underlying relationship with travel behaviour. Hence there is a lack of illustration on how these psychological variables can be further used at specific temporal and spatial levels for travel behaviour interpretation. This would be valuable to identify priorities for more targeted (sustainability and other) policies and interventions. In this article, we utilize sentiment analysis and dynamic topic modelling to represent the spatial–temporal variance of psychological features. Integrating with corresponding travel behaviour, we illustrate how these dynamic psychological features can distinguish travel dissonance, identify key motivations, and reflect urgent social demands at precise spatial–temporal levels. We demonstrate these advances in a case study in New York City from 2019 to 2022 using Twitter (X) data. A comparison with existing travel-related policies in the case study validates the feasibility of our framework to support evidence-based policy recommendations. We conclude by discussing the potential of this framework to support sustainable transport promotion. ...
Journal article (2025) - Xiamei Wen, Panchamy Krishnakumari, Serge Hoogendoorn
Accurate short-term predictions of active mode traffic are crucial for effective urban traffic control and management, helping to reduce delays, stops, and improve travel time reliability, and optimize travel route choice. While most methods focus on motorized traffic, active modes like walking and cycling have been overlooked due to their complex dynamics and sensitivity to external factors like weather and individual choices, making them inherently less predictable. To address this, we propose a Dynamic Attention-based Spatial-Temporal Graph Convolutional Network (DyASTGCN) model that incorporates the impact of weather on graph spatial correlations within the active mode traffic network. Additionally, we introduce a fusion approach to integrate various heterogeneous spatial correlations, aiming to represent the optimal spatial correlations within the active mode network. Experimental results demonstrate that weather changes have a lagging effect on traffic network spatial correlations. Specifically, active mode traffic demonstrates significant sensitivity to precipitation, with notable changes in spatial correlations occurring within 5 minutes. Conversely, it takes approximately 20 minutes for spatial correlations to respond to wind speed influences. By incorporating both precipitation and wind speed with a 20-minute lag, our model outperforms those using only one feature, achieving the best traffic prediction performance. Given the uncertain traffic state and highly sparse nature of active mode data, our fusion approach adeptly captures the essential spatial correlations required for accurate traffic flow prediction. This allows our model to better understand complex graph correlations and traffic patterns, improving prediction accuracy and offering valuable insights into active mode network dynamics. ...
Journal article (2025) - Weiming Mai, Dorine Duives, Panchamy Krishnakumari, Serge Hoogendoorn
Crowd management plays a vital role in urban planning and emergency response. Accurate crowd prediction is important for venue operators to respond effectively to adverse crowd dynamics during large gatherings. Although many studies have tried to predict crowd densities or movement dynamics with data-driven predictive models, their validation is often limited to data within the same scenario. As a result, the predictability of the data-driven model in unseen scenarios, such as evacuation scenarios, remains unknown due to the challenges of collecting out-of-distribution data regarding emergency conditions. To address this problem, we present an evaluation pipeline to evaluate different kinds of data-driven models. A method is proposed to generate realistic scenarios by simulation and collect synthetic data from these scenarios to acquire a comprehensive dataset. With these synthetic data, we evaluated different predictive models, from traditional machine learning methods to deep learning time-series prediction models, to explore their generalizability. Furthermore, we propose a weighted average metric, which is better suited to determine the performance of forecasting algorithms under adverse conditions. Through extensive experimentation, we showcase the heterogeneity and diversity of the simulation dataset. The evaluation results also revealed that all the data-driven models performed poorly in unseen scenarios, highlighting the urgent need to develop a robust and generalizable model for predicting crowd flow in indoor spaces. ...
Journal article (2025) - Zahra Eftekhar, Saman Behrouzi, Panchamy Krishnakumari, Adam Pel, Hans van Lint
Large-scale prediction of trip production is essential for origin–destination (OD) demand estimation and prediction. One of the main challenges in predicting trip production patterns lies in addressing spatial-temporal correlations and variations. Whereas many studies focus on temporal correlations, very few consider spatial adjacency between traffic analysis zones (TAZ) as explanatory variables. This research proposes a method that integrates a graph convolutional neural network (GCN) into a long short-term memory network (LSTM) to do exactly that. By introducing a nationwide graph that encodes the adjacency of TAZs, spatial heterogeneity is considered in the prediction process, and a single prediction model is trained for the entire network, thereby avoiding the need to train multiple separate models and potentially reducing overall training overhead, while increasing the prediction accuracy. Moreover, with this model, we investigate the effect of spatial scale on spatial uncertainty and prediction accuracy and analyze prediction errors, residual patterns, and their associations with socio-spatial features at different spatial scales. The findings of this research have important implications for improving OD demand prediction models and provide valuable insights into the role of spatial scale and socio-spatial features in travel demand prediction. ...
Journal article (2024) - Yuxing Cheng, Panchamy Krishnakumari
To analyze inherent and diverse patterns within line-based public transport daily delay occurrences, we introduce a data-driven exploratory analysis focused on the spatial-temporal distribution of these delays. Our approach relies on the utilization of the image pattern recognition technique and k-means clustering algorithm. We extract daily punctuality information from the automatic vehicle location data for a singular public transport route. This information is then translated into a visual representation through aggregated daily delay distribution profile images, offering insights into the spatial and temporal distribution of delays. The delay distribution finds expression in the arrangement of pixels within these profile images. The essence of these images is further distilled through image pattern recognition using the neural network architecture of ResNet50. Employing the k-means algorithm, we cluster these images based on their similarity, revealing five distinct daily delay patterns. The analysis of these patterns offers insight into their unique characteristics, yielding noteworthy outcomes. These findings hold the potential to provide public transport operators with an enriched comprehension of the dynamics of delays occurring on a specific line. ...
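The clustering step described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes feature vectors have already been extracted from the delay profile images (the abstract uses ResNet50 for that), and it substitutes toy 2D vectors for those features. The function names `sq_dist` and `kmeans` are hypothetical, and the deterministic initialisation (first `k` points as centroids) is an assumption made for reproducibility.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, max_iter=50):
    """Plain k-means: alternate nearest-centroid assignment and centroid update."""
    # Assumption: initialise with the first k points so the result is deterministic.
    centroids = [list(p) for p in points[:k]]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centroids are too
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Toy stand-ins for image feature vectors (illustrative values only).
features = [(0, 0), (10, 10), (0, 1), (10, 11), (1, 0), (11, 10)]
labels, centroids = kmeans(features, k=2)
```

In practice the feature vectors would be high-dimensional, and the number of clusters (five in the abstract) would be chosen by the analyst.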
Journal article (2024) - Ting Gao, Winnie Daamen, Panchamy Krishnakumari, Serge Hoogendoorn
To promote urban sustainability, many cities are adopting bicycle-friendly policies, leveraging GPS trajectories as a vital data source. However, the inherent errors in GPS data necessitate a critical preprocessing step known as map-matching. Due to GPS device malfunction, road network ambiguity for cyclists, and inaccuracies in publicly accessible street maps, existing map-matching methods face challenges in accurately selecting the best-mapped route. In urban settings, these challenges are exacerbated by tall buildings, which tend to degrade GPS accuracy, and by the increased complexity of the road network. To resolve this issue, this work introduces a map-matching method tailored for cycling travel data in urban areas. The approach introduces two main innovations: a reliable classification of road availability for cyclists, with a particular focus on the main road network, and an extended multi-objective map-matching scoring system. This system integrates penalty, geometric, topology, and temporal scores to optimize the selection of mapped road segments, collectively forming a complete route. Rotterdam, the second-largest city in the Netherlands, is selected as the case study city, and real-world data is used for method implementation and evaluation. One hundred trajectories were manually labelled to assess the model performance and its sensitivity to parameter settings, GPS sampling interval, and travel time. The method is able to unveil variations in cyclist travel behavior, providing municipalities with insights to optimize cycling infrastructure and improve traffic management, such as by identifying high-traffic areas for targeted infrastructure upgrades and optimizing traffic light settings based on cyclist waiting times. ...

A case study in the Netherlands during the COVID-19 pandemic

Journal article (2024) - Lucia Van Schaik, Dorine Duives, Sascha Hoogendoorn-Lanser, Jan Willem Hoekstra, Winnie Daamen, Alexandra Gavriilidou, Panchamy Krishnakumari, Marco Rinaldi, Serge Hoogendoorn
Physical distancing has been an important asset in limiting the SARS-CoV-2 virus spread during the COVID-19 pandemic. This study aims to assess compliance with physical distancing and to evaluate the combination of observed and self-reported data used. This research shows that it is difficult to operationalize new rules, that context affects compliance, that there needs to be a need for compliance, and that rules require upkeep. From a methodological point of view, this study found that the combined methods provide a comprehensive picture of compliance behaviour, that it is challenging but essential to mitigate response fatigue in long-term monitoring studies, and that it would be interesting in future research to learn how actual behaviour is influenced by personal narratives. ...

A joint origin–destination-path-choice formulation

Journal article (2024) - Yumin Cao, Hans van Lint, Panchamy Krishnakumari, Michiel Bliemer
This paper presents a novel approach to data-driven time-dependent origin–destination (OD) estimation using a joint origin–destination-path choice formulation, inspired by the well-known equivalence of doubly constrained gravity models and multinomial logit models for joint O–D choice. This new formulation provides a theoretical basis and generalizes an earlier contribution. Although including path choice increases the dimensionality of the problem, it also dramatically improves the quality of the data one can directly use to solve it (e.g. measured path travel times versus coarse centroid-to-centroid travel times), and opens up possibilities to combine different assimilation techniques in a single framework: (1) fast shortest path set computation using static (e.g. road type) and dynamic (speed, travel time) link properties; (2) predicting a “prior OD matrix” using the resulting path-shares and (estimated or measured) production and attraction totals; and (3) scaling/constraining this prior using link flows (informative of demand). If the resulting system of equations has insufficient rank, we use principal component analysis to reduce the dimensionality, solve this reduced problem, and transform that solution back to a full OD matrix. Comprehensive tests and sensitivity analysis on 7 networks with different sizes and characteristics give an empirical underpinning of the extended equivalence principle; demonstrate good accuracy and reliability of the OD estimation method overall; and suggest that the method is robust with respect to major assumptions and contributing factors. ...
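For readers unfamiliar with the doubly constrained gravity model mentioned in this abstract, the sketch below shows the classical form: a prior matrix built from an exponential (logit-style) deterrence function is balanced iteratively until row sums match trip productions and column sums match trip attractions. It is a textbook illustration only, not the joint origin–destination-path formulation of the paper. The function name `furness`, the cost matrix, the parameter `beta`, and the zone totals are all hypothetical.

```python
import math


def furness(prior, productions, attractions, max_iter=500, tol=1e-9):
    """Balance a prior OD matrix so rows sum to productions and columns to attractions."""
    n_rows, n_cols = len(prior), len(prior[0])
    od = [row[:] for row in prior]
    for _ in range(max_iter):
        # Scale each row to its production total.
        for i in range(n_rows):
            row_sum = sum(od[i])
            if row_sum > 0:
                factor = productions[i] / row_sum
                od[i] = [v * factor for v in od[i]]
        # Scale each column to its attraction total.
        for j in range(n_cols):
            col_sum = sum(od[i][j] for i in range(n_rows))
            if col_sum > 0:
                factor = attractions[j] / col_sum
                for i in range(n_rows):
                    od[i][j] *= factor
        # Stop once the rows still match after the column scaling.
        max_err = max(abs(sum(od[i]) - productions[i]) for i in range(n_rows))
        if max_err < tol:
            break
    return od


# Hypothetical two-zone example (illustrative values only).
costs = [[1.0, 3.0], [2.0, 1.0]]
beta = 0.5  # assumed deterrence parameter
prior = [[math.exp(-beta * c) for c in row] for row in costs]
od = furness(prior, productions=[100.0, 200.0], attractions=[150.0, 150.0])
```

The balancing only converges to both sets of totals when productions and attractions sum to the same overall number of trips, as they do here.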
Journal article (2024) - Zili Wang, Panchamy Krishnakumari, Kumar Anupam, Hans van Lint, Sandra Erkens
The relationship between real-world traffic and pavement raveling is unclear and subject to ongoing debates. This research proposes a novel approach that extends beyond traditional correlation analyses to explore causal mechanisms between mixed traffic and raveling. This approach incorporates the causal discovery method, and is applied to five Dutch porous asphalt (PA) highway sites that have substantial data sets. Findings indicate a nonlinear relationship between traffic volume and raveling, with road age emerging as a shared contributor. The results also suggest that the degree to which different vehicle types contribute as a causal factor for raveling varies with carriageway configurations and lane characteristics. This underlines the need for targeted maintenance strategies. Challenges remain due to confounding correlations among traffic variables, necessitating further development of causal discovery models. This study may not conclusively resolve the debate on to what extent traffic contributes to raveling, but we argue we provide sufficient evidence against rejecting this hypothesis. ...
Conference paper (2023) - Mingjia He, Panchamy Krishnakumari, Ding Luo, Jiaqi Chen
With the electrification in freight transportation, the availability of fast-charging facilities becomes essential to facilitate en-route charging for freight electric vehicles. Most studies focus on planning charging facilities based on mathematical modeling and hypothetical scenarios. This study aims to develop a data-driven integrated framework for fast-charging facility planning. By leveraging the highway traffic data, we extracted, analyzed, and compared spatial and temporal flow patterns of general traffic and freight traffic. Furthermore, graph theory-based network evaluation methods are employed to identify traffic nodes within the highway network that play a significant role in accommodating charging infrastructure. A candidate selection method is proposed to obtain potential deployment locations for charging stations and to-go chargers. Based on this, we present a multi-period bi-objective optimization model to provide optimal solutions for the placement of charging facilities, with the objectives of minimizing investment cost and maximizing demand coverage. The case study on the Amsterdam highway network shows how existing traffic data can be used to generate more realistic charging demand scenarios and how it can be integrated and evaluated within the optimization framework for facility planning. The study also shows that the proposed model can leverage the potential of early investment in improving the charging demand coverage. ...
Conference paper (2023) - Mahsa Movaghar, Saman Behrouzi, Panchamy Krishnakumari, Serge Hoogendoorn, Hans Van Lint
Road incidents, including accidents, greatly impact public safety, traffic flow, and overall transportation system functioning. Detecting and predicting incidents is crucial for effective incident management. Accurate algorithms rely on high-quality incident data sets. However, uncertainties exist due to the collection and recording process. To address this, cross-validating incident data with other datasets helps resolve inaccuracies. Additionally, enriching incident data with additional sources enables a more precise analysis of societal costs for planning purposes. In this study, we utilize traffic congestion data to examine and quantify the consequences of incidents on the Dutch highway network. First, we map match recorded incidents with related traffic patterns. Then, we label incidents as 'congestion' if significant congestion patterns were identified during or after the incidents or as 'no-congestion' if no significant congestion pattern occurred. For incidents labeled as congestion, we calculate and associate records with the congestion's duration, location, and Vehicle Loss Hours (VLH). The developed methodology has been implemented on five months of recorded data for the six most significant motorways in the Netherlands. This enriched dataset can be utilized for incident detection algorithms, analysis and management, and policy and decision-making. ...
Conference paper (2023) - Yanyan Xu, Panchamy Krishnakumari, Neil Yorke-Smith, Serge Hoogendoorn
COVID-19 significantly influenced travel behaviours and public attitudes towards public transport. Various studies have illustrated complicated factors related to long-term travel behaviour, indicating difficulty in understanding and predicting post-pandemic long-term travel behaviour via traditional methods. In these complex circumstances, it is valuable to take advantage of social media data to obtain real-time public opinions to understand dynamic travel behaviour changes from the passenger perspective. The present study provides a means - leveraging Twitter data and state-of-the-art Natural Language Processing (NLP) technologies - to interpret the underlying associations among public attitude, COVID-19 trends and public travel behaviour. Concretely, New York City has been selected due to its dependence on public transit for daily commuting. More than 500K tweets have been collected from January 2019 to June 2022. Automated text mining, topic modelling, and sentiment analysis have been implemented in these contexts to identify dynamic public reactions. A consistently negative attitude to public transit is detected and five main topics, including derivative topics from COVID-19, are discovered within the COVID-19 duration. Policy makers and transit managers can use these topics to take onboard the public's concerns. The paper thus exemplifies how social media data and NLP technologies can support policy-making progress and can benefit other tasks in the transportation domain. ...
Preprint (2023) - Yan Feng, Panchamy Krishnakumari
Understanding pedestrian route choice behavior in complex buildings is important to ensure pedestrian safety. Previous studies have mostly used traditional data collection methods and discrete choice modeling to understand the influence of different factors on pedestrian route and exit choice, particularly in simple indoor environments. However, research on pedestrian route choice in complex buildings is still limited. This paper presents a data-driven approach for understanding and predicting the pedestrian decision point choice during normal and emergency wayfinding in a multi-story building. For this, we first built an indoor network representation and proposed a data mapping technique to map VR coordinates to the indoor representation. We then used a well-established machine learning algorithm, namely the random forest (RF) model to predict pedestrian decision point choice along a route during four wayfinding tasks in a multi-story building. Pedestrian behavioral data in a multi-story building was collected by a Virtual Reality experiment. The results show a much higher prediction accuracy of decision points using the RF model (i.e., 93% on average) compared to the logistic regression model. The highest prediction accuracy was 96% for task 3. Additionally, we tested the model performance combining personal characteristics and we found that personal characteristics did not affect decision point choice. This paper demonstrates the potential of applying a machine learning algorithm to study pedestrian route choice behavior in complex indoor buildings. ...
Journal article (2023) - Jinlei Zhang, Yijie Chen, Krishnakumari Panchamy, Guangyin Jin, Chengcheng Wang, Lixing Yang
Accurate and reliable short-term passenger flow prediction can support operations and decision-making of the URT system from multiple perspectives. In this paper, we propose a URT multi-step short-term passenger flow prediction model at the network level based on a Transformer-based LSTM network, Depth-wise Attention Block, and CNN network, named the Spatial-Temporal Integrated Prediction Model (STIPM). The STIPM comprises three branches. The first branch takes time-series inflow data as input, and a Transformer-based LSTM network is selected to extract the temporal correlations. The second one takes timestep-based OD data as input, and many spatial and temporal features are captured using Depth-wise Attention Blocks. Meanwhile, timestep-based OD data can better include inter-station relations and global information. The third branch takes Point of Interest (POI) data as input, and a CNN network is utilized for spatiotemporal feature extraction, which can also become the bridge between spatial and temporal features. Moreover, the “Multi-input multi-output Strategy” for multi-step prediction is used to obtain a longer prediction period and more detailed information under a relatively high forecasting accuracy. The STIPM is applied to two large-scale real-world datasets from the URT system, and the obtained prediction results are compared with ten baselines and four of its own variants. The STIPM achieves the highest prediction accuracy as indicated by RMSE, MAE, and WMAPE evaluations, which demonstrates its superiority and robustness. ...
Conference paper (2023) - Xiamei Wen, Panchamy Krishnakumari, Serge Hoogendoorn
Accurate prediction of active mode traffic is imperative for optimizing traffic operations in Intelligent Transportation Systems. However, existing data-driven approaches heavily rely on extensive datasets to achieve reliable traffic prediction. This dependence poses a challenge when it comes to data sharing, particularly when collecting information from multiple local clients, such as institutions, organizations, and mobile devices, and transmitting it to a central server for model training and application. To overcome this challenge and enhance data security, we introduce the FedASTGNN model for active mode traffic prediction. This approach combines the federated averaging (FedAvg) algorithm with an attention-based spatial-temporal graph neural network (ASTGNN) model. Subsequently, we conduct an evaluation to determine the performance gap between the centralized ASTGNN model and the proposed distributed FedASTGNN model. This evaluation takes into account the model's performance across different time aggregation intervals and prediction horizons. Moreover, considering the unique attributes and intricacies of active mode data, we create three scenarios to demonstrate the influence of diverse active mode data from different local clients (subnetworks) on the FedASTGNN model. The findings of our study illustrate that the FedASTGNN model effectively preserves the advantages of the ASTGNN model while ensuring data confidentiality in active mode traffic prediction. Furthermore, we observe that the performance of the FedASTGNN model is significantly affected by the varying degrees of imbalanced data distribution among subnetworks. The insights shed light on the potential and challenges presented by the FedASTGNN model as an efficient and secure solution for predicting active mode traffic in Intelligent Transportation Systems. ...
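The aggregation step of federated averaging (FedAvg), which this abstract combines with the ASTGNN model, can be sketched in a few lines: the server averages the clients' model parameters, weighting each client by its number of local training samples. This is a generic illustration of FedAvg aggregation, not the FedASTGNN code. The function name `fed_avg`, the flat parameter lists, and the client sizes are hypothetical.

```python
def fed_avg(client_params, client_sizes):
    """Sample-size-weighted average of per-client parameter vectors."""
    total = sum(client_sizes)
    n_params = len(client_params[0])
    return [
        sum(params[k] * size for params, size in zip(client_params, client_sizes))
        / total
        for k in range(n_params)
    ]


# Two hypothetical clients (subnetworks) with unequal amounts of local data.
client_params = [[1.0, 2.0], [3.0, 4.0]]
client_sizes = [1, 3]
global_params = fed_avg(client_params, client_sizes)
```

Because the weights follow the sample counts, a client with much more data dominates the global model, which is one way imbalanced data across subnetworks can influence the result.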
Conference paper (2022) - Z. Wang, P. Krishnakumari, K. Anupam, J. W. C. van Lint, S. M. J. G. Erkens
Understanding the relationship between pavement raveling and traffic characteristics is important to pavement management and maintenance planning. In this work, we propose a framework to empirically quantify this relationship. It consists of an alignment method to tackle the inconsistent spatial-temporal scales of the raveling and traffic measurements and we propose spatial-temporal maps to qualitatively analyze and compare the data. A non-parametric correlation is done on the aligned raveling and traffic flow data. This framework is applied to five study areas in the Dutch highway network. The correlation analysis of the study areas provides empirical evidence to a commonly held theory that traffic flow has effects on raveling. Categorizing the correlation by lanes indicates that the raveling is homogeneous in the through or auxiliary lanes, and the severe raveled sections are parallel to the road discontinuity, suggesting the potential effect of mandatory lane changing on raveling development. The proposed framework can be employed in empirical raveling models that predict raveling based on traffic and other factors. ...
Conference paper (2022) - Z. Wang, A.J. Pel, T. Verma, P.K. Krishnakumari, Peter van Brakel, N. van Oort
Predictions on public transport ridership are beneficial as they allow for sufficient and cost-efficient deployment of vehicles. At an operational level, this relates to short-term predictions with lead times of less than an hour. Where conventional data sources on ridership, such as Automatic Fare Collection (AFC) data, may have longer lag times, in contrast, trip planner data is often available in (near) real-time. This paper analyzes how such data from a trip planner app can be utilized for short-term bus ridership predictions. This is combined with AFC data (in this case smart card data) to construct a ground-truth on actual ridership. The trip planner data is studied using correlation analysis to select informative variables, that are then used to develop 4 supervised machine learning models (linear, k-nearest neighbors, random forest, and gradient boosting decision tree). The best performing model relies on random forest regression and reduces the error by approximately half compared to a baseline model based on the weekly trend. We show that this model performance is maintained even for prediction lead times up to 30 minutes ahead, and for different periods of the day. ...
Journal article (2022) - Ziyulong Wang, Adam J. Pel, Trivik Verma, Panchamy Krishnakumari, Peter van Brakel, Niels van Oort
Predictions on Public Transport (PT) ridership are beneficial as they allow for sufficient and cost-efficient deployment of vehicles. On an operational level, this relates to short-term predictions with lead times of less than an hour. Where conventional data sources on ridership, such as Automatic Fare Collection (AFC) data, may have longer lag times and contain no travel intentions, in contrast, trip planner data are often available in (near) real-time and are used before traveling. In this paper, we investigate how such data from a trip planner app can be utilized for short-term bus ridership predictions. This is combined with AFC data (in this case smart card data) to construct a ground truth on actual ridership. Using informative variables from the trip planner dataset through correlation analysis, we develop 3 supervised Machine Learning (ML) models, including k-nearest neighbors, random forest, and gradient boosting. The best-performing model relies on random forest regression with trip planner requests. Compared with the baseline model that depends on the weekly trend, it reduces the mean absolute error by approximately half. Moreover, using the same model with and without trip planner data, we prove the usefulness of trip planner data by an improved mean absolute error of 8.9% and 21.7% and an increased coefficient of determination from a 5-fold cross-validation of 7.8% and 18.5% for two case study lines, respectively. Lastly, we show that this model performance is maintained even for the trip planner requests with prediction lead times up to 30 min ahead, and for different periods of the day. We expect our methodology to be useful for PT operators to elevate their daily operations and level of service as well as for trip planner companies to facilitate passenger replanning, in particular during peak hours. ...
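The comparison against a baseline reported in this abstract rests on the mean absolute error (MAE). The sketch below shows how such a relative reduction can be computed. The ridership numbers are invented purely for illustration and are not taken from the paper, and the function names `mean_absolute_error` and `relative_reduction` are hypothetical.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between observed and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def relative_reduction(baseline_error, model_error):
    """Fraction by which the model error is lower than the baseline error."""
    return (baseline_error - model_error) / baseline_error


# Illustrative values only: observed boardings and two sets of predictions.
actual = [10, 20, 30]
baseline_pred = [14, 16, 34]  # e.g. a weekly-trend baseline
model_pred = [12, 18, 32]     # e.g. a model using trip planner requests

mae_baseline = mean_absolute_error(actual, baseline_pred)
mae_model = mean_absolute_error(actual, model_pred)
reduction = relative_reduction(mae_baseline, mae_model)
```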
Journal article (2021) - S.P. Hoogendoorn, W. Daamen, Y. Yuan, P.K. Krishnakumari
This article describes the effects of COVID-19 on the (relevant parts of the) supply side of the mobility system. Given the expected impact, the focus is on pedestrian flows, bicycle flows, the use of shared services, and public transport vehicles. We look at the effect on capacity, via measured or theoretically expected effects on behaviour. We distinguish between simple infrastructure elements (a footpath, a cycle path) and networks or hubs, where more complex interactions occur. This is relevant for estimating what demand the mobility system can handle before overload (congestion, delays, crowding) occurs. The article starts with a theoretical framework in which we map the various aspects that influence the supply side of the mobility system. We then discuss the various (exit) scenarios that have been rolled out internationally and what they affect. The effects have been quantified as well as possible on the basis of a) findings from the literature, b) theoretical analyses, and, where possible, c) analysis of available data. The main finding of this research is that the flow capacity of single infrastructure elements can decrease by 60 to 70 percent, provided that people strictly keep to the 1.5 metre distance. When we look at networks or (multimodal) hubs, we see that this decrease becomes larger, depending on how effectively the space in the hub is, or can be, used. We show how local characteristics translate to the hub as a whole, and how the spread in space leads to an additional reduction in flow capacity.
The effects on storage capacity (the maximum number of road users that can be present in a network or hub) are even larger than those on flow capacity, again assuming compliance with the 1.5 metre measure. These findings show that, at comparable demand, the 1.5 metre measures would lead to enormous traffic handling problems. Initial data analyses show, however, that the 1.5 metre distance is often flouted and that many interactions take place at less than 1.5 metres. We do find that compliance with the 1.5 metre rule changes across the different phases of the COVID-19 crisis. ...
Journal article (2020) - Panchamy Krishnakumari, Oded Cats, Hans van Lint
The biggest challenge of analysing network traffic dynamics of large-scale networks is its complexity and pattern interpretability. In this work, we present a new computationally efficient method, inspired by human vision, to reduce the dimensions of a large-scale network and describe the traffic conditions with a compact, scalable and interpretable custom feature vector. This is done by extracting pockets of congestion that encompass connected 3D subnetworks as 3D shapes. We then parameterize these 3D shapes as 2D projections and construct parsimonious feature vectors from these projections. There are various applications of these feature vectors such as revealing the day-to-day regularity of the congestion patterns and building a classification model that allows us to predict travel time from any origin to any destination in the network. We demonstrate that our method achieves a 44% accuracy improvement when compared against the consensus method for travel prediction of an urban network of Amsterdam. Our method also outperforms historical average methods, especially for days with severe congestion. Furthermore, we demonstrate the scalability of the approach by applying the method on the entire Dutch highway network and show that the feature vector was able to encapsulate the network dynamics with a 93% prediction accuracy. There are many paths to further refine and improve the method. The compact form of the feature vector allows us to efficiently enrich it with more information such as context, weather and event without increasing the computational complexity. ...