A.L.D. Latour | TU Delft Repository

Polarisation in online social networks

Detecting polarisation using cross-community social ties

Bachelor thesis (2026) - C. Olaru, Anna L.D. Latour, Megha Khosla

The last decade has seen an uptick in political polarisation on online social networks. This has led to increasing distrust and antagonism, sometimes resulting in conflict. As such, studying the emergence of political polarisation is increasingly important in
the current political landscape. The field of network science provides ways to model the underlying structure of social networks. This facilitates the measurement and prediction of polarisation. We propose a way of predicting the emergence of political polarisation by looking at the community structure of a network. Specifically, we examine the extent to which social ties between members of different communities influence how political polarisation takes form. We measure polarisation on
two levels: ideological and relational. We experiment on synthetic networks, generated using two different models. One of the models produces networks with homogeneous community structures. The other generates them according to power law distributions. Our findings show a negative correlation between the fraction of cross-community social ties and the level of polarisation in the network. ...

Polarisation and influence in online social networks

Recovering from Viral Narratives using Delayed Interventions and Strategic Seeding in Polarised Networks

Bachelor thesis (2026) - A. Găloiu, A.L.D. Latour, M. Khosla

Online social networks enable the rapid spread of competing narratives, including misinformation and corrective information campaigns. While previous work has studied competitive diffusion, community structure and influence maximisation separately, less attention has been given to how these factors interact when competing narratives spread according to different diffusion models.

We study the Competing Narrative Diffusion Problem, in which a viral narrative competes against a reinforcement-based intervention. We model the viral narrative using the Independent Cascade (IC) model and the corrective intervention using the Linear Threshold (LT) model. Using Stochastic Block Model and Lancichinetti-Fortunato-Radicchi networks, we investigate how community segregation, intervention timing and strategic seed placement affect diffusion outcomes.

Our results show that community structure plays a central role in determining narrative dominance. Strongly segregated networks favour LT diffusion, while increasing cross-community connectivity consistently shifts outcomes towards IC diffusion. Delayed interventions become less effective as delay increases, but remain competitive in highly segregated networks. Finally, strategic seed placement substantially improves intervention performance and allows LT diffusion to outperform IC diffusion across a much wider range of network structures.

These findings demonstrate that the effectiveness of counter-narratives depends not only on the diffusion model itself, but also on the interaction between community structure, intervention timing and seed selection. Understanding these interactions can help inform the design of more effective interventions against harmful information propagation in online social networks. ...

Polarisation and influence in online social networks

Annealed Heuristic Search for Local Community Detection in Probabilistically Weighted Graphs

Bachelor thesis (2026) - A. Bărdaş, A.L.D. Latour, M. Khosla

In a polarised online social network, a user is exposed to many posts and accounts from across the platform. As a result, the social graph is dense and only weakly clustered. However, most real, meaningful interactions stay within a user's own community. Community membership is therefore hard to read from the topology (a user links to many groups), but clear in the interaction intensity (the edge weights). We study how to extract a local community from a seed user when these edge weights are probabilities rather than certainties. We use the standard negative-log transform to turn each edge probability into an additive cost, so a shortest-path search grows the community by selecting the most likely ties. Our method is an annealing schedule for a structural heuristic that we suppress while the set is small and phase in once the community is large enough to have a stable boundary. A single run produces one ordering of nodes, from which a budget-free patience policy selects the community without knowing its size in advance. We tune every method on training graphs and test on disjoint graphs. Under this fair comparison, the schedule helps in one specific scenario: when the community structure is weak, but the edge weights still carry a partial community signal. There it beats strong, individually tuned diffusion baselines by up to +0.115 in F1 on hard LFR graphs. The margin remains significant after correction for multiple comparisons. Elsewhere, it does no better, and often worse. We confirm the same effect on two real school contact networks, where it outperforms the best tuned baseline on both schools (+0.041 and +0.044 F1). ...

Node Removal Effect on Polarization in Social Networks

Polarization and influence in online social networks

Bachelor thesis (2026) - I. Popp, A.L.D. Latour, M. Khosla

Online social networks can facilitate ideological polarization by shaping how individuals interact with and reinforce opinions. However, the specific roles played by different types of influential nodes in accelerating or delaying polarization remain insufficiently understood.

This study investigates the effect of targeted node removal on polarization convergence time under the Dandekar opinion dynamics model. Convergence time measures the number of steps it takes for the network to reach a stable polarized state. We compare a random node removal baseline against interventions targeting boundary spanners (nodes acting as inter-community bridges) and em provincial hubs (nodes acting as intra-community leaders). We quantify these effects in both synthetic networks generated using a stochastic block model (SBM) and empirical subgraphs sampled from VK, a Russian online social networking platform. We explicitly assume a simplified setting consisting of exactly two communities with initially weakly divergent opinions of 50 nodes each, focusing on the micro-level effect.

In SBM networks, removing boundary spanners accelerated convergence by 7.69 steps relative to random removal, whereas removing provincial hubs delayed convergence by 2.73 steps. Similar patterns were observed in empirical networks, where boundary spanners removal accelerated convergence by 1.55 steps and provincial hubs removal delayed convergence by 4.93 steps.

These findings suggest that boundary spanners sustain cross-community exposure that slows polarization, whereas provincial hubs amplify within-community opinion reinforcement, promoting convergence toward ideological extremes. However, boundary spanners exert a weaker influence in empirical networks, while provincial hubs exert a stronger one. This indicates that a realistic network structure can alter node impact. Future work should investigate whether these effects persist under alternative opinion-dynamics models, more nuanced representations of social influence, and larger-scale graphs. ...

Online social networks can facilitate ideological polarization by shaping how individuals interact with and reinforce opinions. However, the specific roles played by different types of influential nodes in accelerating or delaying polarization remain insufficiently understood.

This study investigates the effect of targeted node removal on polarization convergence time under the Dandekar opinion dynamics model. Convergence time measures the number of steps it takes for the network to reach a stable polarized state. We compare a random node removal baseline against interventions targeting boundary spanners (nodes acting as inter-community bridges) and em provincial hubs (nodes acting as intra-community leaders). We quantify these effects in both synthetic networks generated using a stochastic block model (SBM) and empirical subgraphs sampled from VK, a Russian online social networking platform. We explicitly assume a simplified setting consisting of exactly two communities with initially weakly divergent opinions of 50 nodes each, focusing on the micro-level effect.

In SBM networks, removing boundary spanners accelerated convergence by 7.69 steps relative to random removal, whereas removing provincial hubs delayed convergence by 2.73 steps. Similar patterns were observed in empirical networks, where boundary spanners removal accelerated convergence by 1.55 steps and provincial hubs removal delayed convergence by 4.93 steps.

These findings suggest that boundary spanners sustain cross-community exposure that slows polarization, whereas provincial hubs amplify within-community opinion reinforcement, promoting convergence toward ideological extremes. However, boundary spanners exert a weaker influence in empirical networks, while provincial hubs exert a stronger one. This indicates that a realistic network structure can alter node impact. Future work should investigate whether these effects persist under alternative opinion-dynamics models, more nuanced representations of social influence, and larger-scale graphs.

Polarisation and Influence in Online Social Networks

Diversity-Aware Reranking of Node2vec-based Recommendations in Social Networks

Bachelor thesis (2026) - T. Mihăilă, A.L.D. Latour, M. Khosla

Recent elections have demonstrated how social media contributes to political polarization, leading to real world consequences. Conventional "People you may know" algorithms used for new connection recommendations rely on structural similarity. These algorithms can recommend connections between people with similar opinions. In effect, this may expose users to opinions that reinforce their initial beliefs. Several methods have been introduced to reduce polarization based on interventions in the recommendation algorithm. However, none of the methods relies on post-processing the link scores. We aim to fill this gap by studying whether a diversity-aware reranking of a node2vec-based link prediction can decrease polarisation in synthetic graphs. We compare two reranking methods based on opinion and community diversity against the baseline. We look at prediction quality and polarisation change under two different opinion dynamic models: DeGroot and BCM. Our results indicate a negligible effect on polarization. However, the form of reranking has different effects based on the opinion dynamics model used. ...

Evaluating the Communication Reliability of Earth-Moon Satellite Constellations Using Model Counting

Master thesis (2026) - I. Băcălie, N. Yorke-Smith, A.L.D. Latour, Max Bannach, F.A. Oliehoek

As interest in lunar missions continues to grow, reliable communication between the Earth and the Moon becomes increasingly important. One way to achieve such communication is through satellite constellations, in which satellites communicate through inter-satellite links. However, evaluating the reliability of such constellations is a challenging problem due to continuously changing satellite positions.
In this thesis, we study how the communication reliability of Earth--Moon satellite constellations can be modelled and evaluated. We represent the communication network as a probabilistic graph, in which vertices correspond to satellites and edges correspond to communication links. We consider two reliability metrics: the probability that communication exists between Earth and Moon ground-connected satellites and the probability that the entire network is connected. We also study hop-constrained variants of these metrics, as in quantum communication the number of relay hops is limited.
To evaluate the communication reliability of Earth-Moon satellite constellations, we use projected weighted model counting. Starting from Walker constellations around both the Earth and the Moon, we propagate satellite positions over time. At selected discrete timestamps, we build probabilistic networks to represent the constellation at that point in time. We then encode communication properties as propositional formulas and apply projected weighted model counting to compute reliability. We compare two encodings: one based on reachability, adapted from existing work, and one that explicitly models communication paths.
We evaluate the proposed approach on multiple constellation configurations and timestamps. The experiments provide insights into the factors affecting communication reliability and the computational performance of the two encodings. The results show that projected weighted model counting can be used to evaluate the reliability of dynamic Earth--Moon satellite communication networks, while also highlighting scalability challenges for larger problem instances.
...

As interest in lunar missions continues to grow, reliable communication between the Earth and the Moon becomes increasingly important. One way to achieve such communication is through satellite constellations, in which satellites communicate through inter-satellite links. However, evaluating the reliability of such constellations is a challenging problem due to continuously changing satellite positions.
In this thesis, we study how the communication reliability of Earth--Moon satellite constellations can be modelled and evaluated. We represent the communication network as a probabilistic graph, in which vertices correspond to satellites and edges correspond to communication links. We consider two reliability metrics: the probability that communication exists between Earth and Moon ground-connected satellites and the probability that the entire network is connected. We also study hop-constrained variants of these metrics, as in quantum communication the number of relay hops is limited.
To evaluate the communication reliability of Earth-Moon satellite constellations, we use projected weighted model counting. Starting from Walker constellations around both the Earth and the Moon, we propagate satellite positions over time. At selected discrete timestamps, we build probabilistic networks to represent the constellation at that point in time. We then encode communication properties as propositional formulas and apply projected weighted model counting to compute reliability. We compare two encodings: one based on reachability, adapted from existing work, and one that explicitly models communication paths.
We evaluate the proposed approach on multiple constellation configurations and timestamps. The experiments provide insights into the factors affecting communication reliability and the computational performance of the two encodings. The results show that projected weighted model counting can be used to evaluate the reliability of dynamic Earth--Moon satellite communication networks, while also highlighting scalability challenges for larger problem instances.

Smashing Hitori

An analysis of the strengths and weaknesses of constraint programming paradigm Pumpkin

Bachelor thesis (2026) - L. Smits, A.L.D. Latour, T.J. Coopmans

Logic-based puzzle solving serves as a standard benchmark for evaluating computational paradigms; however, comparisons are frequently biased by the researcher’s familiarity with specific tools. To ensure an objective evaluation, we contribute to a collaborative benchmarking effort by analysing the suitability of the Constraint Satisfaction Problem (CSP) paradigm for solving the puzzle Hitori. Specifically, we use PUMPKIN [5]. Our results show that redundant constraints can have a positive impact on solve time, particularly when using a Lazy constraint strategy where dynamic constraint generation is employed. Additionally, our analysis of puzzle characteristics reveals that the Lazy strategy benefits significantly from high counts of non-adjacent duplicates, while the Distance strategy proves sensitive to deviations in the total black cell count. We demonstrate that this Lazy strategy significantly outperforms the Distance strategy, offering superior scalability with respect to puzzle size. Furthermore, in a comparative analysis against other computational paradigms, our solver proved more efficient than LP (GUROBI), Prolog, and SMT (Z3) implementations, ultimately ranking second, surpassed only by ASP (CLASP). This establishes CSP, specifically with the Lazy strategy, as a competitive approach for modelling and solving Hitori. ...

Eating Soup with a Fork: Solving Hitori with Integer Linear Programming

An Investigation into the Suitability of Integer Linear Programming for Solving Single-Solution Hitori Instances

Bachelor thesis (2026) - S. van Luenen, A.L.D. Latour, T.J. Coopmans

We study the advantages and disadvantages of using Integer Linear Programming for solving instances of the NP-complete puzzle Hitori. We do so as part of a larger comparison between different modelling-and-solving (M&S) paradigms that aims to find what solving techniques do (not) work for different paradigms so as to guide effective modelling. To this end we create three models: a path model that is quartic with regards to instance size, a quadratic and probabilistic optimised naive model, and an exponential duplicates model which uses proven properties of the game. We measure their runtime performance with and without different optimisations on 1000 10 x 10 Hitori instances, compare it to the runtime performance of models in four other M&S paradigms, and correlate properties of puzzle instances to runtime. We find that there is no performance difference between our optimised naive and duplicates models though our path model is significantly slower. Adding redundant constraints tends to slow models down. Our optimised naive model performs better than a Logic Programming model, but worse than a Satisfiability Modulo Theories, a Constraint Satisfaction Problem, and an Answer Set Programming model. Runtime correlates strongly with the number of non-unique tiles in the problem instance, though the correlation's direction differs per model. We conclude that ILP is an unintuitive and slow tool for solving Hitori instances, especially when compared to other M&S paradigms. ...

Solving Hitori puzzles using Satisfiablity Modulo Theories

Bachelor thesis (2026) - R.J. Rietdijk, A.L.D. Latour, T.J. Coopmans

Modelling-and-solving paradigms are widely used to solve combinatorial problems, but comparisons of these paradigms on identical problem instances remain scarce and are often incomplete or biased. As a results, finding the correct paradigm for a given logic problem can be difficult. In this study, we evaluate the performance of the Satisfiability Modulo Theories (SMT) paradigm for solving Hitori puzzles, as part of a larger effort to compare modelling-and-solving paradigms. During the evaluation, the running time of the solver will be used as the primary performance statistic. Across all evaluated encodings, a straightforward encoding using linear integer arithmetic shows the best and most stable performance. Alternative encodings and redundant constraints generally do not improve performance, often increasing encoding size or adding unnecessary complexity. Results show that solver runtime is dominated by puzzle size and encoding size, with puzzle structure having little influence. These findings show that SMT is a suitable paradigm for solving Hitori puzzles, as the rules naturally translate into a compact encoding without the need for added complexity or redundancy. ...

Solving Hitori

Applying Answer Set Programming to Hitori

Bachelor thesis (2026) - S. de Nooij, A.L.D. Latour, T.J. Coopmans

We investigate the performance of modelling and solving paradigms on the NP-complete puzzle Hitori. The choice of paradigm can have a significant impact on performance, but it is not always clear which paradigm is most suitable and why. We develop an ASP model to compare to models made in other paradigms. We investigate the effect of redundant constraints and research the effects of different puzzle properties on the solving time of our ASP model. In our experimental evaluation, we compare the ASP model to the models in otherparadigms from parallel studies. We find that redundant constraints have a negat-ve impact on performance. The solving times ofthe ASP model had little variance and was not or slightly correlated with various puzzle properties. Our results show that our approach solves 50-by-50 puzzles 38 times faster at PAR-2 than the second-best model and is the only model that never exceeds the specified timeout, showing the suitability of ASP to Hitori.

https://github.com/sappho3/Thesis-Hitori-shared
https://github.com/sappho3/Thesis-Hitori-ASP
...

Adding Ejection Chain to Nurse Rostering Simulated Annealing Solver

Master thesis (2025) - R. Katz, N. Yorke-Smith, A.L.D. Latour, J.T. van Essen, Lotte Berghman

Effectively solving the Nurse Rostering Problem enhances nurse moral and leads to improved patient care. While the use of ejection chains has shown promise in previous studies, studying their impact on real-life instances from two Dutch hospitals further deepens our understanding of their practical utility. This thesis begins by discussing the Nurse Rostering Problem (NRP) as presented in literature, highlighting the lack of research conducted in real-life settings. We then describe the context of Dutch hospitals by defining the relevant hard and soft constraints. Afterwards, a review of previous NRP solutions is provided, covering various solvers and techniques, including ejection chains. Ejection chains are an approach that combines multiple local search moves into a larger one, where each move forces (ejects) the next move. These moves are executed consecutively, forming a chain. Two recent ejection chain approaches were identified for further investigation, and a novel approach is also proposed.

Afterwards, the concept of the three ejection chains is explained. The first ejection chain, InfeasibleEC, is inspired by Kingston, and explores regions of the search space that contain infeasible rosters. It starts with a swap that introduces a single hard constraint violation, but results in lower roster penalty. Then the ejection chain proceeds with a series of repairs, where each repair fixes the previous violation, but may introduce a single new violation. Four different approaches to explore the search space were implemented. The second ejection chain described is EmulateEC by Curtois et al. This chain emulates human schedule planners by performing a series of vertical swaps. Each swap improves the penalty for one nurse but simultaneously increases the penalty for a different nurse. Thus, the next swap aims to improve the nurse whose penalty was worsened in the previous step. This series of swaps is repeated until a better roster is found. The third ejection chain, RuinRecreateEC, repeatedly performs ruin and recreate operations in sequence. The idea is that each ruin and recreate operation makes a relatively small change to the roster, and by repeating the process multiple times, the final roster undergoes substantial diversification while still retaining much of the structure of the original roster.

All three ejection chains underwent individual parameter tuning using a sequential greedy tuning strategy to identify suitable parameter configurations based on two instances from the same hospital. Once tuned, the chains were further tested by rerunning the chains on the initial instances as well as on additional instances from a second hospital to assess their robustness. We analysed the breakdown of the penalty based on the different soft constraints, and investigated the impact of the chains on finding better rosters throughout the running time. We could not find improvements on specific soft constraints penalties, nor on finding better rosters faster. A significance test was conducted to determine whether the chains significantly reduce the penalty of the final rosters. There was insufficient evidence to conclude a statistically significant improvement. ...

Effectively solving the Nurse Rostering Problem enhances nurse moral and leads to improved patient care. While the use of ejection chains has shown promise in previous studies, studying their impact on real-life instances from two Dutch hospitals further deepens our understanding of their practical utility. This thesis begins by discussing the Nurse Rostering Problem (NRP) as presented in literature, highlighting the lack of research conducted in real-life settings. We then describe the context of Dutch hospitals by defining the relevant hard and soft constraints. Afterwards, a review of previous NRP solutions is provided, covering various solvers and techniques, including ejection chains. Ejection chains are an approach that combines multiple local search moves into a larger one, where each move forces (ejects) the next move. These moves are executed consecutively, forming a chain. Two recent ejection chain approaches were identified for further investigation, and a novel approach is also proposed.

Afterwards, the concept of the three ejection chains is explained. The first ejection chain, InfeasibleEC, is inspired by Kingston, and explores regions of the search space that contain infeasible rosters. It starts with a swap that introduces a single hard constraint violation, but results in lower roster penalty. Then the ejection chain proceeds with a series of repairs, where each repair fixes the previous violation, but may introduce a single new violation. Four different approaches to explore the search space were implemented. The second ejection chain described is EmulateEC by Curtois et al. This chain emulates human schedule planners by performing a series of vertical swaps. Each swap improves the penalty for one nurse but simultaneously increases the penalty for a different nurse. Thus, the next swap aims to improve the nurse whose penalty was worsened in the previous step. This series of swaps is repeated until a better roster is found. The third ejection chain, RuinRecreateEC, repeatedly performs ruin and recreate operations in sequence. The idea is that each ruin and recreate operation makes a relatively small change to the roster, and by repeating the process multiple times, the final roster undergoes substantial diversification while still retaining much of the structure of the original roster.

All three ejection chains underwent individual parameter tuning using a sequential greedy tuning strategy to identify suitable parameter configurations based on two instances from the same hospital. Once tuned, the chains were further tested by rerunning the chains on the initial instances as well as on additional instances from a second hospital to assess their robustness. We analysed the breakdown of the penalty based on the different soft constraints, and investigated the impact of the chains on finding better rosters throughout the running time. We could not find improvements on specific soft constraints penalties, nor on finding better rosters faster. A significance test was conducted to determine whether the chains significantly reduce the penalty of the final rosters. There was insufficient evidence to conclude a statistically significant improvement.

Network Anonymization for Science

A simulated Annealing Approach

Bachelor thesis (2025) - E.D. Arsene, A.L.D. Latour

An increasing volume of data is being collected for research purposes, often containing sensitive information. Leaving out unique identifiers is insufficient to ensure anonymity. One approach to mitigating this risk is to modify the graph structure by adding or deleting edges. Existing heuristic approaches offer speed but limited optimality, while exact methods can achieve better anonymization quality but are computationally expensive. This creates a need for strategies that balance solution quality and efficiency. We propose a method based on simulated annealing that removes edges from the original network structure to achieve
anonymization under two formal privacy models: (n, m)-flavoured k-anonymity, which considers a node’s degree and number of incident triangles, and d-k-anonymity, which is based on the structure up to depth d of a node. Our method aims to balance running time and solution quality while adhering to an edge deletion budget. Experimental results on several real-world social network datasets show that Simulated Annealing can outperform traditional methods in terms of anonymity quality, particularly when higher modification budgets are
allowed. The running time is comparable to that of the baseline methods for smaller and medium-sized graphs but higher for larger graphs. ...

Comparative analysis of two network anonymization settings using Integer Linear Programming

Bachelor thesis (2025) - M.J.J.S. Erkemeij, A.L.D. Latour, C.E. Brandt

In network science, user privacy is a major concern when handling data such as social networks. These often contain sensitive data on \eg, individuals, companies, and governments. Therefore, it is essential to adequately anonymize this data before sharing or publishing to protect privacy and encourage open science. In this thesis we address this problem by studying k-anonymity in social networks. Specifically, we focus on (n,m)-k-anonymity measure. Building on an existing Integer Linear Program (ILP) encoding that achieves anonymity through edge deletion. We modify this encoding to add edges as well. Additionally, we propose a heuristic to improve the running time and memory usage of the modified ILP. We compare the anonymization settings, adding and deleting edges, on synthetic and real social network datasets, evaluating their running time, memory usage, and solution quality. This analysis provides insights into the trade-off between the two settings. ...

Network Anonymisation for Science

Improving (n, m)-greedy edge deletion anonymisation using global heuristic

Bachelor thesis (2025) - J.M. Matyja, A.L.D. Latour, C.E. Brandt

Network anonymisation is an essential procedure in processing data structured as graphs to achieve non-identifiability of participating entities. This quality is particularly desirable among networks representing stakeholders whose identity should not be compromised for ethical or legal reasons or when such structures are publicly disclosed. Extensive use of sensitive graph-like data contributed to the development of anonymisation methods that vary in terms of performance given the network topology or imposed constraints. We present modifications to an existing greedy algorithm using equivalence classes and discuss the rationale behind them. The experimental results obtained on networks from various scientific fields indicate that algorithms taking into account the size of equivalence classes in edge evaluation can consistently outperform the existing greedy algorithm in terms of the solution quality for larger number of available deletions. The proposed improvements also lead to comparable running time. ...

A Pseudo-Boolean Approach to Graph Anonymisation

Bachelor thesis (2025) - E.A. de Groot, A.L.D. Latour, C.E. Brandt

This work proposes a pseudo-Boolean optimisation model that computes the minimum number of edge deletions needed to fully anonymise a graph, according to the (n,m)-k-anonymity measure. We survey the usability of this model, by comparing it with an unpublished Integer Linear Program (ILP) optimisation model in terms of running time and memory consumption. For both models, we provide a comparison between two different model solvers. We find a noticeable difference in running time and quality of the solution of the different model solvers. The experiments suggest that both running time and memory consumption of solving the pseudo-boolean model are greater than solving the ILP model by a constant factor. This opens up the possibility for the model to be used in tandem with a pseudo-Boolean proof verifier to provide certificates of solution optimality. ...

A Constraint Programming Approach to Optimal Network Anonymization

Bachelor thesis (2025) - A. Ionita, A.L.D. Latour, C.E. Brandt

In the age of the internet, social networks are being used to study different phenomena, such as segre- gation, disease spread, or even peer influence. This introduces the need to protect the privacy of the in- dividuals that are part of these networks, a problem known in the field of network science by the name of network anonymization. This involves altering the initial network using different methods such as adding, removing, or altering edges, in order to pre- vent attackers from identifying specific individuals. One aspect that has been studied is the data util- ity of the anonymized network: if we alter the ini- tial network too much, its usability for studies is lowered. Thus, in this work, we propose a con- straint programming approach that minimizes the anonymization cost, in turn maximizing the data utility of the final network. We introduce a new iso- morphic_neighborhoods constraint and show that, for small(n < 20) or highly dense graphs, we can guarantee the maximum data utility, by adding the fewest number of edges that guarantee anonymity. ...

Multi-Meal, Multi-Constraint Recommender System to Optimize Grocery Budget and Waste

Master thesis (2025) - A. Mereuta, N. Yorke-Smith, A.L.D. Latour, C. Lofi, Catalin Ștefan Cernat

Grocery delivery company Picnic has identified affordable meal planning, especially in the context of recipe-based shopping, as an ongoing challenge faced by its customers. While recipes enhance customer experience and operational efficiency, Picnic currently lacks an algorithmic system to recommend recipes in a cost-effective way. This research addresses this gap by proposing a scalable optimization approach that minimizes the total cost of grocery products across a set of selected and recommended recipes.

To address this challenge, the paper formulates the problem as a Mixed Integer Linear Program (MILP), chosen for its ability to guarantee a globally optimal solution. The MILP model selects a combination of recipes and corresponding products that promote ingredient reuse and bulk purchasing to minimize overall cost. To complement this, the study implements two meta-heuristic models: a standalone Genetic Algorithm (GA) and a hybrid GA+MILP model, where GA selects recipes and MILP optimally assigns products. The optimization process also incorporates real-world constraints such as minimizing food waste and ensuring cuisine consistency.

Experimental results show that the MILP formulation consistently achieves substantial cost savings compared to a naive baseline, which selects the same set of recipes but does not optimize product choices across them. Gurobi outperforms CPLEX in solve time while maintaining identical solution quality. The standalone GA yields near-optimal solutions in seconds, while the hybrid GA+MILP model improves accuracy further, albeit with increased computational cost. When constraints such as waste reduction are added, the system remains effective, with one multi-objective variant reducing waste at minimal cost increase.

The findings confirm that cost-aware recipe recommendation is feasible, efficient, and adaptable. The proposed system offers a foundation for future extensions toward sustainability, personalization, and real-world deployment at Picnic. ...

Grocery delivery company Picnic has identified affordable meal planning, especially in the context of recipe-based shopping, as an ongoing challenge faced by its customers. While recipes enhance customer experience and operational efficiency, Picnic currently lacks an algorithmic system to recommend recipes in a cost-effective way. This research addresses this gap by proposing a scalable optimization approach that minimizes the total cost of grocery products across a set of selected and recommended recipes.

To address this challenge, the paper formulates the problem as a Mixed Integer Linear Program (MILP), chosen for its ability to guarantee a globally optimal solution. The MILP model selects a combination of recipes and corresponding products that promote ingredient reuse and bulk purchasing to minimize overall cost. To complement this, the study implements two meta-heuristic models: a standalone Genetic Algorithm (GA) and a hybrid GA+MILP model, where GA selects recipes and MILP optimally assigns products. The optimization process also incorporates real-world constraints such as minimizing food waste and ensuring cuisine consistency.

Experimental results show that the MILP formulation consistently achieves substantial cost savings compared to a naive baseline, which selects the same set of recipes but does not optimize product choices across them. Gurobi outperforms CPLEX in solve time while maintaining identical solution quality. The standalone GA yields near-optimal solutions in seconds, while the hybrid GA+MILP model improves accuracy further, albeit with increased computational cost. When constraints such as waste reduction are added, the system remains effective, with one multi-objective variant reducing waste at minimal cost increase.

The findings confirm that cost-aware recipe recommendation is feasible, efficient, and adaptable. The proposed system offers a foundation for future extensions toward sustainability, personalization, and real-world deployment at Picnic.

Breaking Weighted Model Counting Solvers Using EXTREMEgen

Generating WMC instances for fuzzing

Bachelor thesis (2025) - B.P. Snelten, A.L.D. Latour, M. Skrodzki

Weighted model counting (WMC) solvers play a key role in Bayesian inference applications, used for medical diagnosis [17] [16] and risk assessment [14]. Ongoing efforts to improve WMC solver developers aim to develop a fuzzer to identify bugs. This research is aimed at enhancing the quality of this fuzzer by developing an additional method to generate WMC instances, EXTREMEgen [21]. The functionality of this new approach relies on generating practical instances from Bayesian networks and breaking the solvers using extreme weights. Our empirical experiments show that EXTREMEgen exposes bugs in state-ofthe-art WMC solvers. The generated instances are solved fast enough to be usable in fuzzing. However, generation speed needs to be optimized to become practical for fuzzing. When optimized, EXTREMEgen could become the first generator of WMC instances specifically designed for fuzzing. ...

Feature-Driven SAT Instance Generation

Benchmarking Model Counting Solvers Using Horn-Clause Variations

Bachelor thesis (2025) - V. Jurišić, A.L.D. Latour

Model counting (#SAT) is a fundamental problem in theoretical computer science with applications in probabilistic reasoning, reliability analysis, and verification tasks. Despite advancements in solvers and #SAT instance generation, existing benchmarks fail to fully capture the diversity of structural features that influence solver performance. This paper introduces a feature-driven #SAT instance generator that systematically varies the fraction of Horn clauses across the full range (0% to 100%), enabling a rigorous evaluation of solver performance. Results reveal a “U-shaped” correlation between solve times and Horn-clause fractions and a strong relationship with model counts, exposing computational bottlenecks. Our contributions include the generator design, experimental validation across multiple solvers, and insights into improving solvers for challenging structural configurations, advancing #SAT research. ...

Methods for Evaluating the Similarity of Fuzzers for Model Counters

Bachelor thesis (2025) - C. Soare, A.L.D. Latour, M. Skrodzki

Model Counting solvers are critical in many domains. One way of validating them is through fuzzing. However, current fuzzing approaches lack systematic methods to evaluate how different test generators compare in bug-triggering behavior. This paper proposes three methods for evaluating fuzzer similarity: black-box analysis, differential testing, and white-box analysis.

A similarity metric was defined using differential testing, incorporating solver behavior, solve time, and count differences. A case study was conducted on three fuzzers across numerous configurations and four SOTA model counters, followed by an analysis of correlations between CNF feature distribution similarities and fuzzer behavioral similarities.

The results show moderate correlations between graph structure features (minimum variable node degrees) and fuzzer similarity, while clause balance features show negative correlations. However, the method proves inconclusive for selecting dissimilar fuzzers due to limited incorrect counts in our dataset, uncertainty about crash causes, and the similarity metric's inherent subjectivity. Further research is needed, either by expanding the scope of the experiment with more diverse fuzzers, CNF features, and counters, or by pursuing another method recommended by this study. ...