
M.C.H. Li

2 records found

In recent years, rapid advances in big data, machine learning, and artificial intelligence have led to a corresponding rise in privacy concerns. One solution to these concerns is federated learning. In this thesis, we look at vertical federated learning based on tree models. We have built a system that can perform both entity resolution through private set intersection (PSI) and vertical federated learning (VFL). In this system, we have implemented an optimisation that pre-sorts the data per feature before VFL starts. We have also created a privacy framework in which we define four levels of privacy; the optimisation does not affect the privacy level of the system. Our results show that pre-sorting the data lowers the overall training time, by an amount that depends on the number of entities and features of the passive party. From our results, we estimate the speed-up to be 0.3654 seconds per feature and 0.2093 seconds per 1000 entities. ...
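The idea of pre-sorting per feature can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the function names `presort_features` and `rows_in_feature_order` and the toy data are assumptions made for illustration, and all PSI and privacy machinery is left out. The sketch only shows why sorting once up front helps: each tree node can recover its rows in feature order by filtering the stored order instead of sorting again.

```python
from typing import List, Sequence


def presort_features(columns: Sequence[Sequence[float]]) -> List[List[int]]:
    """For each feature column, return the row indices ordered by feature value.

    Hypothetical helper: done once, before training starts.
    """
    return [sorted(range(len(col)), key=col.__getitem__) for col in columns]


def rows_in_feature_order(sorted_idx: Sequence[int], in_node: Sequence[bool]) -> List[int]:
    """Rows that belong to one tree node, in feature order.

    Filtering the pre-sorted indices is linear in the number of rows,
    so no per-node sort is needed when scanning split candidates.
    """
    return [i for i in sorted_idx if in_node[i]]


# Toy data (assumed): two features held by a passive party, four entities.
columns = [
    [3.0, 1.0, 2.0, 5.0],
    [0.4, 0.9, 0.1, 0.7],
]
order = presort_features(columns)

# A node that contains entities 0, 2 and 3.
in_node = [True, False, True, True]
node_rows_f0 = rows_in_feature_order(order[0], in_node)
```

Because the sort order of a feature does not change during training, the up-front cost is paid once per feature, which is consistent with a saving that grows with the number of features and entities.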
Recent times have once more shown the relevance of capable online collaborative tools. For our online collaborative XML editor, we looked into technologies for constrained block editing that, while obeying schemas such as those of XML, permit online and offline users or agents to add, delete, copy, move, split and merge blocks of text. To that end, we studied the current state of Operational Transformation (OT) and Conflict-free Replicated Data Types (CRDT). After selecting the best-ranking enabling technology, we studied existing CRDT implementations for unstructured text and extended a Logoot-based CRDT to support online and offline splitting and merging of blocks. We designed a proof of concept and created a scientific prototype to demonstrate stability, reproducibility and convergence. For now, we excluded undo and redo operations. Given these conditions, we deliver empirical evidence that our implementation converges under all circumstances, including splits and acyclic block mergers. Finally, we give an outlook and design recommendations for production implementations, and suggestions for tackling the problem of cyclic references in block mergers. ...
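As background on the Logoot family mentioned above, the following Python sketch shows the core idea of allocating a position identifier between two neighbours, so that concurrent inserts from different sites remain totally ordered. It is a simplified illustration under stated assumptions, not the thesis's extension: the name `between`, the constant `BASE`, the `(digit, site)` layout and the deterministic midpoint are all assumptions (Logoot itself picks digits at random and also carries a clock), site ids are assumed to be integers >= 1, and split and merge of blocks are not covered.

```python
from typing import List, Tuple

Position = List[Tuple[int, int]]  # list of (digit, site) pairs, compared lexicographically

BASE = 2 ** 16      # assumed digit range per level
FILLER = (0, 0)     # padding component; assumes real site ids are >= 1


def between(left: Position, right: Position, site: int) -> Position:
    """Return a new position p with left < p < right.

    An empty `left` means "start of document"; an empty `right` means "end".
    Assumes left < right whenever both are non-empty.
    """
    prefix: Position = []
    depth = 0
    bounded = True  # does `right` still constrain the current level?
    while True:
        lo = left[depth] if depth < len(left) else FILLER
        hi = right[depth] if bounded and depth < len(right) else (BASE, site)
        if hi[0] - lo[0] > 1:
            # Room at this level: take the midpoint digit (a simplification).
            return prefix + [((lo[0] + hi[0]) // 2, site)]
        # No room: descend one level below the left neighbour's component.
        prefix.append(lo)
        if lo != hi:
            bounded = False  # already strictly below `right`
        depth += 1


# Usage: three sequential inserts, then two concurrent ones in the same gap.
a = between([], [], site=1)
b = between(a, [], site=2)
c = between(a, b, site=3)
x = between(a, b, site=1)
y = between(a, b, site=2)
tight = between([(5, 1)], [(6, 1)], site=2)
```

The site component is what keeps `x` and `y` distinct even though both sites pick the same digit, so every replica sorts the two inserts the same way.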