Integrating single-cell multi-omic data is crucial for comprehensive biological discovery, yet it remains challenging due to weak correlation between modalities, data heterogeneity, and stringent privacy regulations. Conventional integration methods depend on shared features or matched cells, which are rarely available in practice. Diagonal integration approaches relax these requirements, but they are sensitive to noise, prone to overfitting, and difficult to validate, especially in the absence of centralized data access. This thesis introduces Federated Matching X-modality via Fuzzy Smoothed Embedding (Federated MaxFuse), a novel adaptation of MaxFuse to a Federated Learning (FL) framework. It enables privacy-preserving diagonal integration through fuzzy smoothing, federated Canonical Correlation Analysis (CCA), and iterative matching, without exchanging raw data. We validate Federated MaxFuse on benchmark single-cell datasets, demonstrating that it achieves matching accuracy and embedding quality comparable to centralized baselines across supervised and unsupervised metrics. These findings establish Federated MaxFuse as a practical and scalable solution for privacy-preserving integration of multi-omic data, enabling robust cross-institutional analyses under real-world constraints.