High-resolution microscopy techniques, such as Single-Molecule Localization Microscopy (SMLM) and Cryogenic Electron Microscopy (Cryo-EM), can use particle fusion or averaging to reconstruct a macromolecular structure with increased signal-to-noise ratio and potentially higher resolution. This process assumes that all fused particles are structurally identical. Structural heterogeneity, however, is often present due to biological variation and should not be ignored. In particular, continuous and subtle conformational changes in the data lead to undesired blurring of the reconstruction. This thesis develops methods to detect continuous structural heterogeneity and to exploit it for more faithful reconstructions, enabling more accurate interpretation of molecular structures.
In Chapter 2, we propose a method to detect continuous structural heterogeneity in SMLM datasets based on an all-to-all pairwise comparison of the found structures. The method is applied to both experimental and simulated data, where continuous variations such as the height of 3D DNA origami tetrahedrons and the radius of 2D Nuclear Pore Complexes (NPCs) are detected. The chapter highlights how accounting for these structural variations leads to more reliable particle fusion and reconstruction.
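To make the idea of an all-to-all pairwise comparison concrete, the following is a minimal sketch and not the method of Chapter 2. It assumes simulated 2D ring-shaped particles with varying radius, a symmetric Chamfer distance as the dissimilarity measure, and classical multi-dimensional scaling to embed the resulting matrix; all function names and parameter values are hypothetical and chosen for illustration only.

```python
import numpy as np

def make_ring(radius, n_points, rng, noise=1.0):
    """Simulate one 2D ring-shaped particle (hypothetical stand-in for real data)."""
    angles = rng.uniform(0.0, 2.0 * np.pi, n_points)
    ring = radius * np.column_stack((np.cos(angles), np.sin(angles)))
    return ring + rng.normal(0.0, noise, ring.shape)

def chamfer(a, b):
    """Symmetric Chamfer distance between two point clouds (assumed metric)."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return d.min(axis=1).mean() + d.min(axis=0).mean()

def pairwise_dissimilarity(particles):
    """All-to-all comparison: N(N-1)/2 evaluations for N particles."""
    n = len(particles)
    D = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            D[i, j] = D[j, i] = chamfer(particles[i], particles[j])
    return D

def classical_mds(D, n_components=1):
    """Classical MDS: eigendecomposition of the double-centered squared distances."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ (D ** 2) @ J
    eigvals, eigvecs = np.linalg.eigh(B)
    order = np.argsort(eigvals)[::-1][:n_components]
    return eigvecs[:, order] * np.sqrt(np.maximum(eigvals[order], 0.0))

rng = np.random.default_rng(0)
radii = rng.uniform(45.0, 60.0, 40)            # assumed radius range, arbitrary units
particles = [make_ring(r, 80, rng) for r in radii]

D = pairwise_dissimilarity(particles)
embedding = classical_mds(D)[:, 0]

# The sign of an MDS axis is arbitrary, so compare via the absolute correlation.
correlation = abs(np.corrcoef(embedding, radii)[0, 1])
print(f"|correlation| between first MDS coordinate and radius: {correlation:.2f}")
```

In this toy setting the first embedding coordinate orders the particles by radius, which is the kind of continuous variation the chapter aims to detect. The double loop also shows why the number of comparisons grows quadratically with the number of particles.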
In Chapter 3, we propose a Point Cloud Variational Auto-Encoder (PCVAE) that operates directly on 2D and 3D localization data to detect structural heterogeneity. Unlike common neural networks that rely on pixelated images, our method works on the raw localization coordinates. This reduces the memory footprint and keeps the computational complexity low, so the method scales to large numbers of structures. In contrast to multi-dimensional scaling approaches, whose computational cost grows quadratically with the number of particles, the cost of our method grows linearly. Our method identifies multiple modes of variation and reveals nanometer-scale changes, such as radius and height variations, in both simulated and experimental datasets.
In Chapter 4, we propose a method to detect continuous structural heterogeneity in Cryo-EM datasets. Recent approaches rely on machine learning models that often require large training datasets and careful tuning of hyperparameters.
%These machine learning methods are often hindered by a lack of interpretability and consistency due to the non-linear mapping onto a low-dimensional latent space.
In contrast, our method detects underlying continuous variations in 2D projections by pairwise comparison of images within orientation classes. The approach reconstructs intermediate conformational states that represent the continuous structural heterogeneity in synthetic SARS-CoV-2 spike protein data simulated under ideal conditions. More realistic simulations, which incorporate per-particle defocus variation and radiation damage, do not yet yield equally favourable results and remain a challenge for future research.