Time-Dependent Multi-Light-Source Image Classification Combined With Automated Multidimensional Protein Phase Diagram Construction for Protein Phase Behavior Analysis

Abstract

Image-based protein phase diagram analysis is key for understanding and exploiting protein phase behavior in the biopharmaceutical field. However, the required data analysis has become a notoriously time-consuming task since high-throughput screening approaches were implemented. A variety of computational tools have been developed to support analysis, but these tools primarily use end point visible light images. This study investigates the combined effect of end point and time-dependent image features obtained from cross-polarized and ultraviolet light images, supplementary to visible light, on protein phase diagram image classification. In addition, external validation was performed to evaluate the classification algorithm's applicability to support protein phase diagram scoring. The predicted protein phase behavior classes were subsequently used to automatically construct multidimensional protein phase diagrams, which prevents image information loss without complicating the image classification algorithm. Combining end point and time-dependent features from 3 light sources resulted in a balanced accuracy of 86.4 ± 4.3%, which is comparable to or better than that of more complex classifiers reported in the literature. External validation resulted in a correct formulation classification rate of 91.7%. Subsequent automated construction of the multidimensional protein phase diagrams, using the predicted classes, allowed visualization of details such as crystallization rate and coexistence of protein phase behavior types.
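The headline metric in the abstract is balanced accuracy, which is commonly defined as the mean of the per-class recall and is therefore less sensitive to class imbalance than plain accuracy. The following is a minimal sketch of that common definition, not the authors' implementation: the function name `balanced_accuracy` and the class labels (`clear`, `crystal`, `precipitate`) are hypothetical and chosen only for illustration, and the abstract does not state exactly how the reported value was computed.

```python
from collections import defaultdict


def balanced_accuracy(true_labels, predicted_labels):
    """Return the mean per-class recall over the classes present in true_labels.

    Assumption: this is the common textbook definition of balanced accuracy;
    the abstract does not specify the exact computation used in the study.
    """
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label sequences must have the same length")
    if len(true_labels) == 0:
        raise ValueError("at least one labeled image is required")

    totals = defaultdict(int)  # number of images per true class
    hits = defaultdict(int)    # correctly classified images per true class
    for true, predicted in zip(true_labels, predicted_labels):
        totals[true] += 1
        if true == predicted:
            hits[true] += 1

    recalls = [hits[cls] / totals[cls] for cls in totals]
    return sum(recalls) / len(recalls)


# Hypothetical labels for eight images (illustrative only, not study data).
true_labels = ["clear"] * 4 + ["crystal"] * 2 + ["precipitate"] * 2
predicted_labels = [
    "clear", "clear", "clear", "clear",  # all clear images correct
    "crystal", "clear",                  # one crystal image missed
    "precipitate", "crystal",            # one precipitate image missed
]

score = balanced_accuracy(true_labels, predicted_labels)
plain = sum(t == p for t, p in zip(true_labels, predicted_labels)) / len(true_labels)
print(f"balanced accuracy: {score:.3f}, plain accuracy: {plain:.3f}")
```

In this toy example the majority class is classified perfectly, so plain accuracy comes out higher than balanced accuracy; averaging recall per class gives the less frequent phase behavior classes equal weight.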