P.O. van Egmond
Please Note
<p>This page displays the records of the person named above. It is not linked to a unique person identifier, so this record may need to be merged into a profile.</p>
1 record found
Memory-aware optimization of mass-univariate statistical inference on EEG datasets
Accelerating the statistical testing pipeline of the Neurophysiological Biomarker Toolbox using memory-aware data layouts, vectorization, and native execution
This paper investigates memory-aware optimization of mass-univariate EEG statistical inference in the Neurophysiological Biomarker Toolbox. A vectorized Python implementation and a native Rust backend are evaluated as optimized alternatives to the existing NumPy/SciPy-based statistical testing pipeline. The optimized implementations reorganize EEG biomarker data for cohort-based access, improving support for cache locality, SIMD execution, and parallel processing. Synthetic benchmarks show speedups of up to 452.3x for the vectorized Python implementation and up to 486.1x for the Rust backend. The optimized implementations also substantially reduce sensitivity to increasing biomarker counts, resulting in much weaker runtime growth across the measured benchmark space. Profiling shows increased SIMD density and CPU utilization, while cache behaviour improves only modestly. These results suggest that the primary limitation is not the statistical operation itself, but the overhead introduced by how the workload is structured and executed. Much of the available speedup can therefore be achieved by expressing the computation as larger batched and vectorized operations.
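The closing point of the abstract, that much of the speedup comes from expressing the computation as larger batched operations, can be illustrated with a minimal sketch. It is not the toolbox's code: the choice of Welch's two-sample t-statistic, the function names, the array shapes, and the synthetic data are all assumptions made for illustration. Each cohort is held as a (subjects x biomarkers) array, and a per-biomarker loop is compared with a single batched reduction over the subject axis.

```python
import numpy as np


def welch_t_loop(a, b):
    """Hypothetical baseline: one small computation per biomarker column."""
    n_features = a.shape[1]
    t = np.empty(n_features)
    for j in range(n_features):
        x, y = a[:, j], b[:, j]
        # Standard error of the difference in means (unequal variances).
        se = np.sqrt(x.var(ddof=1) / x.size + y.var(ddof=1) / y.size)
        t[j] = (x.mean() - y.mean()) / se
    return t


def welch_t_batched(a, b):
    """Hypothetical batched version: all biomarkers in a few array reductions."""
    mean_diff = a.mean(axis=0) - b.mean(axis=0)
    se = np.sqrt(
        a.var(axis=0, ddof=1) / a.shape[0] + b.var(axis=0, ddof=1) / b.shape[0]
    )
    return mean_diff / se


# Synthetic cohorts (assumed sizes): 30 and 25 subjects, 2000 biomarkers.
rng = np.random.default_rng(0)
cohort_a = rng.normal(size=(30, 2000))
cohort_b = rng.normal(loc=0.2, size=(25, 2000))

t_loop = welch_t_loop(cohort_a, cohort_b)
t_batched = welch_t_batched(cohort_a, cohort_b)

# Both formulations compute the same statistic for every biomarker.
assert np.allclose(t_loop, t_batched)
```

Both functions perform the same arithmetic. The batched form replaces thousands of small interpreter-level calls with a handful of array reductions, which is the kind of restructuring the abstract credits for most of the gain. The speedups reported above were measured by the authors on their own pipeline and are not reproduced by this sketch.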