Analysis of Molecular Motion using Non-Linear Dimensionality Reduction

H. Stamati, “Analysis of Molecular Motion using Non-Linear Dimensionality Reduction,” Master's thesis, Rice University, Houston, TX, 2007.


Understanding the main stable shapes and transitions of biomolecules is key to solving problems in computational biology. Because simulated molecular samples are very high-dimensional, it is important to classify them using very few parameters. Traditionally, this requires an expert to devise empirical reaction coordinates. This work instead automates the classification by applying ScIMAP, an algorithmic tool for non-linear dimensionality reduction that requires minimal user intervention. A comparison with the most popular linear dimensionality reduction technique, Principal Components Analysis, shows that non-linearity is crucial for capturing the main motion parameters. The contribution is validated by results on increasingly complex systems, ranging from the motion of small peptides to the folding of large proteins. In all cases, only 1–3 parameters are sufficient to characterize the motion landscape, and they prove to be excellent reaction coordinates.
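As a rough illustration of why a linear projection can miss the main motion parameter, the sketch below compares a first principal component with a coordinate built from shortest-path (geodesic) distances over a nearest-neighbour graph. This is not ScIMAP and not the thesis's implementation: the planar spiral merely stands in for a curved path through conformation space, the geodesic coordinate is a simplified stand-in for a non-linear embedding, and all function names and parameter values are hypothetical choices made for this example.

```python
import heapq
import math


def sample_spiral(n=200, turns=1.5):
    """Return hidden parameters t and 2-D points on a spiral (a toy 'trajectory')."""
    ts = [i / (n - 1) for i in range(n)]
    pts = []
    for t in ts:
        r, ang = 1.0 + 2.0 * t, 2.0 * math.pi * turns * t
        pts.append((r * math.cos(ang), r * math.sin(ang)))
    return ts, pts


def pca_coordinate(pts):
    """Project 2-D points onto their first principal axis (linear reduction)."""
    n = len(pts)
    mx = sum(p[0] for p in pts) / n
    my = sum(p[1] for p in pts) / n
    cxx = sum((p[0] - mx) ** 2 for p in pts) / n
    cyy = sum((p[1] - my) ** 2 for p in pts) / n
    cxy = sum((p[0] - mx) * (p[1] - my) for p in pts) / n
    # Orientation of the leading eigenvector of the 2x2 covariance matrix.
    theta = 0.5 * math.atan2(2.0 * cxy, cxx - cyy)
    c, s = math.cos(theta), math.sin(theta)
    return [(p[0] - mx) * c + (p[1] - my) * s for p in pts]


def knn_graph(pts, k=4):
    """Symmetric k-nearest-neighbour graph with Euclidean edge weights."""
    n = len(pts)
    graph = [dict() for _ in range(n)]
    for i in range(n):
        dists = sorted(
            (math.hypot(pts[i][0] - pts[j][0], pts[i][1] - pts[j][1]), j)
            for j in range(n) if j != i
        )
        for d, j in dists[:k]:
            graph[i][j] = d
            graph[j][i] = d
    return graph


def graph_distances(graph, source):
    """Dijkstra shortest-path distances from source to every node."""
    dist = [math.inf] * len(graph)
    dist[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in graph[u].items():
            if d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(heap, (dist[v], v))
    return dist


def geodesic_coordinate(pts, k=4):
    """1-D coordinate: graph distance from one automatically chosen end of the curve."""
    graph = knn_graph(pts, k)
    from_middle = graph_distances(graph, len(pts) // 2)  # arbitrary starting node
    end = max(range(len(pts)), key=from_middle.__getitem__)  # farthest node = an end
    return graph_distances(graph, end)


def is_monotonic(values):
    """True if the sequence never reverses direction."""
    pairs = list(zip(values, values[1:]))
    return all(a <= b for a, b in pairs) or all(a >= b for a, b in pairs)


ts, pts = sample_spiral()
linear = pca_coordinate(pts)
nonlinear = geodesic_coordinate(pts)
print("PCA coordinate tracks the hidden parameter:", is_monotonic(linear))
print("Geodesic coordinate tracks the hidden parameter:", is_monotonic(nonlinear))
```

Because the points are generated in order of the hidden parameter, a coordinate that tracks that parameter must be monotonic along the list. On this toy curve the linear projection folds back on itself as the spiral winds around, whereas the graph-based coordinate follows the curve from one end to the other. Real molecular data would need a full embedding step and a neighbourhood size chosen with care.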
