Streamlined Workflow from Instrument to HPC
The complexity of many contemporary scientific workflows is well known, in both laboratory settings and computational processes. This is particularly true of biochemistry, where the 2017 Nobel Prize in Chemistry was awarded for the development of cryo-electron microscopy (cryo-EM). Cryo-EM allows researchers to "freeze" biomolecules in mid-movement and visualise their three-dimensional structures, aiding understanding of their function and interactions, which is essential in drug discovery pipelines. However, cryo-EM produces vast quantities of data. Combined with the storage and processing capabilities of High Performance Computing (HPC) systems, this data yields detailed 3D models of biological structures at sub-cellular and molecular scales.
Optimising the cryo-EM workflow is a significant challenge, from image acquisition with transmission electron microscopes and direct electron detectors, through the preprocessing tasks of motion correction, particle picking and extraction, and contrast transfer function (CTF) estimation, then image classification and curation, image sharpening and refinement, and finally structure modelling. On the computational side, the right selection and balance of storage, network, and GPU-enabled, optimised software is required.
Following previous presentations at eResearchAustralasia that have mapped the innovations of the University of Melbourne's HPC system, Spartan, this presentation explores how a combination of Spectrum Scale storage, a significant LIEF-funded GPU partition, and the use of cryoSPARC contributes to rapid solutions and workflow simplification for cryo-EM structures, including SARS-CoV-2. This short presentation will outline these steps and choices in a manner that is useful for other institutions.