BioExcel-2 Deliverable D3.3 – Use Case Progress Report

Main Authors: Proeme, Arno; Villa, Alessandra; Jandova, Zuzana; Bonvin, Alexandre; Gapsys, Vytautas; Hospital, Adam; Westermaier, Yvonne; Groenhof, Gerrit; Morozov, Dmitry; Modi, Vaibhav; Ippoliti, Emiliano
Format: Report publication-deliverable eJournal
Published: 2020
Online Access: https://zenodo.org/record/5511684
Table of Contents:
  • This report describes progress made during the first 18 months of BioExcel-2 in executing the project’s planned demonstrator research projects (the “Use Cases”), as well as research work done by BioExcel in response to the COVID-19 pandemic. The report demonstrates how important challenges in biomolecular modelling and simulation can be tackled effectively using the software and expertise developed by the CoE and by exploiting large-scale computing resources. Success stories and lessons learned are highlighted, in particular where these have had the most significant impact on advancing what can be achieved using (pre-)exascale computing. With regard to COVID-19, core applications and workflow tools were rapidly adapted to perform large-scale mutational and free energy analyses to elucidate the evolutionary and structural relationships of SARS-CoV-2 to other coronavirus species and strains, as well as host sensitivity, viral adaptation, and the cell entry mechanism of the virus. This approach, together with molecular dynamics simulation and docking, was used to screen large numbers of potential therapeutic candidates and to study the inhibitory mode of action of potential drug targets. Progress has been made in Use Cases 1 and 3 in developing and evaluating computational pipeline protocols that use various combinations of the core BioExcel applications GROMACS, HADDOCK and PMX to perform free energy and docking calculations for the design of antibody-based drugs and other therapeutics, including mutational analysis. The workflow tools developed by the project enable these pipelines to execute as containerised workflows on HPC resources, demonstrating how this work is set to achieve the ultimate goal of shortening the overall time and effort taken to develop new and better therapeutics through large-scale computational biomolecular modelling and simulation.
In Use Case 2, parallel execution of containerised HADDOCK, and the data movement underpinning this execution, is similarly orchestrated by PyCOMPSs, demonstrating how (pre-)exascale HPC resources can be used to study significant fractions of the interactions of biomolecules encoded by the human genome. In Use Case 4, QM/MM simulations with GROMACS and CP2K using large-scale computing resources are set to improve electrospray ionization mass spectrometry (a key analytical technique in proteomics). Combined with free energy calculations using PMX, they are also enabling the design of fluorescent proteins that allow high-resolution monitoring of cellular functions, gene expression, protein-protein interactions and intra-cellular interactions in living systems, and that support understanding disease and finding novel strategies to tackle it.