Protein structure prediction team achieved top rankings

2/15/2023

By Stephanie Dascola


CASP15 is the 15th round of a biennial competition that assesses methods of protein structure modeling. Participating groups submitted models of target proteins; independent assessors then compared the models with experimentally determined structures, and the results and their implications were discussed at the CASP15 Conference, held in December 2022 in Turkey.

A joint team with members from the labs of Dr. Peter Freddolino and Dr. Yang Zhang took first place in the Multimer and Interdomain Prediction categories, and was again the top-ranked server in the Regular (domains) category according to the CASP assessors’ criteria.

These wins are well-earned. Freddolino noted, “This is a highly competitive event, against some of the very best minds and powerful companies in the world.”

The Zhang/Freddolino team competed against nearly 100 other groups, including other academic institutions as well as major cloud and commercial companies. Groups from around the world submitted more than 53,000 models on 127 modeling targets in five prediction categories.

“Wei’s predictions did amazingly well in CASP15!” said Freddolino. Wei Zheng, Ph.D., is a lab member and a research fellow with the Department of Computational Medicine and Bioinformatics (DCMB).

Zheng said that the team participates in the regular protein structure prediction and protein complex structure prediction categories. “The results are assessed as regular protein domain modeling, regular protein inter-domain modeling, and protein complex modeling. In all categories, our models performed very well!” 

The technology that supported this impressive work 

The resources to achieve these results were grant-funded, which allowed the team to leverage a number of university resources, including:  

  • The Lighthouse High-Performance Computing Cluster (HPC) service. Lighthouse is managed by the Advanced Research Computing (ARC) team, and ARC is a division of Information and Technology Services (ITS). 
  • The algorithms were GPU-intensive and ran on the Great Lakes HPC Cluster. Graphics processing units (GPUs) are specialized processors originally designed to accelerate graphics rendering and now widely used to speed up deep learning workloads. The Great Lakes cluster provided additional capacity for running compute jobs. Kenneth Weiss, IT project manager senior with DCMB and HITS, said that many of the algorithms used by Zheng benefited from the increased performance of computing on a GPU.
  • Multiple storage systems, including Turbo Research Storage. High-speed storage was crucial for storing the AI-trained models and sequence libraries used by D-I-TASSER/DMFold-Multimer, the methods developed by Zhang, Freddolino, and Zheng.
  • Given the scale of the CASP targets, the team augmented the grant-funded compute capacity with the Great Lakes cluster. Freddolino and his team took advantage of the allocations provided by the ITS U-M Research Computing Package (UMRCP) and the HITS Michigan Medicine Research Computing Investment (MMRCI) programs, which substantially defrayed the cost of computing.
  • The collaboration tool Slack was used to keep Freddolino and Zheng in close contact with ARC and the DCMB teams. This provided the ability to deal with issues promptly, avoiding delays that would have had a detrimental impact on meeting CASP targets.

Technology staff from ARC, DCMB, and Health Information and Technology Services (HITS) provided assistance to the research team. All of the teams helped mitigate bottlenecks that affected the speed and throughput Zheng needed to produce results. Staff also located and helped leverage resources, including those on Great Lakes, by making use of available partitions and queues on the clusters.

“Having the flexibility and capacity provided by Great Lakes was instrumental in meeting competition deadlines,” said Weiss.

DCMB staff and the HITS HPC team took the lead on triaging software problems, giving Freddolino’s group high priority.

ARC Director Brock Palen provided monitoring and guidance on real-time impact and utilization of resources. “It was an honor to support this effort. It has always been ARC’s goal to take care of the technology so researchers can do what they do best. In this case, Freddolino and Zheng knocked it out of the park.”

Jonathan Poisson, technical support manager with DCMB, was instrumental in helping to select and configure the equipment purchased with the grant. “This assistance was crucial in meeting the tight CASP15 targets, as each target is accompanied by a deadline for results.”

Read more on the Computational Medicine and Bioinformatics website and the Department of Biological Chemistry website.

Related presentation: D-I-TASSER: Integrating Deep Learning with Multi-MSAs and Threading Alignments for Protein Structure Prediction

The resources to achieve these results were provided by an NIH-funded grant (“High-Performance Computing Cluster for Biomedical Research,” SIG: S10OD026825).