Last week, Sjors Scheres at MRC-LMB announced on Twitter that ModelAngelo, developed in his group, has been released on GitHub. It uses a deep-learning approach to build protein structures into cryo-EM maps from scratch. ModelAngelo supports two modes for building 3D structures: with and without a sequence.
The installation was simple. The Docker-based Anaconda/Miniconda setup makes it easy for users to run ModelAngelo from their own home directory. ModelAngelo recommends an Nvidia RTX 2080 GPU or newer, and CUDA 11 is also a requirement. My workstation has been running Nvidia driver 438 (CUDA 10.1) for 3 years. Because it has been stable for all users and all programs, I at first hesitated to upgrade to Nvidia driver 480 or newer for CUDA 11. I gave it a try and found that CPUs alone also work, but much more slowly: the CPU-only job took about 10.5 hours. With a new Nvidia driver and CUDA 11, the computation time dropped to 0.56 hours.
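To put the two timings side by side, here is a small sketch of the arithmetic. The variable names are my own; the only inputs are the two wall-clock times quoted above.

```python
# Wall-clock times reported above, in hours.
cpu_hours = 10.5   # CPU-only run
gpu_hours = 0.56   # run with the new Nvidia driver and CUDA 11

# Speed-up factor of the GPU run relative to the CPU-only run.
speedup = cpu_hours / gpu_hours

# The GPU run expressed in minutes, which is easier to picture.
gpu_minutes = gpu_hours * 60

print(f"GPU run: about {gpu_minutes:.0f} min, roughly {speedup:.0f}x faster than CPU only")
```

This is a single job on a single workstation, so the factor should be read as a rough indication rather than a benchmark.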

Two maps were used to test ModelAngelo. One is a 2.4 Å map of E. coli glutamine synthetase (GS). The other is a 2.9 Å map of the 723-aa malate synthase G (MSG). Both tests gave impressive results; I describe the E. coli GS case here.
E. coli GS is a dodecamer (12-mer) with D6 symmetry. As a quick test of how ModelAngelo handles multimers, I supplied a single copy of the GS sequence. The quality turned out to be quite good. The structures below are E. coli GS cryo-EM structures built by ModelAngelo (red) and by me (blue). The backbone RMSD is 0.8 Å (CE align in PyMOL). Most flexible loops, especially those on the outer surface of the hexameric ring, are well built by ModelAngelo. I was surprised, and pleased, by how powerful ModelAngelo is.
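For readers unfamiliar with the RMSD value quoted above: CE align first finds a superposition and a residue pairing, then reports the root-mean-square deviation over the paired atoms. The sketch below illustrates only that last step in plain Python. It is not the CE algorithm and not PyMOL's code, and the coordinates are made up for illustration.

```python
import math

def rmsd(coords_a, coords_b):
    """RMSD (in the input units) between two lists of (x, y, z) tuples.

    Assumes the two sets are already superposed and paired atom by atom.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))

# Hypothetical backbone atoms: every atom is shifted by 1 Å along z.
model = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
reference = [(0.0, 0.0, 1.0), (1.5, 0.0, 1.0), (3.0, 0.0, 1.0)]

print(rmsd(model, reference))
```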

Although ModelAngelo did a good job of model building, the structural quality still needs improvement. The model has a clash score of 76.43, 7.5% rotamer outliers, and 2.76% Ramachandran outliers. I therefore ran a quick one-cycle refinement in Phenix on this model and map, and the refined structure is much improved (see the cyan GS structure below). The clash score is now 12.97, and the Ramachandran and rotamer outliers are 0.93% and 1.81%, respectively.


The monomeric subunits from the three E. coli GS structures shown above are highly consistent in both the structured regions and the loops. I have not yet checked the side chains residue by residue, but the quality is impressive. One can get a publishable structure within half a day by combining live data processing, ModelAngelo, and automatic refinement.
Here is a summary of the built structures.
| Analysis | ModelAngelo | ModelAngelo + Phenix | Manually curated |
| --- | --- | --- | --- |
| Chains | 12 | 12 | 12 |
| Protein residues | 5559 | 5559 | 5628 |
| MolProbity score | 3.48 | 2.04 | 1.56 |
| Clash score | 76.43 | 12.97 | 7.61 |
| Ramachandran favored | 92.77% | 95.56% | 97.22% |
| Ramachandran outliers | 2.76% | 0.93% | 0% |
| Rotamer outliers | 7.52% | 1.81% | 0% |
| CaBLAM outliers | 0% | 0.14% | 0% |
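A few derived numbers make the table easier to read. The sketch below only re-uses values from the table; the dictionary layout and key names are my own, and "completeness" here simply means residues built relative to the manually curated model.

```python
# Values copied from the summary table above.
stats = {
    "ModelAngelo":          {"chains": 12, "residues": 5559, "clash": 76.43},
    "ModelAngelo + Phenix": {"chains": 12, "residues": 5559, "clash": 12.97},
    "Manually curated":     {"chains": 12, "residues": 5628, "clash": 7.61},
}

curated = stats["Manually curated"]
auto = stats["ModelAngelo"]
refined = stats["ModelAngelo + Phenix"]

# Fraction of the curated residues that ModelAngelo built automatically.
completeness = 100 * auto["residues"] / curated["residues"]

# Residues per chain in the curated dodecamer.
residues_per_chain = curated["residues"] / curated["chains"]

# How much one Phenix refinement cycle reduced the clash score.
clash_fold = auto["clash"] / refined["clash"]

print(f"Completeness vs. curated model: {completeness:.1f}%")
print(f"Residues per chain (curated): {residues_per_chain:.0f}")
print(f"Clash score reduced {clash_fold:.1f}-fold by refinement")
```

By this measure the automatic model covers nearly all of the residues in the curated one, and most of the remaining gap in the validation statistics closes after a single refinement cycle.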