
AlphaFold-Multimer Explained

Introduction

For background on protein folding, see Introduction to Protein Folding.

AlphaFold-Multimer (AF2-Multimer)

Multimeric structure prediction model

Major differences compared to vanilla AF2

1. Multi-chain featurization
Because AF2-Multimer handles more than one chain, the authors use three features to represent the multimeric state.
asym_id: unique integer per chain
entity_id: unique integer for each set of identical chains
sym_id: unique integer within a set of identical chains
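Example (A3B2 stoichiometry): asym_id runs from 1 to 5; the three A chains have entity_id 1 and sym_id 1, 2, 3; the two B chains have entity_id 2 and sym_id 1, 2.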
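As a minimal sketch of these three features, the snippet below builds per-residue asym_id, entity_id and sym_id arrays for a toy A3B2 complex. The helper name `make_chain_features`, the grouping of chains by exact sequence identity, and the 1-based numbering are illustrative assumptions, not the authors' code.

```python
import numpy as np

def make_chain_features(sequences):
    """Build per-residue asym_id, entity_id and sym_id for a list of chain sequences.

    Hypothetical helper: chains with identical sequences are treated as one entity.
    """
    entity_of_seq = {}   # sequence -> entity_id
    copies_seen = {}     # entity_id -> number of copies seen so far
    asym, entity, sym = [], [], []
    for chain_index, seq in enumerate(sequences, start=1):
        # Reuse the entity id of an identical sequence, or allocate the next one.
        ent = entity_of_seq.setdefault(seq, len(entity_of_seq) + 1)
        copies_seen[ent] = copies_seen.get(ent, 0) + 1
        n = len(seq)
        asym += [chain_index] * n          # unique per chain
        entity += [ent] * n                # shared by identical chains
        sym += [copies_seen[ent]] * n      # index within the set of identical chains
    return {
        "asym_id": np.array(asym),
        "entity_id": np.array(entity),
        "sym_id": np.array(sym),
    }

# Toy A3B2 complex: three copies of a 3-residue chain A, two copies of a 2-residue chain B.
features = make_chain_features(["MKV", "MKV", "MKV", "GS", "GS"])
for name, values in features.items():
    print(name, values.tolist())
```

Each feature is stored per residue, so a chain of length n contributes n identical values to each array.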
2. Multi-chain cropping
Because the number of residues that can be processed is limited by memory and compute, the authors apply two strategies to crop the structure.
Per-chain contiguous cropping (in sequence space)
The chains are first randomly shuffled, and a contiguous region is then selected from each chain until the total number of residues reaches the $N_{\text{res}}$ budget.
This cropping aims to include at least a minimum amount of each chain.
Inter-chain spatial cropping (interface-biased, structure space)
This cropping targets interface regions by selecting spatially nearest neighbors, with distances measured between Cα coordinates.
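The two cropping strategies can be sketched as follows. This is a simplified illustration under stated assumptions, not the published implementation: the function names, the way crop sizes are sampled, the 10 Å `interface_cutoff`, and the fallback when no interface residue exists are all hypothetical choices made for the example.

```python
import numpy as np

def contiguous_crop(chain_lengths, n_res, rng):
    """Per-chain contiguous crop; returns one boolean mask per chain (original order)."""
    order = rng.permutation(len(chain_lengths))       # randomly shuffle the chains
    masks = [np.zeros(n, dtype=bool) for n in chain_lengths]
    n_added = 0
    n_remaining = int(sum(chain_lengths))
    for k in order:
        n_k = int(chain_lengths[k])
        n_remaining -= n_k
        # Largest crop that still fits within the residue budget.
        size_max = min(n_res - n_added, n_k)
        # Smallest crop that still lets the remaining chains fill the budget.
        size_min = min(n_k, max(0, n_res - (n_added + n_remaining)))
        size = int(rng.integers(size_min, size_max + 1))   # upper bound is exclusive
        start = int(rng.integers(0, n_k - size + 1))
        masks[k][start:start + size] = True
        n_added += size
    return masks

def spatial_crop(ca_coords, asym_id, n_res, rng, interface_cutoff=10.0):
    """Interface-biased crop; returns sorted indices of the selected residues."""
    dist = np.linalg.norm(ca_coords[:, None, :] - ca_coords[None, :, :], axis=-1)
    other_chain = asym_id[:, None] != asym_id[None, :]
    # Assumed interface definition: C-alpha within the cutoff of another chain.
    is_interface = np.any((dist < interface_cutoff) & other_chain, axis=1)
    candidates = np.flatnonzero(is_interface)
    if candidates.size == 0:                # no interface: fall back to any residue
        candidates = np.arange(len(asym_id))
    center = rng.choice(candidates)
    nearest = np.argsort(dist[center])[:n_res]   # n_res spatially nearest residues
    return np.sort(nearest)

rng = np.random.default_rng(0)
masks = contiguous_crop([50, 30, 40], n_res=64, rng=rng)

# Two 5-residue chains laid out along the x axis, 5 Å between neighbouring residues.
coords = np.zeros((10, 3))
coords[:, 0] = np.arange(10) * 5.0
chain_ids = np.array([1] * 5 + [2] * 5)
crop_idx = spatial_crop(coords, chain_ids, n_res=4, rng=rng)
```

In this sketch the contiguous crop always spends the whole budget when the complex is larger than $N_{\text{res}}$, while the spatial crop keeps residues from both sides of the interface in the toy example.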
3. Symmetry handling
Greedy heuristic for multi-chain permutation alignment
When computing losses for homomeric components, permutation symmetry becomes important. Because chains with identical sequences may appear in a different order in the prediction than in the ground truth, the authors use a greedy approach to find a good chain mapping.
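To make the idea of a greedy chain mapping concrete, here is a minimal sketch. It assumes the prediction and the ground truth have already been superimposed and that each chain is summarized by a single center coordinate; the function name `greedy_chain_mapping` and the use of center-to-center distance as the matching criterion are assumptions for illustration, not a description of the authors' exact procedure.

```python
import numpy as np

def greedy_chain_mapping(pred_centers, true_centers, entity_id):
    """Greedily map each predicted chain to an unused ground-truth chain.

    Only chains with the same entity_id (identical sequence) may be swapped.
    Returns a dict: predicted chain index -> ground-truth chain index.
    """
    used = set()
    mapping = {}
    for i in range(len(entity_id)):
        # Ground-truth chains of the same entity that are still unassigned.
        candidates = [
            j for j in range(len(entity_id))
            if entity_id[j] == entity_id[i] and j not in used
        ]
        dists = [np.linalg.norm(pred_centers[i] - true_centers[j]) for j in candidates]
        best = candidates[int(np.argmin(dists))]   # closest available chain
        mapping[i] = best
        used.add(best)
    return mapping

# A2B toy complex: the two A chains come out in swapped order in the prediction.
entity_id = [1, 1, 2]
true_centers = np.array([[0.0, 0.0, 0.0], [10.0, 0.0, 0.0], [0.0, 10.0, 0.0]])
pred_centers = np.array([[9.5, 0.0, 0.0], [0.5, 0.0, 0.0], [0.0, 10.2, 0.0]])
mapping = greedy_chain_mapping(pred_centers, true_centers, entity_id)
print(mapping)
```

A greedy assignment is not guaranteed to be globally optimal, but it avoids enumerating every permutation of identical chains, which grows factorially with the copy number.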
4. Loss
FAPE loss cutoff
intra-chain: 10Å (same as vanilla AF2)
inter-chain: 30Å (new in AF2-Multimer)
(new) chain center-of-mass loss term
Pushes the centers of mass of different chains apart when they are closer than in the ground truth (the error is clamped if it is -4Å or greater)
Goal: to prevent the model from predicting overlapping chains
(modified) clash loss
average, rather than sum
Goal: stabilize the loss (there may be many clashes when $N_{\text{cycle}}$ is small, due to black hole initialization)
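Two of the loss changes above can be sketched in a few lines. The first helper picks the FAPE clamp distance per residue pair from the 10Å and 30Å cutoffs listed above. The second is one plausible form of a chain center-of-mass term: the squared form, the `scale` of 1/20, the use of Cα coordinates to compute centers, and both function names are assumptions for illustration rather than details confirmed by this text.

```python
import numpy as np

def fape_clamp_distance(asym_id, intra=10.0, inter=30.0):
    """Per-pair FAPE clamp: `intra` within a chain, `inter` across chains (in Å)."""
    same_chain = asym_id[:, None] == asym_id[None, :]
    return np.where(same_chain, intra, inter)

def chain_center_of_mass_loss(pred_ca, true_ca, asym_id, margin=4.0, scale=1.0 / 20.0):
    """Penalize chain pairs whose centers are closer than in the ground truth.

    Assumed form: mean over chain pairs of (scale * min(d_pred - d_true + margin, 0))^2,
    so errors of -margin Å or greater contribute nothing.
    """
    chains = np.unique(asym_id)
    if len(chains) < 2:
        return 0.0                          # a single chain has no pairs to push apart
    pred_com = np.stack([pred_ca[asym_id == c].mean(axis=0) for c in chains])
    true_com = np.stack([true_ca[asym_id == c].mean(axis=0) for c in chains])
    pred_d = np.linalg.norm(pred_com[:, None] - pred_com[None], axis=-1)
    true_d = np.linalg.norm(true_com[:, None] - true_com[None], axis=-1)
    err = np.minimum(pred_d - true_d + margin, 0.0)   # clamp: only "too close" counts
    off_diag = ~np.eye(len(chains), dtype=bool)       # ignore self-pairs
    return float(np.mean((scale * err[off_diag]) ** 2))

asym_id = np.array([1, 1, 2, 2])
true_ca = np.array([[0.0, 0, 0], [2.0, 0, 0], [20.0, 0, 0], [22.0, 0, 0]])
overlapping = np.array([[0.0, 0, 0], [2.0, 0, 0], [0.0, 0, 0], [2.0, 0, 0]])

clamp = fape_clamp_distance(np.array([1, 1, 2]))
loss_exact = chain_center_of_mass_loss(true_ca, true_ca, asym_id)
loss_overlap = chain_center_of_mass_loss(overlapping, true_ca, asym_id)
```

With this form, a prediction that places the two chains on top of each other is penalized, while a prediction whose chain centers are no more than 4Å closer than the ground truth incurs no penalty.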
5. Architecture
Template stack
Swapped the order of attention and triangular multiplicative update layers
Changed the aggregation of template embeddings
Evoformer
Moved the outer product mean to the start of the Evoformer block

Results

Structures
Mean DockQ score
Confidence score (interface pTM) vs. DockQ score
Does AF2-Multimer predict individual chains better than AF2?

Discussion

Multi-chain version of AF2, with some modifications
Limitations
Antibody-antigen interactions are not modeled well yet.
Cannot model other biomolecules (nucleic acids, small molecules, ions, …)
e.g. 3L1P, 7XFA

Reference