
AlphaFold-Multimer Explained

Created
2024/06/02 07:13
Language
🇺🇸
Latest checked date
2024/10/18
Status
Done
Type
Article

Introduction

For background on protein folding, refer to Introduction to Protein Folding.

AlphaFold-Multimer (AF2-Multimer)

Multimeric structure prediction model

Major differences compared to vanilla AF2

1. Multi-chain featurization
Because AF2-Multimer handles more than one chain, the authors use three features to represent the multimeric state.
• asym_id: a unique integer per chain
• entity_id: a unique integer for each set of identical chains
• sym_id: a unique integer within a set of identical chains
Example: A3B2 stoichiometry (three copies of chain A and two copies of chain B)
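To make the three features concrete, here is a minimal sketch that derives them for an A3B2 complex. The function name `multimer_ids`, the toy sequences, and the 1-based numbering are assumptions made for illustration. In the real model these IDs are per-residue features, while this sketch assigns one value per chain.

```python
from collections import defaultdict

def multimer_ids(chain_sequences):
    """Return (asym_id, entity_id, sym_id), one value per chain (1-based, an assumption)."""
    entity_of_seq = {}               # sequence -> entity_id
    copies_seen = defaultdict(int)   # entity_id -> copies seen so far
    asym_id, entity_id, sym_id = [], [], []
    for i, seq in enumerate(chain_sequences, start=1):
        if seq not in entity_of_seq:
            entity_of_seq[seq] = len(entity_of_seq) + 1
        ent = entity_of_seq[seq]
        copies_seen[ent] += 1
        asym_id.append(i)                  # unique per chain
        entity_id.append(ent)              # shared by identical chains
        sym_id.append(copies_seen[ent])    # index within the identical set
    return asym_id, entity_id, sym_id

# Hypothetical toy sequences for an A3B2 complex.
seq_a, seq_b = "MKTAYIAK", "GSHMLEDP"
asym, ent, sym = multimer_ids([seq_a] * 3 + [seq_b] * 2)
print(asym)  # [1, 2, 3, 4, 5]
print(ent)   # [1, 1, 1, 2, 2]
print(sym)   # [1, 2, 3, 1, 2]
```

Together, entity_id and sym_id identify a chain just as uniquely as asym_id does, but they also expose which chains are interchangeable copies.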
2. Multi-chain cropping
Because the number of residues the model can handle is limited (by memory and computation), the authors apply two strategies to crop the structure.
• Per-chain contiguous cropping (in sequence space)
The chains are first randomly shuffled, and then a contiguous region is selected from each chain until the total number of residues reaches the $N_{\text{res}}$ budget.
This cropping aims to include more than a certain amount of each chain.
• Inter-chain spatial cropping (interface-biased, in structure space)
This cropping targets interface regions by keeping spatially nearest neighbors, with distances measured between Cα coordinates.
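The two cropping strategies can be sketched as follows. This is an illustrative sketch, not the authors' implementation: the function names, the sampling bounds for the crop size, the `interface_cutoff` value, and the fallback when no interface is found are all assumptions.

```python
import numpy as np

def contiguous_crop(chain_lengths, n_res, rng):
    """Return one boolean mask per chain; True marks residues kept in the crop."""
    order = rng.permutation(len(chain_lengths))      # random chain order
    masks = [np.zeros(n, dtype=bool) for n in chain_lengths]
    n_added = 0
    n_remaining = int(sum(chain_lengths))
    for k in order:
        n_k = int(chain_lengths[k])
        n_remaining -= n_k
        # Assumed bounds: never exceed the budget, and fill it when possible.
        crop_max = min(n_res - n_added, n_k)
        crop_min = min(n_k, max(0, n_res - (n_added + n_remaining)))
        crop_size = int(rng.integers(crop_min, crop_max + 1))
        start = int(rng.integers(0, n_k - crop_size + 1))
        masks[k][start:start + crop_size] = True     # one contiguous region
        n_added += crop_size
    return masks

def spatial_crop(ca_coords, asym_id, n_res, rng, interface_cutoff=10.0):
    """Return sorted indices of the n_res residues nearest to a random interface residue."""
    ca_coords = np.asarray(ca_coords, dtype=float)
    asym_id = np.asarray(asym_id)
    dist = np.linalg.norm(ca_coords[:, None, :] - ca_coords[None, :, :], axis=-1)
    other_chain = asym_id[:, None] != asym_id[None, :]
    # Interface residue (assumed definition): a C-alpha of another chain within the cutoff.
    is_interface = np.any(other_chain & (dist < interface_cutoff), axis=1)
    candidates = np.flatnonzero(is_interface)
    if candidates.size == 0:                         # fallback: any residue
        candidates = np.arange(len(ca_coords))
    center = rng.choice(candidates)
    return np.sort(np.argsort(dist[center])[:n_res])

rng = np.random.default_rng(0)
masks = contiguous_crop([50, 30, 40], n_res=64, rng=rng)
coords = rng.normal(scale=15.0, size=(120, 3))       # random toy coordinates
asym = np.repeat([1, 2, 3], [50, 30, 40])
kept = spatial_crop(coords, asym, n_res=64, rng=rng)
```

The contiguous crop preserves local sequence context within each chain, while the spatial crop is biased toward inter-chain contacts.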
3. Symmetry handling
• Greedy heuristic for multi-chain permutation alignment
Permutation symmetry matters when computing losses for homomeric components: chains with identical sequences are interchangeable, so the order of the predicted chains may differ from the order of the ground-truth chains. The authors use a greedy approach to find a good mapping between predicted and ground-truth chains.
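The idea of a greedy chain mapping can be sketched as below. This is a simplified illustration: it assumes the prediction and the ground truth are already superimposed and matches chains by the distance between their centers, and the function name and inputs are hypothetical.

```python
import numpy as np

def greedy_chain_mapping(pred_centers, true_centers, entity_id):
    """Map each predicted chain to an unused ground-truth chain of the same entity.

    Assumes both structures are already superimposed and that entity_id lists
    the entity of chain i in both the prediction and the ground truth.
    """
    pred_centers = np.asarray(pred_centers, dtype=float)
    true_centers = np.asarray(true_centers, dtype=float)
    used = set()
    mapping = {}
    for i in range(len(pred_centers)):
        # Only unused chains with an identical sequence are valid partners.
        candidates = [j for j in range(len(true_centers))
                      if entity_id[j] == entity_id[i] and j not in used]
        dists = [np.linalg.norm(pred_centers[i] - true_centers[j]) for j in candidates]
        best = candidates[int(np.argmin(dists))]     # greedy: take the closest
        mapping[i] = best
        used.add(best)
    return mapping

# A2B toy example in which the two copies of A are predicted in swapped order.
true_centers = [[0.0, 0.0, 0.0], [10.0, 0.0, 0.0], [5.0, 5.0, 0.0]]
pred_centers = [[10.0, 0.0, 0.0], [0.0, 0.0, 0.0], [5.0, 5.0, 0.0]]
mapping = greedy_chain_mapping(pred_centers, true_centers, entity_id=[1, 1, 2])
print(mapping)  # {0: 1, 1: 0, 2: 2}
```

A greedy assignment does not guarantee the optimal permutation, but it avoids enumerating every permutation of identical chains.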
4. Loss
• FAPE loss cutoff
◦ intra-chain: 10 Å (same as vanilla AF2)
◦ inter-chain: 30 Å (new in AF2-Multimer)
• (new) Chain center-of-mass loss term
Pushes different chains apart (the penalty is clamped when the error is -4 Å or greater)
Goal: prevent the model from predicting overlapping chains
• (modified) Clash loss
Averaged rather than summed
Goal: stabilize the loss, since there may be many clashes when $N_{\text{cycle}}$ is small (due to the black-hole initialization)
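The two cutoff rules above can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' code. It reads "error" as the predicted minus the true distance between chain centers and leaves out all normalization constants. The function names and the `margin` parameter are hypothetical.

```python
import numpy as np

def clamp_fape_errors(errors, frame_chain, atom_chain,
                      intra_cutoff=10.0, inter_cutoff=30.0):
    """Clamp per-(frame, atom) errors at 10 A within a chain and 30 A across chains."""
    errors = np.asarray(errors, dtype=float)
    same_chain = np.asarray(frame_chain)[:, None] == np.asarray(atom_chain)[None, :]
    cutoff = np.where(same_chain, intra_cutoff, inter_cutoff)
    return np.minimum(errors, cutoff)

def center_of_mass_penalty(pred_centers, true_centers, margin=4.0):
    """Penalize chain pairs predicted closer together than in the ground truth.

    Needs at least two chains. The penalty is zero when the distance error
    (predicted minus true) is -margin or greater.
    """
    pred = np.asarray(pred_centers, dtype=float)
    true = np.asarray(true_centers, dtype=float)
    d_pred = np.linalg.norm(pred[:, None] - pred[None, :], axis=-1)
    d_true = np.linalg.norm(true[:, None] - true[None, :], axis=-1)
    error = d_pred - d_true
    per_pair = np.minimum(error + margin, 0.0) ** 2
    off_diag = ~np.eye(len(pred), dtype=bool)        # skip self-pairs
    return float(per_pair[off_diag].mean())

clamped = clamp_fape_errors(np.full((2, 2), 50.0), [1, 2], [1, 2])
true_centers = [[0.0, 0.0, 0.0], [20.0, 0.0, 0.0]]
overlapping = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]     # both chains collapsed together
print(clamped)                                              # [[10. 30.] [30. 10.]]
print(center_of_mass_penalty(overlapping, true_centers))    # 256.0
print(center_of_mass_penalty(true_centers, true_centers))   # 0.0
```

The larger inter-chain cutoff lets large errors in the relative placement of chains still contribute gradient, while the margin tolerates small deviations in the distance between chain centers.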
5. Architecture
• Template stack
Swapped the order of the attention and triangular multiplicative update layers
Changed the aggregation of template embeddings
• Evoformer
Moved the outer product mean to the start of the Evoformer block

Results

Structures
Mean DockQ score
Confidence score (interface pTM) vs. DockQ score
Does AF2-Multimer predict individual chains better than AF2?

Discussion

• Multi-chain version of AF2, with some modifications
• Limitations
◦ Antibody-antigen interactions are not yet modeled well.
◦ Cannot model other biomolecules (nucleic acids, small molecules, ions, …)
e.g. 3L1P, 7XFA

Reference