Foldseek 4-645b789

@martin-steinegger martin-steinegger released this 28 Dec 14:12

Release at a glance: better hit ranking, critical bug fix, structure clustering, smaller database size and updated AlphaFold Databases.

Features

  • foldseek databases now offers the AlphaFoldDB v4 databases.
  • We have improved hit ranking in Foldseek by multiplying the 3Di/AA bit-score by the geometric mean of alignment LDDT and TMscore, resulting in more accurate rankings.
  • The --format-output prob parameter now returns the probability of homology.
  • The --format-mode 5 flag generates PDB files in which all target Cα atoms are superimposed onto the query structure based on the aligned coordinates.
  • We have added a faster computation for LDDT, available with the --format-output lddt,lddtfull flag. The lddt flag outputs the average LDDT score for all Cα, while the lddtfull flag outputs a string of LDDT scores for each Cα.
  • The --coord-store-mode 2 parameter enables lossless storage of Cα coordinates in a compressed format.
  • TMalign mode (--alignment-type 1) now uses the 3Di/AA alignment as a prefilter, which improves the precision and recall of TMalign and also makes TMalign mode much faster.
  • We have added support for reading in Foldcomp databases (see foldcomp.foldseek.com).
  • The database module now includes an option to download ESMAtlas30.
  • We have added support for easy-cluster, a tool to cluster structural datasets using 3Di/AA alignment, LDDT, and TMscore.
  • We have added support for profile searches as well as iterative searches using the --num-iterations flag.
  • TMalign results can now be sorted by qTM, tTM, min(qTM, tTM), max(qTM, tTM), and avg(qTM, tTM) using the --sort flag.
  • New module compressca: converts an uncompressed Cα database to compressed format.
  • New module convert2pdb: converts a Foldseek structure database to a multi-model PDB file.
  • We have added our PDB100 update pipeline to util/update_webserver_pdb.
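
The five TMalign sort criteria listed above differ only in how they combine the query-normalized (qTM) and target-normalized (tTM) TM-scores. The following is a minimal Python sketch of those five keys, not Foldseek's implementation. The key labels, the `sort_hits` helper, the example hits and the descending order are all assumptions made for illustration; the actual values accepted by --sort are not stated in these notes.

```python
# Hypothetical labels for the five criteria; not the values accepted by --sort.
SORT_KEYS = {
    "qTM": lambda q, t: q,
    "tTM": lambda q, t: t,
    "min": lambda q, t: min(q, t),
    "max": lambda q, t: max(q, t),
    "avg": lambda q, t: (q + t) / 2.0,
}


def sort_hits(hits, key_name="avg"):
    """Sort (name, qTM, tTM) tuples by the chosen key, best first (assumed order)."""
    key = SORT_KEYS[key_name]
    return sorted(hits, key=lambda h: key(h[1], h[2]), reverse=True)


# Made-up example hits: (target name, qTM, tTM).
hits = [("a", 0.90, 0.30), ("b", 0.60, 0.70), ("c", 0.50, 0.95)]

for name in SORT_KEYS:
    print(name, [h[0] for h in sort_hits(hits, name)])
```

The example shows that the choice of key can change the ranking: a hit that covers the query well but only a small part of a long target ranks high by qTM and low by min(qTM, tTM).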

Breaking changes

  • The 3Di/AA score reported by Foldseek is now bit-score * sqrt(alignment LDDT * alignment TMscore).
  • The default sort order for TMalign is now avg(qTM, tTM).
  • We no longer provide the "Alphafold/UniProt-NO-CA" database; Cα databases are now always required.
  • AlphaFoldDB Swiss-Prot and Proteome file names have changed. Downloads for these will stop working on Foldseek versions before this one. More generally, the Cα database format has changed and is incompatible with older Foldseek versions, so none of the v4 databases will work with previous versions.
  • The default E-value is now 10.
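
To make the new score concrete, here is a minimal Python sketch of the formula bit-score * sqrt(alignment LDDT * alignment TMscore) given above. It is not Foldseek's code: the function name `rescaled_score` and the example hits are hypothetical, and the numbers are made up to show how the rescaling can reorder hits.

```python
import math


def rescaled_score(bit_score: float, lddt: float, tm_score: float) -> float:
    """Bit-score multiplied by the geometric mean of alignment LDDT and TM-score."""
    return bit_score * math.sqrt(lddt * tm_score)


# Made-up hits: (target name, 3Di/AA bit-score, alignment LDDT, alignment TM-score).
hits = [
    ("target_a", 200.0, 0.50, 0.40),
    ("target_b", 180.0, 0.90, 0.80),
]

# Rank by the rescaled score, highest first.
ranked = sorted(hits, key=lambda h: rescaled_score(h[1], h[2], h[3]), reverse=True)

for name, bits, lddt, tm in ranked:
    print(name, round(rescaled_score(bits, lddt, tm), 2))
```

In this example target_b overtakes target_a despite its lower raw bit-score, because its structural agreement (LDDT and TM-score) is much higher. Scripts that filter on a fixed bit-score threshold may therefore need to be re-tuned after upgrading.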

Bug fixes

  • We have fixed an issue that resulted in the loss of high-scoring diagonals during the prefilter step.
  • The visualization has been fixed for cases where the alignment length is exactly 80.
  • We have fixed issues with tar inputs.