
Amino acid embedding and Convolutional Neural Network for HLA Class I-peptide binding prediction


wby920920/HLA-bind

 
 


HLA-CNN and HLA-Vec

Author: [email protected]

Usage: python main.py config.ini

Overview: The HLA-CNN tool makes binding predictions for HLA Class I peptides using convolutional neural networks and a distributed representation of amino acids, HLA-Vec. At a high level, the tool consists of (a) an unsupervised, distributed vector representation learner for raw peptide sequences, (b) a training mode to learn the weights of the classifier, (c) an evaluation mode to calculate Spearman's rank correlation coefficient (SRCC) and the area under the receiver operating characteristic curve (AUC), and (d) an inference mode to make predictions on new peptides.
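To make the two evaluation metrics in (c) concrete, here is a minimal pure-Python sketch of SRCC and AUC. It is an illustration only, not the code used by this repository; the function names are hypothetical, and in practice `scipy.stats.spearmanr` and `sklearn.metrics.roc_auc_score` compute the same quantities. The AUC function assumes the measurements have already been converted to binary labels (1 = binder, 0 = non-binder); how that conversion is done is not specified here.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_srcc(x, y):
    """Spearman's rho: the Pearson correlation of the two rank vectors.

    Undefined (division by zero) if either input is constant.
    """
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


def roc_auc(labels, scores):
    """AUC via the rank-sum identity: the probability that a randomly chosen
    positive scores higher than a randomly chosen negative (ties count 0.5).

    Undefined if there are no positives or no negatives.
    """
    ranks = average_ranks(scores)
    n_pos = sum(1 for label in labels if label == 1)
    n_neg = len(labels) - n_pos
    rank_sum_pos = sum(r for r, label in zip(ranks, labels) if label == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)


# Toy inputs, made up for illustration.
print(spearman_srcc([1, 2, 3, 4], [10, 20, 30, 40]))
print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

SRCC compares the ordering of predicted scores with the ordering of measured values, so it needs no threshold; AUC only needs the binary label of each peptide.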

Pipeline: The pipeline is specified in the config.ini file. This file is required: it sets the parameters used by the various learning algorithms as well as the input and output files and directories.

  • HLA-Vec: An unsupervised, distributed vector representation is learned from raw peptide sequences.
  • train: The classifier is trained from a set of labeled data (see the Supplementary Information of doi:10.1038/srep32115).
  • evaluate: Using a labeled test set, the trained models are evaluated in terms of SRCC and AUC.
  • inference: Given a set of new peptides, predictions are inferred and scores are written out to a result file.
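As a rough illustration of how stages like these could be switched on and off from an INI file, the sketch below parses a made-up config with Python's standard `configparser`. The section names, option names, and paths are all hypothetical and are not taken from this repository's config.ini; consult the shipped config.ini for the real keys. The sketch targets Python 3.

```python
import configparser  # Python 3 standard library (named ConfigParser on Python 2)

# Hypothetical config contents; every section, option, and path is illustrative only.
EXAMPLE_CONFIG = """
[Pipeline]
hla_vec = no
train = yes
evaluate = yes
inference = no

[Files]
train_file = data/train.txt
test_file = data/test.txt
"""

# The four stages described above, in pipeline order (option names are assumed).
STAGES = ("hla_vec", "train", "evaluate", "inference")


def enabled_stages(config_text):
    """Return the stages whose flag is set to a true value, in pipeline order."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    # A stage missing from the file is treated as disabled.
    return [s for s in STAGES if parser.getboolean("Pipeline", s, fallback=False)]


print(enabled_stages(EXAMPLE_CONFIG))
```

Keeping the stage flags in one section makes it possible to rerun, for example, only evaluation against an already trained model without editing code.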

Notes:

  • Test files are obtained from IEDB and are currently in the format it provides: http://tools.iedb.org/auto_bench/mhci/weekly/
  • Not all columns in the test files are required. The minimum set used by the code is Allele, Measurement_type, Peptide_seq, and Measurement_value (the last is not required in inference mode).
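The column requirement above can be checked before running the pipeline. The following sketch is not part of this repository: the function name is hypothetical, the tab delimiter is an assumption about the file format, and the sample rows (including the extra Date column) are made-up placeholders rather than real IEDB data. It targets Python 3.

```python
import csv
import io

# Column names as listed in the note above.
REQUIRED = ["Allele", "Measurement_type", "Peptide_seq"]
VALUE_COLUMN = "Measurement_value"  # only needed when labels are used


def load_records(handle, inference=False, delimiter="\t"):
    """Read a delimited test file and keep only the columns the code needs.

    Raises ValueError if a required column is absent. The tab delimiter
    is an assumption; pass another one if the file differs.
    """
    reader = csv.DictReader(handle, delimiter=delimiter)
    needed = REQUIRED if inference else REQUIRED + [VALUE_COLUMN]
    header = reader.fieldnames or []
    missing = [c for c in needed if c not in header]
    if missing:
        raise ValueError("missing required columns: " + ", ".join(missing))
    return [{c: row[c] for c in needed} for row in reader]


# Made-up placeholder rows, for illustration only.
SAMPLE = (
    "Date\tAllele\tMeasurement_type\tPeptide_seq\tMeasurement_value\n"
    "2017-01-01\tHLA-A*02:01\tic50\tAAAAAAAAA\t100.0\n"
    "2017-01-01\tHLA-A*02:01\tic50\tCCCCCCCCC\t20000.0\n"
)

for record in load_records(io.StringIO(SAMPLE)):
    print(record)
```

Passing `inference=True` drops the Measurement_value requirement, mirroring the note that measured values are not needed when only predictions are produced.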

License: This project is licensed under the MIT License - see the LICENSE.md file for details.

Requirements:

  • Python 2.7.13
  • numpy 1.11.3
  • pandas 0.19.2
  • scipy 0.18.1
  • scikit_learn 0.18.1
  • gensim 2.3.0
  • keras 2.0.6
  • theano 0.9.0

Reference: Vang, Y. S. and Xie, X. (2017) HLA class I binding prediction via convolutional neural networks. https://doi.org/10.1093/bioinformatics/btx264
