NEPTUNE #
Introduction #
NEPTUNE is a support vector machine-based classifier for tumor homing peptides (THPs) that is trained on the probabilistic information generated by the optimal baseline models. THPs are short peptides (3-30 amino acids) used for tumor diagnosis and therapeutic applications, such as tumor-site drug delivery. NEPTUNE is a stacked ensemble learning approach that improves THP prediction accuracy compared with existing methods (i.e., THpred, SCMTHP, and MIMML). It is a two-layer prediction model: in the first layer, a set of models called baseline models produces class probabilities; in the second layer, these probabilities are passed to a meta-model, which uses a stacking strategy to enhance prediction accuracy.
Dataset: #
- Main training set: 490 THPs and 490 non-THPs
- Small training set: 350 THPs and 350 non-THPs
- Main independent dataset: 161 THPs and 161 non-THPs
- Small independent dataset: 119 THPs and 119 non-THPs
Features: #
- Amino Acid Composition (AAC)
- Di-peptide Composition (DPC)
- Amino Acid Index (AAI)
- Amphiphilic pseudo-amino acid composition (APAAC)
- Composition transition and distribution (CTD)
- Pseudo-amino acid composition (PAAC)
- Physicochemical properties (PCP)
- Reduced protein sequences (RSs) based on acidity (RSacid)
- RSs based on charge (RScharge)
- RSs based on DHP (RSDHP)
- RSs based on polarity (RSpolar)
- RSs based on secondary structure (RSsecond)
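
To make the simplest encodings in the list above concrete, here is a minimal sketch of AAC, DPC, and a reduced sequence based on charge. It is an illustration only, not the NEPTUNE code: the function names, the example peptide, and in particular the three-class charge grouping are assumptions, since the exact residue groupings are not given here.

```python
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def aac(seq):
    """Amino acid composition: fraction of each residue (20 values)."""
    counts = Counter(seq)
    return [counts[a] / len(seq) for a in AMINO_ACIDS]


def dpc(seq):
    """Dipeptide composition: fraction of each adjacent pair (400 values)."""
    pairs = Counter(seq[i:i + 2] for i in range(len(seq) - 1))
    total = max(len(seq) - 1, 1)  # avoid division by zero for 1-residue input
    return [pairs[a + b] / total for a, b in product(AMINO_ACIDS, repeat=2)]


# ASSUMPTION: hypothetical grouping used only for illustration.
# "+" = positively charged, "-" = negatively charged, "0" = everything else.
CHARGE_GROUPS = {"K": "+", "R": "+", "D": "-", "E": "-"}


def reduce_by_charge(seq):
    """Rewrite a peptide in a reduced three-letter charge alphabet."""
    return "".join(CHARGE_GROUPS.get(a, "0") for a in seq)


peptide = "CGNKRTRGC"  # arbitrary 9-residue example sequence
aac_vector = aac(peptide)
dpc_vector = dpc(peptide)
reduced = reduce_by_charge(peptide)

print(len(aac_vector), len(dpc_vector), reduced)
```

Composition features such as AAC can then be computed on the reduced sequence as well, which is one common way of turning a reduced alphabet into a fixed-length vector.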
Baseline Machines: #
- Random Forest (RF)
- Support Vector Machine (SVM)
- Partial Least Squares (PLS)
- Logistic regression (LR)
- Extremely randomized trees (ET)
- K-nearest neighbor (k-NN)
Metamodel Machine: #
- Support vector machine (SVM)
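
The two-layer design (baseline models feeding probabilities to an SVM meta-model) can be sketched with scikit-learn's `StackingClassifier`. This is a rough approximation under several assumptions, not the published pipeline: the data are synthetic, all hyperparameters are placeholders, every baseline model sees the same feature matrix, and PLS is left out because scikit-learn exposes it as a regressor rather than a classifier.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    ExtraTreesClassifier,
    RandomForestClassifier,
    StackingClassifier,
)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

# ASSUMPTION: synthetic stand-in for peptide feature vectors (not NEPTUNE data).
X, y = make_classification(n_samples=300, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Layer one: baseline models that each output class probabilities.
baseline_models = [
    ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
    ("et", ExtraTreesClassifier(n_estimators=100, random_state=0)),
    ("svm", SVC(probability=True, random_state=0)),
    ("lr", LogisticRegression(max_iter=1000)),
    ("knn", KNeighborsClassifier(n_neighbors=5)),
]

# Layer two: an SVM meta-model trained on out-of-fold baseline probabilities.
stack = StackingClassifier(
    estimators=baseline_models,
    final_estimator=SVC(probability=True, random_state=0),
    stack_method="predict_proba",
    cv=5,
)
stack.fit(X_train, y_train)

proba = stack.predict_proba(X_test)      # per-class probabilities
accuracy = stack.score(X_test, y_test)   # fraction of correct predictions
print(proba.shape, round(accuracy, 3))
```

Using `cv=5` means the meta-model is fitted on probabilities predicted for held-out folds, which keeps it from simply learning the baseline models' training-set overfitting.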
Feature Importance #
- SHapley Additive exPlanations (SHAP)
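
SHAP attributes a prediction to individual features using Shapley values. As a minimal sketch of that underlying idea (not the `shap` library's implementation, and not NEPTUNE's analysis), the code below computes exact Shapley values for a tiny model by enumerating feature coalitions. The function `shapley_values`, the toy scoring function, and the all-zero baseline are assumptions made for illustration.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict at x, relative to a baseline input.

    Features outside a coalition are replaced by their baseline values.
    The cost grows as 2**n, so this only suits a handful of features.
    """
    n = len(x)

    def value(coalition):
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


# ASSUMPTION: hypothetical scoring function standing in for a trained model.
def toy_model(z):
    return 0.5 * z[0] + 0.2 * z[1] * z[2]


x = [1.0, 1.0, 1.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, x, baseline)
print(phi)
```

In this toy case the additive term is credited entirely to the first feature, while the interaction term is split evenly between the other two, and the attributions sum to the difference between the prediction at `x` and at the baseline.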
Webserver: #
Reference : #
- Charoenkwan, P., et al., NEPTUNE: A novel computational approach for accurate and large-scale identification of tumor homing peptides. Computers in Biology and Medicine, 2022: p. 105700.