PepFun2 #

Introduction #

Converting text into numerical data is a prerequisite for machine learning. Biomolecular sequences such as DNA, RNA and proteins must therefore be transformed into numerical features that can be passed on to machine learning models for prediction. PepFun2 is a recent method that calculates mathematical descriptors for any given peptide sequence, whether or not it contains modified (non-natural) amino acids.
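As a minimal sketch of what turning a sequence into numerical features means, the example below computes the amino acid composition of a peptide as a 20-element vector. It uses only the Python standard library and is not part of the PepFun2 API; the names `AMINO_ACIDS` and `composition_features` are hypothetical and chosen for illustration only.

```python
from collections import Counter

# The 20 natural amino acids in one-letter code; this fixes the feature order.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition_features(sequence):
    """Return the fraction of each natural amino acid in the sequence."""
    sequence = sequence.upper()
    unknown = set(sequence) - set(AMINO_ACIDS)
    if not sequence or unknown:
        raise ValueError(
            f"empty sequence or unsupported residues: {sorted(unknown)}"
        )
    counts = Counter(sequence)
    # One value per amino acid, always in the same order.
    return [counts[aa] / len(sequence) for aa in AMINO_ACIDS]


features = composition_features("KMGRLFR")
print(dict(zip(AMINO_ACIDS, features)))
```

Each peptide, whatever its length, maps to a vector of the same size, which is the form most machine learning libraries expect. Descriptors for non-natural amino acids need a richer representation than this one-letter alphabet.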

Features #

1) From amino acid sequence

a. Alignments (global and local)

b. Prediction of properties

c. Empirical rules

d. Peptide library analysis

2) Conformer

a. Prediction of conformer (basic)

b. Assignment of secondary structure restraints

3) Interactions

a. Detection of hydrogen bonds (with graph visuals)

b. Non-bonded contacts with thresholds

4) Modifications

a. Filling/adding missing residues

b. Adding capping groups

c. Mutation of NNAAs

5) Sequence with Non-natural AAs

a. Alignments with Non-natural AAs

b. SMILES to AAs (for natural sequences)

Proposed set of rules for selecting peptides for experiments #

  1. Warning if the number of charged and/or of hydrophobic amino acids exceeds 45%.

  2. Warning if the absolute total peptide charge at pH 7 is more than +1.

  3. Warning if the number of glycines or prolines is more than one in the sequence.

  4. Warning if the first or the last amino acid is charged.

  5. Warning if any amino acid represents more than 25% of the total sequence.

  6. Warning if two prolines are consecutive.

  7. Warning if the motifs DG (aspartic acid and glycine) and DP (aspartic acid and proline) are present in the sequence. Two rules, one per motif.

  8. Warning if the sequence ends with an asparagine (N) or glutamine (Q) residue.

  9. Warning if there are charged residues every five amino acids.

  10. Warning if there are oxidation-sensitive amino acids like methionine (M), cysteine (C) or tryptophan (W). Three rules, one per amino acid.
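Several of these rules reduce to simple string checks. The sketch below implements rules 4, 5, 6, 7, 8 and 10 for natural one-letter sequences in plain Python. It is not the PepFun2 implementation: the function name `check_rules`, the warning messages, and the choice of D, E, K and R as the charged residues are assumptions made for illustration. The rules whose exact definition depends on further choices (hydrophobic set, net charge calculation, glycine/proline counting, charged-residue spacing) are left out.

```python
from collections import Counter

# Assumption: D, E, K and R count as charged at pH 7 (histidine is excluded).
CHARGED = set("DEKR")


def check_rules(sequence):
    """Return a list of warning strings for a natural one-letter sequence."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    warnings = []

    # Rule 4: the first or the last residue is charged.
    if seq[0] in CHARGED or seq[-1] in CHARGED:
        warnings.append("charged terminal residue")

    # Rule 5: any single amino acid makes up more than 25% of the sequence.
    for aa, count in sorted(Counter(seq).items()):
        if count / len(seq) > 0.25:
            warnings.append(f"{aa} exceeds 25% of the sequence")

    # Rule 6: two consecutive prolines.
    if "PP" in seq:
        warnings.append("consecutive prolines")

    # Rule 7: DG and DP motifs, one warning per motif.
    for motif in ("DG", "DP"):
        if motif in seq:
            warnings.append(f"{motif} motif present")

    # Rule 8: the sequence ends with N or Q.
    if seq[-1] in "NQ":
        warnings.append("ends with N or Q")

    # Rule 10: oxidation-sensitive residues, one warning per amino acid.
    for aa in "MCW":
        if aa in seq:
            warnings.append(f"oxidation-sensitive residue {aa}")

    return warnings


for message in check_rules("KMDGPPAAAQ"):
    print(message)
```

Returning a list of messages instead of a single pass/fail flag keeps each rule visible, so a peptide library can be filtered by the number or type of warnings.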

Availability #

• Project name: PepFun (version 2.0)

• Project home page:

• Operating system(s): Linux

• Programming language: Python 3

• Other requirements: RDKit 2020 or higher; Biopython 1.79 or higher; Modeller 10.3 or higher.

• License: MIT

Execution: #

The complete list of Python scripts, with detailed documentation, is provided in the project repository.

Reference: #

'PepFun: Open Source Protocols for Peptide-Related Computational Analysis', Molecules, 2021. Link: