Abstract
In bioinformatics, machine learning methods have been used to predict features embedded in the sequences. In contrast to what is generally assumed, machine learning approaches can also provide new insights into the underlying biology. Here, we demonstrate this by presenting TargetP 2.0, a novel state-of-the-art method to identify N-terminal sorting signals, which direct proteins to the secretory pathway, mitochondria, and chloroplasts or other plastids. By examining the strongest signals from the attention layer in the network, we find that the second residue in the protein, that is, the one following the initial methionine, has a strong influence on the classification. We observe that two-thirds of chloroplast and thylakoid transit peptides have an alanine in position 2, compared with 20% in other plant proteins. We also note that in fungi and single-celled eukaryotes, less than 30% of the targeting peptides have an amino acid that allows the removal of the N-terminal methionine compared with 60% for the proteins without targeting peptide. The importance of this feature for predictions has not been highlighted before.
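The position-2 statistics quoted in the abstract can be illustrated with a minimal Python sketch. This is not the TargetP 2.0 code. The function names, the toy sequences, and the set of residues assumed to permit removal of the N-terminal methionine (small residues A, C, G, P, S, T, V) are assumptions made for illustration; the abstract itself does not list them.

```python
from typing import Iterable, Optional

# Assumption: residues with small side chains that are commonly reported to
# allow N-terminal methionine excision. The abstract does not enumerate them.
MET_REMOVAL_RESIDUES = frozenset("ACGPSTV")


def second_residue(seq: str) -> Optional[str]:
    """Return the residue following the initial methionine, or None
    if the sequence is too short or does not start with 'M'."""
    seq = seq.strip().upper()
    if len(seq) < 2 or seq[0] != "M":
        return None
    return seq[1]


def fraction_with_second_residue(seqs: Iterable[str], allowed: Iterable[str]) -> float:
    """Fraction of usable sequences whose position-2 residue is in `allowed`."""
    allowed_set = frozenset(allowed)
    residues = [r for r in map(second_residue, seqs) if r is not None]
    if not residues:
        return 0.0
    return sum(r in allowed_set for r in residues) / len(residues)


# Hypothetical toy sequences, not taken from the paper's data set.
toy_peptides = ["MASTL", "MAAKR", "MKLLV", "MSTQ"]
alanine_fraction = fraction_with_second_residue(toy_peptides, "A")
met_removal_fraction = fraction_with_second_residue(toy_peptides, MET_REMOVAL_RESIDUES)
```

Applied separately to transit peptides and to other proteins, the same counting would give the kind of comparison the abstract reports, provided the residue set above matches the definition used by the authors.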
Original language | English |
---|---|
Article number | 201900429 |
Journal | Life Science Alliance |
Volume | 2 |
Issue number | 5 |
Number of pages | 14 |
ISSN | 2575-1077 |
DOIs | https://doi.org/10.26508/lsa.201900429 |
Publication status | Published - 1 Jan 2019 |
Cite this
Armenteros, J. J. A., Salvatore, M., Emanuelsson, O., Winther, O., Von Heijne, G., Elofsson, A., & Nielsen, H. (2019). Detecting sequence signals in targeting peptides using deep learning. Life Science Alliance, 2(5), Article 201900429. https://doi.org/10.26508/lsa.201900429
Armenteros, Jose Juan Almagro ; Salvatore, Marco ; Emanuelsson, Olof et al. / Detecting sequence signals in targeting peptides using deep learning. In: Life Science Alliance. 2019 ; Vol. 2, No. 5.
@article{909568c1cf73473597b9c066b6b4cee0,
title = "Detecting sequence signals in targeting peptides using deep learning",
abstract = "In bioinformatics, machine learning methods have been used to predict features embedded in the sequences. In contrast to what is generally assumed, machine learning approaches can also provide new insights into the underlying biology. Here, we demonstrate this by presenting TargetP 2.0, a novel state-of-the-art method to identify N-terminal sorting signals, which direct proteins to the secretory pathway, mitochondria, and chloroplasts or other plastids. By examining the strongest signals from the attention layer in the network, we find that the second residue in the protein, that is, the one following the initial methionine, has a strong influence on the classification. We observe that two-thirds of chloroplast and thylakoid transit peptides have an alanine in position 2, compared with 20% in other plant proteins. We also note that in fungi and single-celled eukaryotes, less than 30% of the targeting peptides have an amino acid that allows the removal of the N-terminal methionine compared with 60% for the proteins without targeting peptide. The importance of this feature for predictions has not been highlighted before.",
author = "Armenteros, {Jose Juan Almagro} and Marco Salvatore and Olof Emanuelsson and Ole Winther and {Von Heijne}, Gunnar and Arne Elofsson and Henrik Nielsen",
year = "2019",
month = jan,
day = "1",
doi = "10.26508/lsa.201900429",
language = "English",
volume = "2",
journal = "Life Science Alliance",
issn = "2575-1077",
publisher = "Life Science Alliance",
number = "5",
}
Armenteros, JJA, Salvatore, M, Emanuelsson, O, Winther, O, Von Heijne, G, Elofsson, A & Nielsen, H 2019, 'Detecting sequence signals in targeting peptides using deep learning', Life Science Alliance, vol. 2, no. 5, 201900429. https://doi.org/10.26508/lsa.201900429
Detecting sequence signals in targeting peptides using deep learning. / Armenteros, Jose Juan Almagro; Salvatore, Marco; Emanuelsson, Olof et al.
In: Life Science Alliance, Vol. 2, No. 5, 201900429, 01.01.2019.
Research output: Contribution to journal › Journal article › Research › peer-review
TY - JOUR
T1 - Detecting sequence signals in targeting peptides using deep learning
AU - Armenteros, Jose Juan Almagro
AU - Salvatore, Marco
AU - Emanuelsson, Olof
AU - Winther, Ole
AU - Von Heijne, Gunnar
AU - Elofsson, Arne
AU - Nielsen, Henrik
PY - 2019/1/1
Y1 - 2019/1/1
AB - In bioinformatics, machine learning methods have been used to predict features embedded in the sequences. In contrast to what is generally assumed, machine learning approaches can also provide new insights into the underlying biology. Here, we demonstrate this by presenting TargetP 2.0, a novel state-of-the-art method to identify N-terminal sorting signals, which direct proteins to the secretory pathway, mitochondria, and chloroplasts or other plastids. By examining the strongest signals from the attention layer in the network, we find that the second residue in the protein, that is, the one following the initial methionine, has a strong influence on the classification. We observe that two-thirds of chloroplast and thylakoid transit peptides have an alanine in position 2, compared with 20% in other plant proteins. We also note that in fungi and single-celled eukaryotes, less than 30% of the targeting peptides have an amino acid that allows the removal of the N-terminal methionine compared with 60% for the proteins without targeting peptide. The importance of this feature for predictions has not been highlighted before.
U2 - 10.26508/lsa.201900429
DO - 10.26508/lsa.201900429
M3 - Journal article
C2 - 31570514
AN - SCOPUS:85072779066
SN - 2575-1077
VL - 2
JO - Life Science Alliance
JF - Life Science Alliance
IS - 5
M1 - 201900429
ER -
Armenteros JJA, Salvatore M, Emanuelsson O, Winther O, Von Heijne G, Elofsson A et al. Detecting sequence signals in targeting peptides using deep learning. Life Science Alliance. 2019 Jan 1;2(5):201900429. doi: 10.26508/lsa.201900429