Ensemble of rankers for efficient gene signature extraction in smoke exposure classification

Mario Rosario Guarracino
2018-01-01

Abstract

Background: System toxicology aims to understand the mechanisms by which biological systems respond to toxicants. This understanding can be leveraged to assess the risk that chemicals, drugs, and consumer products pose to living organisms. In system toxicology, machine learning techniques are applied to develop prediction models that classify the toxicant exposure of biological systems. Gene expression data (RNA/DNA microarray) are often used to develop such prediction models.

Results: The outcome of the present work is an experimental methodology for developing prediction models, based on robust gene signatures, for the classification of cigarette smoke exposure and cessation in humans. It results from our participation in the recent sbv IMPROVER SysTox Computational Challenge. By merging different gene selection techniques, we obtain robust gene signatures, and we investigate the prediction capabilities of different off-the-shelf machine learning techniques, such as artificial neural networks, linear models, and support vector machines. Our signature also includes six novel genes, which we believe should be investigated further as biomarkers of tobacco smoke exposure.

Conclusions: The proposed methodology yields gene signatures that achieve top-ranked prediction performance with the investigated classification methods, as well as new discoveries in gene signatures for biomarkers of smoke exposure in humans. © 2018 The Author(s).
Files in this item:
File: 2018BMCBioC.pdf
Access: open access
Description: Journal article
Type: Publisher's version (PDF)
License: Creative Commons
Size: 2.12 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11580/84967
Citations
  • PMC 2
  • Scopus 5