Stacking and chaining of normalization methods in deep learning-based classification of colorectal cancer using gut microbiome data

Mulenga, Mwenge and Kareem, Sameem Abdul and Sabri, Aznul Qalid Md and Seera, Manjeevan (2021) Stacking and chaining of normalization methods in deep learning-based classification of colorectal cancer using gut microbiome data. IEEE Access, 9. pp. 97296-97319. ISSN 2169-3536, DOI

Full text not available from this repository.


Machine learning (ML)-based detection of diseases from sequence-based gut microbiome data has attracted great interest within the artificial intelligence in medicine (AIM) community. The approach offers a non-invasive, stool sample-based alternative for colorectal cancer (CRC) detection. Given the limitations of existing CRC detection methods, medical research has shown interest in using high-throughput data to identify the disease. Because conventional ML algorithms have several limitations, deep learning (DL) methods are becoming more popular, owing to their outstanding performance in related fields. However, the performance of DL methods is affected by limitations inherent in microbiome data, such as dimensionality, sparsity, and feature dominance. This research proposes stacking and chaining of normalization methods to address these limitations. The stacking technique offers a robust, easy-to-use, and interpretable alternative for augmenting microbiome and other tabular data, while the chaining technique is an alternative to data normalization that dynamically adjusts the underlying properties of the data towards the normal distribution. The proposed techniques are combined with rank transformation and feature selection to further improve model performance, yielding area under the curve (AUC) values between 0.857 and 0.987 on publicly available datasets.
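
The abstract does not spell out how stacking and chaining are implemented, so the following is only a minimal sketch under stated assumptions, not the authors' method. It assumes that "chaining" means feeding the output of one normalization method into the next, and that "stacking" means augmenting the data by concatenating differently normalized copies of the same samples row-wise, with labels repeated. The helper names (`log1p_transform`, `min_max`, `z_score`, `chain`, `stack`) and the toy count table are hypothetical and chosen for illustration.

```python
import math
from statistics import mean, pstdev


def log1p_transform(rows):
    """Apply log(1 + x) element-wise; a common choice for sparse count data."""
    return [[math.log1p(v) for v in row] for row in rows]


def min_max(rows):
    """Scale each feature (column) to [0, 1]; constant columns map to 0."""
    cols = list(zip(*rows))
    lo = [min(c) for c in cols]
    hi = [max(c) for c in cols]
    return [
        [(v - l) / (h - l) if h > l else 0.0 for v, l, h in zip(row, lo, hi)]
        for row in rows
    ]


def z_score(rows):
    """Standardize each feature to zero mean and unit (population) variance."""
    cols = list(zip(*rows))
    mu = [mean(c) for c in cols]
    sd = [pstdev(c) for c in cols]
    return [
        [(v - m) / s if s > 0 else 0.0 for v, m, s in zip(row, mu, sd)]
        for row in rows
    ]


def chain(rows, methods):
    """ASSUMED meaning of chaining: apply the methods one after another."""
    for method in methods:
        rows = method(rows)
    return rows


def stack(rows, labels, methods):
    """ASSUMED meaning of stacking: concatenate one normalized copy of the
    data per method (row-wise), repeating the labels for each copy."""
    stacked_rows, stacked_labels = [], []
    for method in methods:
        stacked_rows.extend(method(rows))
        stacked_labels.extend(labels)
    return stacked_rows, stacked_labels


# Hypothetical toy abundance table: 4 samples x 3 taxa, with binary labels.
counts = [[0, 12, 300], [5, 0, 150], [1, 40, 0], [0, 3, 900]]
labels = [1, 0, 0, 1]

chained = chain(counts, [log1p_transform, z_score])
X_aug, y_aug = stack(counts, labels, [min_max, z_score])
```

Under these assumptions, `chained` keeps the original number of samples, while `X_aug` holds one normalized copy of every sample per method, so two methods double the training set.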

Item Type: Article
Funders: Malaysia's Ministry of Higher Education through the Research Grant by the University of Malaya, under the Trans-Discipline Research Grant Scheme (TR001D-2018A)
Uncontrolled Keywords: Data models; Feature extraction; Stacking; Cancer; Prediction algorithms; Classification algorithms; Sensitivity; Deep neural network; Colorectal cancer; Microbiome; Normalization; Augmentation; Chaining
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
T Technology > TA Engineering (General). Civil engineering (General)
Divisions: Faculty of Computer Science & Information Technology
Depositing User: Ms. Juhaida Abd Rahim
Date Deposited: 13 Apr 2022 06:51
Last Modified: 13 Apr 2022 06:51
