Identification of significant features and data mining techniques in predicting heart disease

Amin, Mohammad Shafenoor and Chiam, Yin Kia and Varathan, Kasturi Dewi (2019) Identification of significant features and data mining techniques in predicting heart disease. Telematics and Informatics, 36. pp. 82-93. ISSN 0736-5853, DOI

Full text not available from this repository.

Cardiovascular disease is one of the leading causes of morbidity and mortality worldwide. Predicting cardiovascular disease is regarded as one of the most important subjects in clinical data analysis. The healthcare industry generates a huge amount of data, and data mining turns this large collection of raw healthcare data into information that can support informed decisions and predictions. Several existing studies have applied data mining techniques to heart disease prediction. Nonetheless, few studies have paid attention to the significant features that play a vital role in predicting cardiovascular disease. It is crucial to select the correct combination of significant features to improve the performance of the prediction models. This research aims to identify significant features and data mining techniques that can improve the accuracy of predicting cardiovascular disease. Prediction models were developed using different combinations of features and seven classification techniques: k-NN, Decision Tree, Naïve Bayes, Logistic Regression (LR), Support Vector Machine (SVM), Neural Network and Vote (a hybrid technique combining Naïve Bayes and Logistic Regression). Experimental results show that the heart disease prediction model developed using the identified significant features and the best-performing data mining technique (i.e. Vote) achieves an accuracy of 87.4% in heart disease prediction.
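
The two ideas named in the abstract, trying different feature combinations and combining two classifiers by voting, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not state how Vote combines its base models, so the sketch assumes the probability estimates are averaged (soft voting). The function names, feature names, probabilities and the 0.5 threshold are all hypothetical.

```python
from itertools import combinations


def soft_vote(prob_estimates):
    """Average the positive-class probabilities from several base models.

    Assumption: the ensemble combines its members by averaging
    probabilities; the abstract does not specify the combination rule.
    """
    if not prob_estimates:
        raise ValueError("need at least one probability estimate")
    return sum(prob_estimates) / len(prob_estimates)


def predict(prob_estimates, threshold=0.5):
    """Return 1 (disease) if the averaged probability reaches the threshold."""
    return 1 if soft_vote(prob_estimates) >= threshold else 0


def feature_subsets(features, min_size=1):
    """Yield every combination of features with at least min_size members."""
    for size in range(min_size, len(features) + 1):
        yield from combinations(features, size)


# Hypothetical probabilities from the two base models for one patient.
nb_prob = 0.75  # made-up Naive Bayes estimate
lr_prob = 0.5   # made-up Logistic Regression estimate
label = predict([nb_prob, lr_prob])

# Hypothetical feature names; each subset would get its own model.
candidates = list(feature_subsets(["age", "sex", "chest_pain"], min_size=2))
```

In a full experiment, each subset in `candidates` would be used to train and evaluate every classifier, and the subset and technique with the best accuracy would be kept. Exhaustive enumeration grows exponentially with the number of features, so it is only practical for small feature sets.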

Item Type: Article
Funders: University of Malaya Research Grant (UMRG), Project Code: RP028C-14HTM, Ministry of Education Malaysia (Higher Education)’s Fundamental Research Grant Scheme (FRGS), Project Code: FP057-2017A
Uncontrolled Keywords: Data mining; Prediction model; Classification algorithms; Feature selection; Heart disease prediction
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Q Science > QA Mathematics > QA76 Computer software
Divisions: Faculty of Computer Science & Information Technology
Depositing User: Ms. Juhaida Abd Rahim
Date Deposited: 23 Jan 2019 04:34
Last Modified: 23 Jan 2019 04:34
