Determination of the Distribution Pattern of Mortality Using Data Mining Technique in Golestan Province since 2007 to 2009

Bagheri, Fatemeh

Volume 3, Issue 2 (10-2015) Jorjani Biomed J 2015, 3(2): 65-79

Bagheri F. Determination of the Distribution Pattern of Mortality Using Data Mining Technique in Golestan Province since 2007 to 2009. Jorjani Biomed J 2015; 3(2): 65-79
URL: http://goums.ac.ir/jorjanijournal/article-1-408-en.html

Fatemeh Bagheri¹

1- Computer Engineering Department, Golestan University, Gorgan, Iran, f.bagheri@gu.ac.ir

Abstract:

Background and objectives: Investigating mortality in a population is considered one of the appropriate methods for assessing its health. However, there are problems such as a lack of confidence in measurement accuracy and in the quality of data collection. The establishment of death registration systems, the use of International Classification of Diseases codes, and the integration of mortality data by the responsible organizations have solved a large part of these problems. In this study, based on a set of parameters, the study population was divided into two groups: those deceased under one year of age (infants) and those over one year (adults). Both groups were then clustered using the K-means method to identify distinct subgroups. Hidden models and useful patterns were also discovered using decision tree algorithms. Finally, a neural network algorithm was used to rank the attributes in order of importance.

Methods: In this research, data on 12,865 individuals who died in Golestan province from 2007 to 2009 were studied. The data were obtained from the Health Center of Golestan province. The main characteristics used were the age of the deceased, gender, cause of death, place of residence, and place of death. The K-means algorithm was used to cluster the data. Decision tree algorithms and a neural network algorithm were used for classification. Finally, results and rules were extracted. Because causes of death differ in nature between infants and adults, the two groups were analyzed separately.
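
The clustering step can be illustrated with a minimal K-means sketch in plain Python. This is not the study's implementation: the function names, the toy numeric records, and the fixed random seed are all assumptions made for illustration, and categorical attributes such as gender or place of death would first need a numeric encoding that the abstract does not describe.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=100, seed=0):
    """Cluster `points` (a list of numeric tuples) into k groups.

    Returns (labels, centroids). Hypothetical helper, not the paper's code.
    """
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # pick k distinct records as starting centroids
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each record joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
    return labels, centroids


# Toy records (assumed values): two well-separated groups in two dimensions.
records = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (10.0, 10.0), (10.1, 9.9), (9.9, 10.2)]
labels, centroids = kmeans(records, k=2)
```

On real mortality records the attributes would also need scaling, since K-means relies on Euclidean distance and an unscaled attribute such as age could dominate the others.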

Results: In the clustering phase, the optimal number of clusters was determined with the Dunn index: eight clusters for infants and seven for adults. Among the four decision tree algorithms (C5.0, QUEST, CHAID and CART), C5.0 was the best classifier, with correct classification rates of 77.37% on the infant data and 96.86% on the adult data. Age, gender and place of death were the most important variables identified by the neural network algorithm.
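
As a rough illustration of how the Dunn index scores a clustering, the sketch below computes one common variant: the smallest distance between points in different clusters divided by the largest distance between points in the same cluster. The abstract does not say which variant was used, so this definition, the function name, and the toy data are assumptions. A higher value indicates more compact, better separated clusters, so candidate cluster counts can be compared by their scores.

```python
import math
from itertools import combinations


def dunn_index(points, labels):
    """Dunn index: min inter-cluster distance / max intra-cluster diameter.

    Assumes at least two clusters. Hypothetical helper, not the paper's code.
    """
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)

    # Largest distance between two points that share a cluster.
    max_diameter = max(
        (math.dist(a, b)
         for members in clusters.values()
         for a, b in combinations(members, 2)),
        default=0.0,
    )
    # Smallest distance between two points in different clusters.
    min_separation = min(
        math.dist(a, b)
        for ca, cb in combinations(clusters.values(), 2)
        for a in ca
        for b in cb
    )
    if max_diameter == 0.0:  # every cluster is a single point
        return math.inf
    return min_separation / max_diameter


# Toy records (assumed values) with a sensible and a poor labeling.
records = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (10.0, 10.0), (10.1, 9.9), (9.9, 10.2)]
good = dunn_index(records, [0, 0, 0, 1, 1, 1])
poor = dunn_index(records, [0, 1, 0, 1, 0, 1])
```

Here the sensible labeling scores far higher than the poor one, which is the property used when selecting the number of clusters.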

Conclusion: In the present study, the collected mortality data were clustered by considering the effective factors and the International Classification of Diseases standard. Hidden patterns of mortality for infants and adults were extracted. Because decision tree algorithms are explicit and intelligible, the results and extracted rules are very useful for specialists in this field.

Keywords: Data Mining, Clustering, Classification, Decision Tree, Mortality

Editorial: Original article | Subject: General medicine
Received: 2016/03/19 | Accepted: 2016/03/19 | Published: 2016/03/19