A Comparative Study of Multilayer Neural Network and C4.5 Decision Tree Models for Predicting the Risk of Breast Cancer

Soolmaz Sohrabi (1), Alireza Atashi (2), Ali Dadashi (3), Sina Marashi (4)
(1) Shahid Beheshti University of Medical Sciences, Department of Medical Informatics, Tehran, Iran
(2) Department of E-Health, Virtual School, Tehran University of Medical Sciences, Tehran, Iran AND Cancer Informatics Department, Breast Cancer Research Center, ACECR, Iran
(3) Mashhad University of Medical Sciences, Department of Medical Informatics, Mashhad, Iran
(4) Department of E-Health, Virtual School, Tehran University of Medical Sciences, Tehran, Iran

Abstract

Background: Diagnosing breast cancer at an early stage can have a great impact on cancer mortality. One of the fundamental problems in cancer treatment is the lack of a proper method for early detection, which may lead to diagnostic errors. Using data analysis techniques can significantly help in early diagnosis of the disease. The purpose of this study was to evaluate and compare the efficacy of two data mining techniques, i.e., multilayer neural network and C4.5, in early diagnosis of breast cancer.
Methods: A data set from the breast cancer research clinic of Motamed Cancer Institute, Tehran, containing 2860 records related to breast cancer risk factors, was used. Of these records, 1141 (40%) were related to malignant changes and breast cancer and 1719 (60%) to benign tumors. The data set was split into a training data set (70%) and a testing data set (30%) using RapidMiner 5.2 and analyzed using perceptron neural network and decision tree algorithms.
Results: For the neural network, accuracy was 80.52%, precision 88.91%, and sensitivity 90.88%; for the decision tree, accuracy was 80.98%, precision 80.97%, and sensitivity 89.32%. The results indicated that both algorithms have acceptable capabilities for analyzing breast cancer data.
Conclusion: Although both models provided good results, the neural network showed more reliable diagnosis for positive cases. The type of data set and the analysis method affect the results. In addition, information about more powerful risk factors for breast cancer, such as genetic mutations, could provide models with high coverage.
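
The evaluation protocol described in the abstract (a 70%/30% split followed by accuracy, precision, and sensitivity) can be illustrated with a minimal Python sketch. This is not the authors' workflow, which was carried out in RapidMiner; the function names `split_70_30` and `evaluate`, the random seed, and the toy labels are assumptions made purely for illustration, and no values from the study are reproduced here.

```python
import random


def split_70_30(records, seed=0):
    """Shuffle records and split them into 70% training and 30% testing.

    Hypothetical helper: the seed and the plain (non-stratified) shuffle
    are assumptions, not details reported in the abstract.
    """
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) * 7 // 10  # integer arithmetic avoids float rounding
    return shuffled[:cut], shuffled[cut:]


def evaluate(y_true, y_pred, positive=1):
    """Return accuracy, precision, and sensitivity (recall) for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Toy usage: 2860 placeholder record IDs, matching only the data set size.
train, test = split_70_30(range(2860))
print(len(train), len(test))

# Made-up labels (1 = malignant, 0 = benign) to exercise the metric code.
toy_true = [1, 1, 1, 0, 0, 0, 0, 1]
toy_pred = [1, 1, 0, 0, 0, 1, 0, 1]
print(evaluate(toy_true, toy_pred))
```

Sensitivity here is the fraction of truly malignant cases that a model flags, which is the quantity the conclusion refers to when it speaks of reliable diagnosis for positive cases.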


Authors

Soolmaz Sohrabi
Alireza Atashi
smatashi@yahoo.com (Primary Contact)
Ali Dadashi
Sina Marashi
Sohrabi S, Atashi A, Dadashi A, Marashi S. A Comparative Study of Multilayer Neural Network and C4.5 Decision Tree Models for Predicting the Risk of Breast Cancer. Arch Breast Cancer [Internet]. 2018 Mar. 29 [cited 2024 Jul. 27];5(1):11-4. Available from: https://archbreastcancer.com/index.php/abc/article/view/141
