Rough Set Theory for Dimension Reduction On Machine Learning Algorithm
Keywords: Core and Reduct, Dimension Reduction, Machine Learning, Machine Learning Metrics, Rough Set Theory
Dimension reduction is a technique applied in machine learning to significantly improve computational efficiency. Using a large number of variables in a dataset is expected to provide more information for analysis; however, it also increases computational time and load roughly linearly. Dimension reduction transforms high-dimensional data into a much lower dimension without significantly reducing the information and characteristics of the original data. Core and Reduct is a method derived from Rough Set Theory. A dataset serving as the input and output of a machine learning model can be viewed as an information system. The objective of this research is to determine the impact of applying dimension reduction to machine learning algorithms in terms of computational time and load. Core and Reduct is applied to several popular machine learning methods: Support Vector Machine (SVM), Logistic Regression, and K-Nearest Neighbors (KNN). The experiments use five UCI machine learning datasets: Iris, Seeds, Years, Sonar, and Hill-Valley. Machine learning metrics such as Accuracy, Recall, Precision, and F1-Score are also observed and compared. The research concludes that Core and Reduct can decrease computational time by up to 80% while maintaining the value of each evaluation metric.
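The reduct idea described above can be sketched in code. The following is a minimal, illustrative implementation (not the authors' actual code) of the standard greedy QuickReduct procedure from rough set theory: it partitions objects by their indiscernibility relation on a set of condition attributes, measures the dependency degree of the decision attribute on that set, and greedily adds the attribute that raises the dependency the most until it equals the dependency of the full attribute set. All function names here are illustrative assumptions.

```python
from collections import defaultdict

def partition(rows, attrs):
    """Group row indices by their values on the given attribute indices,
    i.e. the equivalence classes of the indiscernibility relation IND(attrs)."""
    blocks = defaultdict(list)
    for i, row in enumerate(rows):
        blocks[tuple(row[a] for a in attrs)].append(i)
    return list(blocks.values())

def dependency(rows, labels, attrs):
    """Dependency degree gamma(attrs -> decision): fraction of objects whose
    attrs-equivalence class is consistent (single decision label)."""
    if not attrs:
        return 0.0
    pos = sum(len(block) for block in partition(rows, attrs)
              if len({labels[i] for i in block}) == 1)
    return pos / len(rows)

def quick_reduct(rows, labels):
    """Greedy QuickReduct: repeatedly add the attribute giving the largest
    increase in dependency degree until it matches the full attribute set."""
    all_attrs = list(range(len(rows[0])))
    target = dependency(rows, labels, all_attrs)
    reduct = []
    while dependency(rows, labels, reduct) < target:
        best = max((a for a in all_attrs if a not in reduct),
                   key=lambda a: dependency(rows, labels, reduct + [a]))
        reduct.append(best)
    return reduct

# Toy information system: attribute 0 alone determines the decision,
# so the reduct should contain only attribute 0.
rows = [(0, 1), (0, 0), (1, 1), (1, 0)]
labels = [0, 0, 1, 1]
print(quick_reduct(rows, labels))  # → [0]
```

The reduced attribute set returned here would then be used to project the dataset before training SVM, Logistic Regression, or KNN, which is where the reported computational-time savings come from.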
Copyright (c) 2021 Rani Nuraeni, Sugiyarto Surono
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.