K-MEANS AS A FEATURE EXTRACTOR FOR DATA CLASSIFICATION WITH THE SUPPORT VECTOR MACHINE (SVM) ALGORITHM

Nurul Chamidah

doi:10.24176/simet.v9i2.2433

Abstract

High feature dimensionality makes classification computationally expensive, so a feature extraction step is needed to reduce the dimensionality by keeping only the important information in the features. This study uses the K-Means algorithm to extract features by finding the hidden patterns of each class, which are then reconstructed with a fuzzy membership function to obtain new patterns. The new patterns serve as abstract features and are split into training and test data. Training uses the Support Vector Machine (SVM) algorithm to obtain a classification model. The resulting SVM model is then evaluated on the test data to measure classification performance in terms of accuracy and computation time. With 5-fold cross validation, the method gives good accuracy on the Liver, Breast Cancer, and Heart Disease datasets obtained from the UCI Machine Learning Repository. This study demonstrates the ability of K-Means to extract features from a dataset. The results show that K-Means as a feature extractor can reduce computation time.
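
The feature extraction step described in the abstract can be illustrated with a minimal sketch. It is not the author's implementation: the abstract does not state which fuzzy membership function is used, so the sketch assumes the fuzzy c-means membership with fuzzifier m = 2 (normalized inverse squared distance). The function names `kmeans`, `fit_class_centroids`, and `membership_features`, the parameter `k_per_class`, and the toy data are all hypothetical.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's k-means on a list of vectors; returns k centroids."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        # Assignment step: attach every point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            j = min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            clusters[j].append(p)
        # Update step: move each centroid to the mean of its cluster.
        new = []
        for j in range(k):
            if clusters[j]:
                n = len(clusters[j])
                new.append([sum(col) / n for col in zip(*clusters[j])])
            else:
                new.append(centroids[j])  # keep an empty cluster's centroid
        if new == centroids:
            break
        centroids = new
    return centroids


def fit_class_centroids(X, y, k_per_class):
    """Run k-means separately inside each class and pool the centroids."""
    centroids = []
    for label in sorted(set(y)):
        members = [x for x, t in zip(X, y) if t == label]
        centroids.extend(kmeans(members, k_per_class))
    return centroids


def membership_features(x, centroids, eps=1e-12):
    """Assumed membership function: fuzzy c-means with m = 2.

    Returns one value per centroid; the values sum to 1.
    """
    inv = [1.0 / (sq_dist(x, c) + eps) for c in centroids]
    total = sum(inv)
    return [v / total for v in inv]


# Toy data (hypothetical): 4 original features, 2 classes.
X = [[1.0, 2.0, 1.5, 0.5], [1.2, 1.8, 1.4, 0.6], [0.9, 2.1, 1.6, 0.4],
     [5.0, 6.0, 5.5, 4.5], [5.2, 5.8, 5.4, 4.6], [4.9, 6.1, 5.6, 4.4]]
y = [0, 0, 0, 1, 1, 1]

centroids = fit_class_centroids(X, y, k_per_class=1)
new_features = [membership_features(x, centroids) for x in X]
print(len(X[0]), "->", len(new_features[0]), "features per sample")
```

In this sketch each sample is reduced from its original dimensionality to one membership value per centroid, and those vectors would then be passed to an SVM trainer. In a cross-validation setting the centroids should presumably be fitted on the training folds only, so that the test fold does not influence the extracted features.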

Keywords

extraction; feature; classification; k-means; SVM; accuracy
