PERFORMANCE VALIDATION OF PRIOR QUANTIZATION TECHNIQUES IN OUTLIERS CLASSIFICATION USING WDBC DATASET
DOI:
https://doi.org/10.29121/ijetmr.v5.i4.2018.207
Keywords:
Data Mining, Classification, Outlier Detection, Feature Selection, Wisconsin Diagnosis Breast Cancer (WDBC)
Abstract
Data mining is the process of analyzing very large volumes of data and summarizing them into useful knowledge. The use of data mining approaches is growing quickly; classification techniques in particular are an efficient way of categorizing data, which is important in the decision-making process of medical practitioners. This study presents quantization and validation (OQV) techniques for fast outlier detection in large WDBC data sets. The use of distance metrics makes the algorithm linear in the number of objects and allows the data to be processed by sequential scanning. The inclusion of a direct quantization technique and explicit cluster discovery keeps the method simple and economical. A comparative analysis of the proposed OQV techniques against triangular boundary-based classification and Weighing-based Feature Selection and Monotonic Classification (WFSMC), in terms of accuracy, precision, recall and the number of attributes, indicates the effectiveness of OQV for large datasets.
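The abstract does not give the details of OQV, so the following is only a minimal Python sketch of the general idea it describes: quantize each object, scan the data sequentially, and validate the resulting outlier labels with accuracy, precision and recall. The grid-cell quantizer, the occupancy threshold, the function names (`quantize`, `flag_outliers`, `evaluate`) and the toy data are all assumptions made for illustration; they are not the authors' algorithm and the numbers are not WDBC results.

```python
from collections import Counter
from math import floor


def quantize(point, cell_width):
    """Map a numeric feature vector to the integer grid cell that contains it."""
    return tuple(floor(value / cell_width) for value in point)


def flag_outliers(points, cell_width=1.0, min_count=2):
    """Flag points whose grid cell holds fewer than min_count points.

    Assumed stand-in for a quantization-based detector: one sequential
    scan counts cell occupancy, a second labels each point, so the cost
    grows linearly with the number of points.
    """
    cells = [quantize(p, cell_width) for p in points]
    occupancy = Counter(cells)
    return [occupancy[c] < min_count for c in cells]


def evaluate(predicted, actual):
    """Return (accuracy, precision, recall) for boolean outlier labels."""
    pairs = list(zip(predicted, actual))
    tp = sum(p and a for p, a in pairs)
    fp = sum(p and not a for p, a in pairs)
    fn = sum(a and not p for p, a in pairs)
    tn = sum(not p and not a for p, a in pairs)
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall


# Toy data (made up, not WDBC): a tight cluster plus one distant point.
points = [(0.1, 0.2), (0.3, 0.4), (0.5, 0.1), (0.2, 0.9), (7.5, 8.2)]
actual = [False, False, False, False, True]
predicted = flag_outliers(points, cell_width=1.0, min_count=2)
accuracy, precision, recall = evaluate(predicted, actual)
```

In this sketch the cell width and the occupancy threshold jointly decide what counts as an outlier, so on real data both would need tuning against labelled examples before the three scores mean anything.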
Copyright
Authors who publish with International Journal of Engineering Technologies and Management Research agree to the following terms:
- Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution License (CC BY-SA 4.0) that allows others to share the work with an acknowledgment of the work's authorship and initial publication in this journal.
- Authors can enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgment of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) before and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work.