This paper introduces a methodology aimed at validating anomalies identified through unsupervised techniques. Our approach is grounded in the assumption that machine learning models perform optimally when trained on data free of anomalies. Therefore, to assess the veracity of anomalies pinpointed by an unsupervised method, we undertake a two-fold process. Initially, anomalies are removed from the training dataset. Subsequently, we gauge the expected enhancement in the performance of classification models. To evaluate the effectiveness of our methodology, we employed three well-established unsupervised anomaly detection techniques: Local Outlier Factor (LOF), Isolation Forest (iForest), and Autoencoders. These methods were complemented by a voting system designed to identify anomalous data records. The reliability of these detected anomalies was rigorously tested using various classification models, including K-Nearest Neighbors, Logistic Regression, Decision Tree, Random Forest, AdaBoost, and Support Vector Classifier (SVC). This evaluation was conducted both before and after the removal of anomalies from the training dataset. Our methodology underwent rigorous testing across five distinct datasets: Breast Cancer, German Credit, Diabetes, Heart Failure Disease, and Titanic Survivor. The results provide evidence of its effectiveness, with notable improvements observed in the classification performance across four of the five datasets.
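The validation pipeline described in the abstract can be illustrated with a minimal sketch in Python using scikit-learn. This is not the authors' implementation: it uses only two of the three detectors (LOF and iForest, omitting the autoencoder for brevity), a simple unanimous vote, and a single classifier (Random Forest) on the Breast Cancer dataset; all parameter choices here are illustrative assumptions.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import IsolationForest, RandomForestClassifier
from sklearn.neighbors import LocalOutlierFactor
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# One of the five benchmark datasets mentioned in the abstract.
X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# Step 1: flag anomalies in the training set with unsupervised detectors.
lof_labels = LocalOutlierFactor(n_neighbors=20).fit_predict(X_tr)   # -1 = anomaly
ifo_labels = IsolationForest(random_state=0).fit_predict(X_tr)      # -1 = anomaly

# Step 2: voting system — here, keep only records flagged by no detector.
votes = (lof_labels == -1).astype(int) + (ifo_labels == -1).astype(int)
mask = votes == 0

# Step 3: compare classification performance before vs. after removal.
clf_before = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)
clf_after = RandomForestClassifier(random_state=0).fit(X_tr[mask], y_tr[mask])
acc_before = accuracy_score(y_te, clf_before.predict(X_te))
acc_after = accuracy_score(y_te, clf_after.predict(X_te))
print(f"accuracy before: {acc_before:.3f}, after removal: {acc_after:.3f}")
```

Under the paper's working assumption, an improvement in held-out accuracy after removal is evidence that the flagged records were genuine anomalies; no improvement suggests the detections were spurious.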
Diego-Ernesto Herrera-Malambo, Andy Domínguez-Monterroza, Alberto Patino Vanegas (2023). A proposed method to validate anomalies detected with unsupervised models. pp. 1-6. DOI: 10.1109/c358072.2023.10436300.
Type: Article
Year: 2023
Authors: 3
Datasets: 0
Total Files: 0
Language: English
DOI: 10.1109/c358072.2023.10436300