Raw Data Library
About
Aims and ScopeAdvisory Board Members
More
Who We Are?
User Guide
Green Science
​
​
Sign inGet started
​
​

About
Aims and ScopeAdvisory Board Members
More
Who We Are?
User GuideGreen Science

Sign inGet started
RDL logo

Verified research datasets. Instant access. Built for collaboration.

Navigation

About

Aims and Scope

Advisory Board Members

More

Who We Are?

Add Raw Data

User Guide

Legal

Privacy Policy

Terms of Service

Support

Got an issue? Email us directly.

Email: info@rawdatalibrary.netOpen Mail App
​
​

© 2025 Raw Data Library. All rights reserved.
PrivacyTerms
  1. Raw Data Library
  2. /
  3. Publications
  4. /
  5. An alternative approach to dimension reduction for pareto distributed data: a case study

Verified authors • Institutional access • DOI aware
50,000+ researchers120,000+ datasets90% satisfaction
Article
English
2021

An alternative approach to dimension reduction for pareto distributed data: a case study

0 Datasets

0 Files

English
2021
Journal Of Big Data
Vol 8 (1)
DOI: 10.1186/s40537-021-00428-8

Get instant academic access to this publication’s datasets.

Create free accountHow it works

Frequently asked questions

Is access really free for academics and students?

Yes. After verification, you can browse and download datasets at no cost. Some premium assets may require author approval.

How is my data protected?

Files are stored on encrypted storage. Access is restricted to verified users and all downloads are logged.

Can I request additional materials?

Yes, message the author after sign-up to request supplementary files or replication code.

Advance your research today

Join 50,000+ researchers worldwide. Get instant access to peer-reviewed datasets, advanced analytics, and global collaboration tools.

Get free academic accessLearn more
✓ Immediate verification • ✓ Free institutional access • ✓ Global collaboration
Access Research Data

Join our academic network to download verified datasets and collaborate with researchers worldwide.

Get Free Access
Institutional SSO
Secure
This PDF is not available in different languages.
No localized PDFs are currently available.
Silvia Mirri
Silvia Mirri

Institution not specified

Verified
Marco Roccetti
Giovanni Delnevo
Luca Casini
+1 more

Abstract

Deep learning models are tools for data analysis suitable for approximating (non-linear) relationships among variables for the best prediction of an outcome. While these models can be used to answer many important questions, their utility is still harshly criticized, being extremely challenging to identify which data descriptors are the most adequate to represent a given specific phenomenon of interest. With a recent experience in the development of a deep learning model designed to detect failures in mechanical water meter devices, we have learnt that a sensible deterioration of the prediction accuracy can occur if one tries to train a deep learning model by adding specific device descriptors, based on categorical data. This can happen because of an excessive increase in the dimensions of the data, with a correspondent loss of statistical significance. After several unsuccessful experiments conducted with alternative methodologies that either permit to reduce the data space dimensionality or employ more traditional machine learning algorithms, we changed the training strategy, reconsidering that categorical data, in the light of a Pareto analysis . In essence, we used those categorical descriptors, not as an input on which to train our deep learning model, but as a tool to give a new shape to the dataset, based on the Pareto rule. With this data adjustment, we trained a more performative deep learning model able to detect defective water meter devices with a prediction accuracy in the range 87–90%, even in the presence of categorical descriptors.

How to cite this publication

Marco Roccetti, Giovanni Delnevo, Luca Casini, Silvia Mirri (2021). An alternative approach to dimension reduction for pareto distributed data: a case study. Journal Of Big Data, 8(1), DOI: 10.1186/s40537-021-00428-8.

Related publications

Why join Raw Data Library?

Quality

Datasets shared by verified academics with rich metadata and previews.

Control

Authors choose access levels; downloads are logged for transparency.

Free for Academia

Students and faculty get instant access after verification.

Publication Details

Type

Article

Year

2021

Authors

4

Datasets

0

Total Files

0

Language

English

Journal

Journal Of Big Data

DOI

10.1186/s40537-021-00428-8

Join Research Community

Access datasets from 50,000+ researchers worldwide with institutional verification.

Get Free Access