0 Datasets
0 Files
Get instant academic access to this publication’s datasets.
Yes. After verification, you can browse and download datasets at no cost. Some premium assets may require author approval.
Files are stored on encrypted storage. Access is restricted to verified users and all downloads are logged.
Yes, message the author after sign-up to request supplementary files or replication code.
Join 50,000+ researchers worldwide. Get instant access to peer-reviewed datasets, advanced analytics, and global collaboration tools.
✓ Immediate verification • ✓ Free institutional access • ✓ Global collaborationJoin our academic network to download verified datasets and collaborate with researchers worldwide.
Get Free AccessAbstract Type 2 diabetes (T2D) is a global health burden that will benefit from personalised risk prediction and targeted prevention programmes. Omics data have enabled more detailed risk prediction; however, most studies have focussed on directly on the ability of DNA variants predicting T2D onset with less attention given to epigenetic regulation and glycaemic trait variability. By applying machine learning to the longitudinal Northern Finland Birth Cohort 1966 (NFBC 1966) at 31 (T1) and 46 (T2) years old, we predicted fasting glucose (FG) and insulin (FI), glycated haemoglobin (HbA1c) and 2-hour glucose and insulin from oral glucose tolerance test (2hGlu, 2hIns) at T2 in 513 individuals from 1,001 variables at T1 and T2, including anthropometric, metabolic, metabolomic and epigenetic variables. We further tested whether the information obtained by the machine learning models in NFBC could be used to predict glycaemic traits in the independent French study with 48 matching predictors (DESIR, N=769, age range 30-65 years at recruitment, interval between data collections: 9 years). In this study, FG and FI were best predicted, with average R 2 values of 0.38 and 0.53. Sex, branched-chain and aromatic amino acids, HDL-cholesterol, glycerol, ketone bodies, blood pressure at T2 and measurements of adiposity at T1, as well as multiple methylation marks at both time points were amongst the top predictors. In the validation analysis, we reached R 2 values of 0.41/0.55 for FG/FI when trained and tested in NFBC1966 and 0.17/0.30 when trained in NFBC1966 and tested in DESIR. We identified clinically relevant sets of predictors from a large multi-omics dataset and highlighted the potential of methylation markers and longitudinal changes in prediction.
Laurie Prélot, Harmen H. M. Draisma, Mila Desi Anasanti, Zhanna Balkhiyarova, Matthias Wielscher, Loïc Yengo, Beverley Balkau, Ronan Roussel, Sylvain Sebért, Mika Ala‐Korpela, Philippe Froguel, Paul M Ridker, Marika Kaakinen, Inga Prokopenko (2018). Machine Learning in Multi-Omics Data to Assess Longitudinal Predictors of Glycaemic Health. , DOI: https://doi.org/10.1101/358390.
Datasets shared by verified academics with rich metadata and previews.
Authors choose access levels; downloads are logged for transparency.
Students and faculty get instant access after verification.
Type
Preprint
Year
2018
Authors
14
Datasets
0
Total Files
0
Language
en
DOI
https://doi.org/10.1101/358390
Access datasets from 50,000+ researchers worldwide with institutional verification.
Get Free Access