Reference standardised framework for PGS methods comparison

Author

Aoife McMahon

Published

December 12, 2024

Modified

December 17, 2024

If you want to compare PGS methods, we recommend following a reference-standardised approach to ensure the comparison is fair. To assess which PGS development method truly performs best, the raw ingredients and the way the scores are assessed should be identical across methods. In the context of PGS, the raw ingredients are the source genetic variants associated with the phenotype in question, to which the method is applied (usually population-level summary statistics from a genome-wide association study (GWAS)). In terms of assessment, all the target populations should be genotyped to a similar degree and the phenotypes should be similarly defined. Typically, the input variants from the GWAS and the loci in the target genomes are restricted to SNPs that are commonly present in genotyping data (after imputation), for example variants within the HapMap3 reference data. This is termed a reference-standardised approach (Pain et al. 2021).
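The variant restriction described above can be illustrated with a minimal sketch. It assumes variants are identified by rsID in a column named `SNP`. The toy summary statistics, the ID sets and the helper `restrict_to_reference` are all hypothetical, and this is not how GenoPred or prspipe implement the step.

```python
import csv
import io


def restrict_to_reference(sumstats_rows, allowed_ids, id_field="SNP"):
    """Keep only summary-statistic rows whose variant ID is in allowed_ids."""
    allowed_ids = set(allowed_ids)
    return [row for row in sumstats_rows if row[id_field] in allowed_ids]


# Toy GWAS summary statistics (hypothetical values, tab-separated).
SUMSTATS_TSV = (
    "SNP\tA1\tA2\tBETA\tP\n"
    "rs1\tA\tG\t0.02\t0.01\n"
    "rs2\tC\tT\t-0.01\t0.20\n"
    "rs3\tG\tA\t0.05\t0.0001\n"
)

# Hypothetical variant ID sets: the reference panel (e.g. a HapMap3 list)
# and the variants available in the imputed target genotypes.
reference_ids = {"rs1", "rs3", "rs4"}
target_ids = {"rs1", "rs2", "rs3"}

rows = list(csv.DictReader(io.StringIO(SUMSTATS_TSV), delimiter="\t"))

# Every PGS method under comparison receives the same variants:
# those present in both the reference set and the target data.
shared_ids = reference_ids & target_ids
kept = restrict_to_reference(rows, shared_ids)
kept_ids = [row["SNP"] for row in kept]
print(kept_ids)
```

Matching on rsID alone keeps the sketch short. A real analysis would normally also need to check that alleles agree between the GWAS, the reference and the target data.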

Publicly available pipelines such as GenoPred and prspipe implement reference-standardised approaches (Pain, Al-Chalabi, and Lewis 2024; Monti et al. 2024).

If you wish to compare the same traits as shown here, the full list of traits and the GWAS summary statistics used from the GWAS Catalog is shown in Table 1 of Monti et al. (2024).

The list of HapMap3-1KG variants used to construct the polygenic scores is available here.

We encourage you to submit the results of your evaluation (scoring files, performance metrics and metadata) to the PGS Catalog.

References

Monti, Remo, Lisa Eick, Georgi Hudjashov, Kristi Läll, Stavroula Kanoni, Brooke N Wolford, Benjamin Wingfield, et al. 2024. “Evaluation of Polygenic Scoring Methods in Five Biobanks Shows Larger Variation Between Biobanks Than Methods and Finds Benefits of Ensemble Learning.” The American Journal of Human Genetics 111 (7): 1431–47.
Pain, Oliver, Ammar Al-Chalabi, and Cathryn M Lewis. 2024. “The GenoPred Pipeline: A Comprehensive and Scalable Pipeline for Polygenic Scoring.” Bioinformatics 40 (10): btae551.
Pain, Oliver, Kylie P Glanville, Saskia P Hagenaars, Saskia Selzam, Anna E Fürtjes, Héléna A Gaspar, Jonathan RI Coleman, et al. 2021. “Evaluation of Polygenic Prediction Methodology Within a Reference-Standardized Framework.” PLoS Genetics 17 (5): e1009021.