Home page for Paul McKeigue

Professor of Genetic Epidemiology and Statistical Genetics, University of Edinburgh
also Honorary Consultant in Public Health, NHS Lothian

This is my personal web page. For the Usher Institute, information about research is on this page, information about taught postgraduate courses is on this page, and information about PhDs is on this page. Larger files, including software packages and public datasets, can be found on my research group’s home page on my server.

Previous posts

  • 2004-2007

Professor of Genetic Epidemiology
Conway Institute
University College Dublin

  • 1990-2004

Senior Lecturer, then Reader, then Professor of Metabolic and Genetic Epidemiology
London School of Hygiene & Tropical Medicine

  • 1988-1990

British Heart Foundation Research Fellow
Department of Community Medicine
University College London Medical School

  • 1983-1988

Wellcome Training Fellow in Clinical Epidemiology
1983-84 in Department of Epidemiology, London School of Hygiene & Tropical Medicine
1984-88 in Department of Community Medicine, University College London

Contact details

Usher Institute of Population Health Sciences and Informatics
University of Edinburgh Medical School,
Teviot Place, Edinburgh EH8 9AG

Phone +44 131 650 4556

Public key encryption

If you need to send me confidential material by email, use my PGP public key, which you can obtain by searching for my email address on a PGP key server such as https://pgp.mit.edu. The fingerprint of the key is

683E 7E3B E8B3 83BB 8F80 363A A034 3F3B B2D6 769A

For best practice, you should confirm the key fingerprint with me in person or by video link before using it.
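
When comparing the fingerprint reported by your PGP software with the one published above, differences in spacing, colons or letter case do not matter, but every hex digit must agree. The following is a minimal sketch of such a comparison in Python; the function names are hypothetical and chosen for illustration, and the check does not replace confirming the fingerprint with me directly.

```python
# Illustrative helper (hypothetical names): compare a fingerprint reported by
# your PGP software with the fingerprint published on this page.

PUBLISHED = "683E 7E3B E8B3 83BB 8F80 363A A034 3F3B B2D6 769A"

HEX_DIGITS = set("0123456789ABCDEF")


def normalize_fingerprint(text: str) -> str:
    """Remove whitespace and colons, upper-case, and check for 40 hex digits."""
    cleaned = "".join(ch for ch in text if not ch.isspace() and ch != ":").upper()
    if len(cleaned) != 40 or any(ch not in HEX_DIGITS for ch in cleaned):
        raise ValueError("expected a fingerprint of 40 hexadecimal digits")
    return cleaned


def fingerprints_match(published: str, reported: str) -> bool:
    """Return True only if both fingerprints have identical hex digits."""
    return normalize_fingerprint(published) == normalize_fingerprint(reported)


if __name__ == "__main__":
    # Paste the fingerprint shown by your own software in place of this string.
    reported = "683e7e3be8b383bb8f80363aa0343f3bb2d6769a"
    print("match" if fingerprints_match(PUBLISHED, reported) else "MISMATCH")
```

A mismatch in even one digit means the key should not be used.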

If you need to transfer large data files securely, I can set up an SFTP account for you on my server. You will need to use SSH public key authentication. Instructions for setting this up on a Windows PC are here.

Research profile

My research focuses on methods for molecular and genetic epidemiology, with applications in clinical prediction and personalized medicine. This work draws on Bayesian and computationally intensive statistical methods, and on machine learning methods for constructing predictors. I work closely with Helen Colhoun’s research group at the Centre for Genomic and Experimental Medicine. This collaboration includes the development of an analysis platform based on deidentified electronic health records and the use of this platform to study drug safety and complications of diabetes.

My group’s current research includes

Recent publications

  1. McKeigue P. Sample size requirements for learning to classify with high-dimensional biomarker panels. Stat Methods Med Res. 2017 Jan 1:962280217738807. doi: 10.1177/0962280217738807. [Epub ahead of print] PubMed PMID: 29179643.

  2. Pirastu N, Joshi PK, de Vries PS, Cornelis MC, McKeigue PM, Keum N, Franceschini N, Colombo M, Giovannucci EL, Spiliopoulou A, Franke L, North KE, Kraft P, Morrison AC, Esko T, Wilson JF. GWAS for male-pattern baldness identifies 71 susceptibility loci explaining 38% of the risk. Nat Commun. 2017 Nov 17;8(1):1584. doi: 10.1038/s41467-017-01490-8. PubMed PMID: 29146897; PubMed Central PMCID: PMC5691155.

  3. Bermingham ML, Colombo M, McGurnaghan SJ, Blackbourn LAK, Vučković F, Pučić Baković M, Trbojević-Akmačić I, Lauc G, Agakov F, Agakova AS, Hayward C, Klarić L, Palmer CNA, Petrie JR, Chalmers J, Collier A, Green F, Lindsay RS, Macrury S, McKnight JA, Patrick AW, Thekkepat S, Gornik O, McKeigue PM, Colhoun HM; SDRN Type 1 Bioresource Investigators. N-Glycan Profile and Kidney Disease in Type 1 Diabetes. Diabetes Care. 2017 Nov 16. pii: dc171042. doi: 10.2337/dc17-1042. [Epub ahead of print] PubMed PMID: 29146600.

  4. Farran B, McGurnaghan S, Looker HC, Livingstone S, Lahnsteiner E, Colhoun HM, McKeigue PM. Modelling cumulative exposure for inference about drug effects in observational studies. Pharmacoepidemiol Drug Saf. 2017 Oct 12. doi:10.1002/pds.4327. [Epub ahead of print] PubMed PMID: 29024286.

  5. Bell S, Farran B, McGurnaghan S, McCrimmon RJ, Leese GP, Petrie JR, McKeigue P, Sattar N, Wild S, McKnight J, Lindsay R, Colhoun HM, Looker H. Risk of acute kidney injury and survival in patients treated with Metformin: an observational cohort study. BMC Nephrol. 2017 May 19;18(1):163. doi: 10.1186/s12882-017-0579-5. PubMed PMID: 28526011; PubMed Central PMCID: PMC5437411.

  6. Spiliopoulou A, Colombo M, Orchard P, Agakov F, McKeigue P. GeneImp: Fast Imputation to Large Reference Panels Using Genotype Likelihoods from Ultralow Coverage Sequencing. Genetics. 2017 May;206(1):91-104. doi: 10.1534/genetics.117.200063. Epub 2017 Mar 27. PubMed PMID: 28348060; PubMed Central PMCID: PMC5419496.

  7. Quell JD, Römisch-Margl W, Colombo M, Krumsiek J, Evans AM, Mohney R, Salomaa V, de Faire U, Groop LC, Agakov F, Looker HC, McKeigue P, Colhoun HM, Kastenmüller G. Automated pathway and reaction prediction facilitates in silico identification of unknown metabolites in human cohort studies. J Chromatogr B Analyt Technol Biomed Life Sci. 2017 Apr 4. pii: S1570-0232(17)30568-8. doi: 10.1016/j.jchromb.2017.04.002. [Epub ahead of print] PubMed PMID: 28479069.

  8. Sandholm N, Van Zuydam N, Ahlqvist E, Juliusdottir T, Deshmukh HA, Rayner NW, Di Camillo B, Forsblom C, Fadista J, Ziemek D, Salem RM, Hiraki LT, Pezzolesi M, Trégouët D, Dahlström E, Valo E, Oskolkov N, Ladenvall C, Marcovecchio ML, Cooper J, Sambo F, Malovini A, Manfrini M, McKnight AJ, Lajer M, Harjutsalo V, Gordin D, Parkkonen M; FinnDiane Study Group, Jaakko Tuomilehto, Lyssenko V, McKeigue PM, Rich SS, Brosnan MJ, Fauman E, Bellazzi R, Rossing P, Hadjadj S, Krolewski A, Paterson AD; DCCT/EDIC Study Group, Jose C. Florez, Hirschhorn JN, Maxwell AP; GENIE Consortium, David Dunger, Cobelli C, Colhoun HM, Groop L, McCarthy MI, Groop PH; SUMMIT Consortium. The Genetic Landscape of Renal Complications in Type 1 Diabetes. J Am Soc Nephrol. 2017 Feb;28(2):557-574. doi: 10.1681/ASN.2016020231. Epub 2016 Sep 19. PubMed PMID: 27647854; PubMed Central PMCID: PMC5280020.

  9. Postmus I, Warren HR, Trompet S, Arsenault BJ, Avery CL, Bis JC, Chasman DI, de Keyser CE, Deshmukh HA, Evans DS, Feng Q, Li X, Smit RA, Smith AV, Sun F, Taylor KD, Arnold AM, Barnes MR, Barratt BJ, Betteridge J, Boekholdt SM, Boerwinkle E, Buckley BM, Chen YI, de Craen AJ, Cummings SR, Denny JC, Dubé MP, Durrington PN, Eiriksdottir G, Ford I, Guo X, Harris TB, Heckbert SR, Hofman A, Hovingh GK, Kastelein JJ, Launer LJ, Liu CT, Liu Y, Lumley T, McKeigue PM, Munroe PB, Neil A, Nickerson DA, Nyberg F, O'Brien E, O'Donnell CJ, Post W, Poulter N, Vasan RS, Rice K, Rich SS, Rivadeneira F, Sattar N, Sever P, Shaw-Hawkins S, Shields DC, Slagboom PE, Smith NL, Smith JD, Sotoodehnia N, Stanton A, Stott DJ, Stricker BH, Stürmer T, Uitterlinden AG, Wei WQ, Westendorp RG, Whitsel EA, Wiggins KL, Wilke RA, Ballantyne CM, Colhoun HM, Cupples LA, Franco OH, Gudnason V, Hitman G, Palmer CN, Psaty BM, Ridker PM, Stafford JM, Stein CM, Tardif JC, Caulfield MJ, Jukema JW, Rotter JI, Krauss RM. Meta-analysis of genome-wide association studies of HDL cholesterol response to statins. J Med Genet. 2016 Dec;53(12):835-845. doi: 10.1136/jmedgenet-2016-103966. Epub 2016 Sep 1. PubMed PMID: 27587472; PubMed Central PMCID: PMC5309131.

  10. Scotland G, McKeigue P, Philip S, Leese GP, Olson JA, Looker HC, Colhoun HM, Javanbakht M. Modelling the cost-effectiveness of adopting risk-stratified approaches to extended screening intervals in the national diabetic retinopathy screening programme in Scotland. Diabet Med. 2016 Jul;33(7):886-95. doi: 10.1111/dme.13129. Epub 2016 May 11. PubMed PMID: 27040994.


Sample size requirements for learning to classify with high-dimensional biomarker panels (Statistical Methods for Medical Research 2017, final version now online)

This paper describes a simple method for calculating the sample size required to learn to classify with a high-dimensional biomarker panel, based on the asymptotic distribution of the log Bayes factor.

This R script uses the method described in the paper to calculate and plot a learning curve for a classifier as a function of the ratio of cases to biomarkers. To use it, you have to specify the performance (as C-statistic or AUROC) of the optimal classifier that could be learned from a training sample of infinite size, and the proportion of biomarkers that have nonzero effect sizes.
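
To give a feel for the C-statistic that has to be specified, it can be related to a standardized difference between cases and controls under a simple model. The sketch below is in Python rather than R and is not the script or the method from the paper: it assumes that the classifier score is normally distributed with equal variance in cases and controls, in which case the C-statistic equals Φ(d/√2), where d is the difference in mean score in standard deviation units. The function names are hypothetical.

```python
# Illustrative sketch only: converts between a C-statistic (AUROC) and a
# standardized mean difference d, ASSUMING normally distributed scores with
# equal variance in cases and controls. Not the method from the paper.
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def auc_to_effect_size(auc: float) -> float:
    """Standardized mean difference d implied by a C-statistic in [0.5, 1)."""
    if not 0.5 <= auc < 1.0:
        raise ValueError("C-statistic must be at least 0.5 and less than 1")
    return sqrt(2.0) * STD_NORMAL.inv_cdf(auc)


def effect_size_to_auc(d: float) -> float:
    """C-statistic implied by a standardized mean difference d."""
    return STD_NORMAL.cdf(d / sqrt(2.0))


if __name__ == "__main__":
    for auc in (0.6, 0.7, 0.8, 0.9):
        print(f"C-statistic {auc:.1f} -> d = {auc_to_effect_size(auc):.2f}")
```

Under this assumption, a C-statistic of 0.5 corresponds to no separation between cases and controls (d = 0), and d increases steeply as the C-statistic approaches 1.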


Useful links

Tutorials, teaching notes and slide presentations

Genetic epidemiology

Statistical methods for molecular epidemiology, precision medicine and related areas

Tutorials on using the statistical modelling program Stan in molecular epidemiology

Other tutorials and teaching notes

Recent slide presentations

Methods for protection of privacy in the 2015 Charter for Safe Havens in Scotland for handling unconsented data from NHS patient records: a critical look

Stratified medicine as a statistical modelling problem: learning finite mixture models for disease subtyping

Using GWAS summary statistics to construct polygenic scores for hypothesis testing and prediction