A platform for Mendelian randomisation using summary data from genome-wide association studies

Current status

Beta phase release

App version:
1.4.3 8a77eb (25 October 2020)

R version:
4.0.3

Host:
d81b2baaf993

R/TwoSampleMR version:
0.5.5

Database version:
0.3.0 (25 October 2020)

Before beginning an analysis in the web application, please review the 'Data access agreement' in the sidebar.

25 October 2020 - Major updates - see Changelog

Information

This web app offers a relatively limited analytical scope compared with using the TwoSampleMR R package directly, which also enables analysis of your own outcome data:
https://mrcieu.github.io/TwoSampleMR/

See LD Hub for automated LD score regression:
http://ldsc.broadinstitute.org/

See EpiGraphDB for pre-calculated MR results and many other epidemiological datasets:
https://www.epigraphdb.org/

Data underlying this web-app are hosted by the OpenGWAS project:
https://gwas.mrcieu.ac.uk

The data is contributed by the international GWAS community - please see Acknowledgements and cite studies accordingly!

Contact

Ongoing discussions for feedback, help requests, and feature requests are hosted on the GitHub issues page:

https://github.com/MRCIEU/TwoSampleMR/issues

For enquiries please contact: mr-base@bristol.ac.uk

Suggest new studies

If there are GWAS summary datasets you would like to see included in MR-Base, please enter the relevant details at the link below, after checking that the data are not already included in MR-Base.

Click here to suggest new studies

Credits

If you use this tool please cite:

The MR-Base platform supports systematic causal inference across the human phenome.
eLife 2018;7:e34408. doi: 10.7554/eLife.34408.
Gibran Hemani, Jie Zheng, Benjamin Elsworth, Kaitlin H Wade, Valeriia Haberland, Denis Baird, Charles Laurin, Stephen Burgess, Jack Bowden, Ryan Langdon, Vanessa Y Tan, James Yarmolinsky, Hashem A Shihab, Nicholas J Timpson, David M Evans, Caroline Relton, Richard M Martin, George Davey Smith, Tom R Gaunt, Philip C Haycock

The MRC IEU OpenGWAS data infrastructure
bioRxiv doi: 10.1101/2020.08.10.244293
Ben Elsworth, Matthew Lyon, Tessa Alexander, Yi Liu, Peter Matthews, Jon Hallett, Phil Bates, Tom Palmer, Valeriia Haberland, George Davey Smith, Jie Zheng, Philip Haycock, Tom R Gaunt, Gibran Hemani

along with any studies and methods that you use.

Background

Mendelian randomization using summary data from genome-wide association studies (GWAS) is an increasingly important tool for appraising causality in hypothesized exposure-outcome pathways. The approach can, however, be technically challenging and time-consuming to implement. We have therefore created MR-Base, a new platform built on harmonised summary data from multiple GWAS, which greatly simplifies the implementation of Mendelian randomization. In addition to simple lookup requests for individual SNPs across multiple GWAS, MR-Base automates the implementation of two-sample Mendelian randomization, including effect allele harmonisation across separate studies, LD pruning to ensure independence of genetic variants, and diagnostic and sensitivity analyses. See our methods paper for more details on the design and scope of MR-Base. More general information on the principles, assumptions and limitations of Mendelian randomization can be found in the papers recommended on this page.

Intro to MR

Further information

Background on Mendelian randomisation

Davey Smith G, Ebrahim S. 'Mendelian randomization': can genetic epidemiology contribute to understanding environmental determinants of disease? Int J Epidemiol. 2003 Feb;32(1):1-22.

More background on Mendelian randomisation

Davey Smith G, Hemani G. Mendelian randomization: genetic anchors for causal inference in epidemiological studies. Hum Mol Genet. 2014 Sep 15;23(R1):R89-98

Haycock PC, Burgess S, Wade KH, Bowden J, Relton C, Davey Smith G. Best (but oft-forgotten) practices: the design, analysis, and interpretation of Mendelian randomization studies. Am J Clin Nutr. (in press)

Two-sample Mendelian randomization methods

Pierce BL, Burgess S. Efficient design for Mendelian randomization studies: subsample and 2-sample instrumental variable estimators. Am J Epidemiol. 2013 Oct 1;178(7):1177-84

Accounting for horizontal pleiotropy

Bowden J, Davey Smith G, Burgess S. Mendelian randomization with invalid instruments: effect estimation and bias detection through Egger regression. Int J Epidemiol. 2015 Apr;44(2):512-25

GWAS studies, databases and consortia

We are grateful to the following GWAS studies, databases and consortia who have kindly made their summary data available:

Moving to OpenGWAS infrastructure

Migrating from the legacy MR-Base database to the OpenGWAS data infrastructure - now over 34k traits and 125 billion genetic associations
We recently removed functionality to access private datasets via this web-app - if you would like to access private datasets please use the R package.
Updated citations handling
Migrated to TwoSampleMR version 0.5.5
Quick SNP lookup tool now uses ieugwasr::phewas() functionality to improve speed
Added changelog
Various bug fixes

25/10/2020

Various updates

Default 'clump data' changed from 0.01 to 0.001
Fixed the code output for extract_instruments, which sometimes gave the wrong IDs
Providing more info about the exposure and outcome variables selected for each analysis

26/12/2017

New studies added to database

The MR Base database now comprises 1693 studies

01/12/2016

New studies added to database

The MR Base database now comprises 974 studies

22/04/2016

Sensitivity analysis added

Perform leave-one-out analysis, the MR Egger test for horizontal pleiotropy, and heterogeneity tests

24/03/2016

New studies added to database

The MR Base database now comprises 443 studies

22/03/2016

TwoSampleMR R package

The TwoSampleMR R package is now available to install via GitHub.

20/03/2016

MR Base web app is now online

This is an alpha version. Proceed with caution...

19/03/2016

MR Base Terms of Use

The majority of data in MR-Base is kindly made available for use by many research organisations and consortia.

These terms (Terms) set out the basis on which the University of Bristol, a body incorporated by Royal Charter under number RC000648 having its administrative offices at Senate House, Tyndall Avenue, Bristol BS8 1TH (University) agrees to provide you with access to the MR-Base platform (Platform) and through the Platform summary data from genome-wide association studies (GWAS Data).

1. Licence: The University grants you a non-exclusive, non-transferable revocable licence to access and use the Platform for private or non-commercial research purposes only. If you wish to access and use the Platform for commercial research purposes, please forward your enquiry to mr-base@bristol.ac.uk.

2. Ownership of MR-base: Subject to paragraph 1, nothing in these Terms grants you any right to, or in, any intellectual property rights of any nature (whether existing now or in the future and whether registered or unregistered) in the Platform.

3. Downloading: You agree that you will not attempt to download GWAS Data from the Platform in bulk or otherwise use the Platform in a way that would or might adversely affect the performance or operation of the Platform for other users.

4. Credits: You agree to cite any use of the Platform in the form set out in the “About” tab. You further agree to observe and comply with any notice requiring you to cite the original source of any GWAS Data in your analyses in the form set out in such notice.

5. Ownership of GWAS Data: GWAS Data may be protected by copyright, database rights and other intellectual property rights around the world. Unless otherwise stated in any notice accompanying any particular GWAS Data, all such rights are reserved by the contributor of the GWAS Data and you agree to observe and comply with any specific licence terms specified by such contributors.

6. Identification of data subjects: You agree not to use, combine, manipulate or transform the GWAS Data in any way that would or might enable you to identify any living individual to which the GWAS Data relates, in breach of data protection laws anywhere in the world.

7. Disclaimers: We do not guarantee that (a) the Platform or any GWAS Data will always be available or uninterrupted; (b) the Platform or any GWAS Data will be accurate, complete, free from errors or omissions or secure or free from bugs or viruses; or (c) that the result of using the Platform or any GWAS Data will be accurate, adequate or fit for any particular purpose (more general information on the principles, assumptions and limitations of Mendelian randomization can be found in the papers recommended in the “About” tab). Where the Platform contains links to other sites or resources provided by third parties, these links are provided for your information only and you acknowledge that we have no control over the content of those sites or resources. All warranties, representations, conditions and all other terms of any kind whatsoever implied by statute or common law are, to the fullest extent permitted by applicable law, excluded from these Terms.

8. Limitation of liability: You assume sole responsibility for the results obtained from the use of the Platform and any GWAS Data and for conclusions drawn from such use. The University shall have no liability for any damage or other loss whatsoever arising out of or in connection with your use of the Platform or any GWAS Data. The University shall not be liable in any circumstances whether in contract, tort (including for negligence), misrepresentation (whether innocent or negligent), restitution or otherwise for any special, indirect or consequential loss, costs, damages, charges or expenses however arising under these Terms including but not limited to loss of funding or loss of opportunity, goodwill or reputation. Nothing in these Terms excludes or limits any liability which cannot be excluded or limited by applicable law.

9. Suspension and termination: The University may, at its sole discretion, suspend, withdraw, discontinue or change all or any part of the Platform (in respect of any single user, group of users or all users of the Platform) without notice and whether or not arising from any breach of these Terms. The University will not be liable to you in such circumstances.

10. Changes to these Terms: The University may revise these Terms at any time by amending this page. Please check this page from time to time to take notice of any changes, as they will be legally binding on you.

11. Governing law and jurisdiction: These Terms and any claims or disputes arising out of or in connection with them (including non-contractual disputes) shall be governed by the laws of England and Wales whose courts shall have exclusive jurisdiction to settle the same.

Please note that when you log on we keep a record of your email address on our servers to a) ensure that you obtain appropriate access to the GWAS database, b) to compile usage statistics that help us keep this project funded and c) to monitor inappropriate or unfair usage. We do NOT log the queries that are being performed, and we do NOT share your email address with anybody else.

Choosing instruments for the exposure

To use two-sample MR to estimate the causal effect of an exposure on an outcome, the first step is to identify SNPs that are robustly associated with the exposure. The summary statistics for these SNPs can be taken from a sample in which there are no data on the outcome.

Please provide instruments by choosing from one of the data sources below, or by uploading your own data. You can choose multiple exposures to be analysed, and multiple instruments per exposure.
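As a rough illustration of what "robustly associated" can mean in practice, the sketch below keeps SNPs whose exposure p-value falls below a threshold. The 5e-8 cut-off is the conventional genome-wide significance level and is an assumption here, as are the function name and record fields; none of this is taken from the MR-Base code.

```python
# Minimal sketch: select candidate instruments by exposure p-value.
# The threshold, function name, and dictionary fields are illustrative assumptions.

def select_instruments(associations, p_threshold=5e-8):
    """Return the SNP records whose exposure p-value is below p_threshold."""
    return [a for a in associations if a["pval"] < p_threshold]


# Hypothetical exposure summary statistics.
exposure_gwas = [
    {"snp": "rs_a", "beta": 0.12, "se": 0.01, "pval": 3e-30},
    {"snp": "rs_b", "beta": 0.03, "se": 0.01, "pval": 2e-3},
    {"snp": "rs_c", "beta": -0.08, "se": 0.01, "pval": 1e-15},
]
instruments = select_instruments(exposure_gwas)
print([a["snp"] for a in instruments])
```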

Choose instruments

Select exposure source

Manual file upload

NHGRI-EBI GWAS catalog

MR Base GWAS catalog

Gene expression QTLs

Protein level QTLs

Metabolite level QTLs

Methylation level QTLs

Select outcomes for analysis

The MR Base database houses a large collection of summary statistic data from hundreds of GWAS studies. In order to perform two sample MR, the SNPs that were selected for the exposures will be extracted from the outcomes that you select here.

Please select the outcomes that you want to test for being causally influenced by the exposures.

Studies available in MR base

LD clumping

Most two-sample MR methods require that the instruments are not in linkage disequilibrium (LD) with one another.

Linkage disequilibrium

Do not check for LD between SNPs

Use clumping to prune SNPs for LD
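The idea behind clumping can be sketched as a greedy procedure: visit SNPs from most to least significant and keep a SNP only if it is not in LD with any SNP already kept. This is a simplified sketch, not the web app's implementation. It ignores the physical distance window that clumping tools normally apply, the pairwise r-squared values are hypothetical, and treating the default of 0.001 as an r-squared threshold is an assumption.

```python
# Minimal sketch of greedy LD clumping (no distance window).
# r2 maps an unordered SNP pair to its squared correlation; missing pairs
# are treated as independent. All names and values are illustrative.

def clump(snps, pvals, r2, r2_threshold=0.001):
    """Return the SNPs retained after clumping, most significant first."""
    kept = []
    for snp in sorted(snps, key=lambda s: pvals[s]):
        in_ld = any(
            r2.get(frozenset((snp, k)), 0.0) >= r2_threshold for k in kept
        )
        if not in_ld:
            kept.append(snp)
    return kept


snps = ["rs1", "rs2", "rs3"]
pvals = {"rs1": 1e-10, "rs2": 1e-9, "rs3": 1e-8}
r2 = {frozenset(("rs1", "rs2")): 0.8}  # rs2 is in LD with rs1
print(clump(snps, pvals, r2))
```

Here rs2 is dropped because it is correlated with the more significant rs1, while rs3 is independent of both and is retained.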

LD proxies

If a particular exposure SNP is not present in an outcome dataset, should proxy SNPs be used instead through LD tagging?

Use proxies?

Minimum LD Rsq value

Allow palindromic SNPs?

MAF threshold for aligning palindromes
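A proxy lookup can be sketched as follows: if the exposure SNP is missing from the outcome dataset, choose the SNP in highest LD with it that is present, provided its r-squared meets the minimum. The LD table, SNP names, and the 0.8 default are illustrative assumptions, and the sketch leaves out the allele mapping between target and proxy that a real analysis also needs.

```python
# Minimal sketch of proxy selection by LD tagging.
# ld maps a target SNP to {candidate proxy: r-squared}; values are hypothetical.

def find_proxy(target, outcome_snps, ld, min_rsq=0.8):
    """Return the target itself, its best available proxy, or None."""
    if target in outcome_snps:
        return target
    candidates = [
        (rsq, snp)
        for snp, rsq in ld.get(target, {}).items()
        if snp in outcome_snps and rsq >= min_rsq
    ]
    if not candidates:
        return None
    return max(candidates)[1]  # highest r-squared wins


ld = {"rs1": {"rs8": 0.85, "rs9": 0.95}}
print(find_proxy("rs1", {"rs8", "rs9"}, ld))
```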

Allele harmonisation

An important step in two sample MR is making sure that the effects of the SNPs on the exposure correspond to the same allele as their effects on the outcome. This is potentially difficult with palindromic SNPs.

Handling reference alleles

All effect alleles are definitely on the positive strand

Attempt to align strands for palindromic SNPs

Exclude palindromic SNPs
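The harmonisation logic can be sketched for a single SNP. The sketch below corresponds to the "exclude palindromic SNPs" option: it flips the sign of the outcome effect when the alleles are swapped, tries the opposite strand when the alleles do not match, and returns nothing for palindromic (A/T or C/G) or irreconcilable SNPs. Function names are hypothetical, and the frequency-based alignment of palindromes is not shown.

```python
# Minimal sketch of effect allele harmonisation for one SNP.
# Returns the outcome effect expressed for the exposure effect allele,
# or None if the SNP is palindromic or the alleles cannot be reconciled.

COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def is_palindromic(allele1, allele2):
    """A/T and C/G SNPs look identical on both strands."""
    return COMPLEMENT[allele1] == allele2


def harmonise(exp_ea, exp_oa, out_ea, out_oa, out_beta):
    if is_palindromic(exp_ea, exp_oa):
        return None  # strand is ambiguous, so exclude
    for ea, oa in ((out_ea, out_oa), (COMPLEMENT[out_ea], COMPLEMENT[out_oa])):
        if (ea, oa) == (exp_ea, exp_oa):
            return out_beta   # same effect allele
        if (ea, oa) == (exp_oa, exp_ea):
            return -out_beta  # alleles swapped, so flip the sign
    return None


print(harmonise("A", "G", "G", "A", 0.1))  # swapped alleles
print(harmonise("A", "G", "T", "C", 0.1))  # opposite strand
```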

Select methods for analysis

Many methods exist for performing two-sample MR. Different methods are sensitive to different potential issues, accommodate different scenarios, and vary in their statistical efficiency.

Submit

Once you have selected exposures, outcomes, and analysis options you are ready to perform the analysis.

Select analysis

Exposure

Outcome

Generate HTML report

Exposure details

Outcome details

Downloads for all analyses

Download harmonised summary statistics

Download MR results

Download leave-one-out sensitivity analysis

Download single SNP MR results

This table shows the MR estimates from each method of the causal effect of the exposure on the outcome. The effects are reported in the units that were used to estimate the SNP effects.
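As one example of how such an estimate is formed, the fixed-effect inverse-variance weighted (IVW) estimate combines harmonised SNP effects as a weighted regression through the origin. This is a simplified sketch with hypothetical inputs; the web app's own IVW implementation may differ, for example in how it handles heterogeneity when computing standard errors.

```python
# Minimal sketch of a fixed-effect IVW estimate from summary statistics.
# beta_exp / beta_out are harmonised SNP effects; se_out are the standard
# errors of the SNP-outcome effects. Values are hypothetical.
import math


def ivw(beta_exp, beta_out, se_out):
    """Return (estimate, standard error) of the causal effect."""
    num = sum(bx * by / s**2 for bx, by, s in zip(beta_exp, beta_out, se_out))
    den = sum(bx**2 / s**2 for bx, s in zip(beta_exp, se_out))
    return num / den, math.sqrt(1.0 / den)


estimate, se = ivw([0.1, 0.2], [0.05, 0.1], [0.01, 0.01])
print(estimate, se)
```

The estimate is in outcome units per unit of exposure, which is why the reported effects inherit the units used to estimate the SNP effects.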

Heterogeneity is the variability in the causal estimates obtained from each SNP, i.e. how consistent the causal estimate is across all SNPs. Low heterogeneity suggests more reliable MR estimates. It was calculated using each of the different MR methods where possible.
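A common way to quantify this is Cochran's Q statistic around the IVW estimate, sketched below under the assumption of fixed-effect weights. The function name and inputs are hypothetical, and the p-value (from a chi-squared distribution with the returned degrees of freedom) is left out to keep the sketch dependency-free.

```python
# Minimal sketch of Cochran's Q for heterogeneity around the IVW estimate.
# Inputs are harmonised SNP effects and SNP-outcome standard errors.

def cochran_q(beta_exp, beta_out, se_out):
    """Return (Q, degrees of freedom)."""
    weights = [bx**2 / s**2 for bx, s in zip(beta_exp, se_out)]
    ratios = [by / bx for bx, by in zip(beta_exp, beta_out)]
    ivw_estimate = sum(w * r for w, r in zip(weights, ratios)) / sum(weights)
    q = sum(w * (r - ivw_estimate) ** 2 for w, r in zip(weights, ratios))
    return q, len(ratios) - 1


# Two SNPs that disagree strongly (ratios 0 and 1) give a large Q.
print(cochran_q([0.1, 0.1], [0.0, 0.1], [0.01, 0.01]))
```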

This test attempts to assess the direction of causality, based on estimated variance explained by the SNPs in the exposure and the outcome. The p-value relates to the confidence of whether the hypothesised direction is likely. It does not relate to whether or not there exists a causal relationship.

Important notes

This method is not yet published.

The test is susceptible to assigning the wrong direction of causality in the presence of measurement error, particularly if the true exposure has greater measurement error than the product of the outcome's measurement error and the true causal correlation between the exposure and the outcome.

The method is only designed to work for MR performed between quantitative traits (i.e. please do not use this method if the exposure or the outcome is a binary variable).

This analysis requires sample sizes and p-values for the exposures and outcomes.
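The comparison of variance explained can be sketched from exactly those inputs. The sketch below approximates each SNP's variance explained from its p-value and sample size using a normal approximation to the test statistic (reasonable only for large samples), sums over SNPs, and checks whether the SNPs explain more variance in the exposure than in the outcome. All names are hypothetical, the approximation is an assumption, and the formal test that produces the p-value is not shown.

```python
# Minimal sketch: compare variance explained in exposure and outcome.
# Uses r^2 ~= z^2 / (z^2 + n - 2), with z recovered from a two-sided p-value.
from statistics import NormalDist


def approx_r2(pval, n):
    """Approximate variance explained by one SNP (quantitative trait)."""
    z = NormalDist().inv_cdf(pval / 2.0)  # lower-tail quantile; only z^2 is used
    return z * z / (z * z + n - 2)


def variance_explained(pvals, n):
    """Sum per-SNP variance explained, assuming independent SNPs."""
    return sum(approx_r2(p, n) for p in pvals)


def direction_supported(exp_pvals, n_exp, out_pvals, n_out):
    """True if the SNPs explain more variance in the exposure than the outcome."""
    return variance_explained(exp_pvals, n_exp) > variance_explained(out_pvals, n_out)


print(direction_supported([1e-20, 1e-12], 10000, [0.01, 0.2], 10000))
```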

If the SNPs influence the outcome through a pathway other than the exposure, this is known as horizontal pleiotropy, and it violates an assumption of MR. If the horizontal pleiotropic effects of the SNPs are, on average, in one direction, this can bias the MR estimates.

The MR Egger regression intercept estimates the average horizontal pleiotropic effect across SNPs; an intercept that differs from zero suggests directional horizontal pleiotropy.
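The intercept comes from a weighted linear regression of SNP-outcome effects on SNP-exposure effects in which, unlike IVW, the line is not forced through the origin. The sketch below is a minimal version with hypothetical inputs: it orients each SNP so that its exposure effect is positive, weights by the inverse variance of the outcome effect, and omits standard errors and p-values.

```python
# Minimal sketch of MR Egger regression (point estimates only).
# Needs at least three SNPs with differing exposure effects to be meaningful.

def egger_regression(beta_exp, beta_out, se_out):
    """Return (intercept, slope) of the weighted regression."""
    # Orient every SNP so that its effect on the exposure is positive.
    x, y = [], []
    for bx, by in zip(beta_exp, beta_out):
        sign = -1.0 if bx < 0 else 1.0
        x.append(sign * bx)
        y.append(sign * by)
    w = [1.0 / s**2 for s in se_out]
    total = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / total
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / total
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Hypothetical SNPs generated as outcome = 0.02 + 0.5 * exposure.
print(egger_regression([-0.1, 0.2, 0.3], [-0.07, 0.12, 0.17], [0.01, 0.02, 0.01]))
```

In this constructed example the slope recovers the causal effect and the intercept recovers the constant pleiotropic offset.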

The causal effect of the exposure on the outcome is estimated for each SNP singly using the Wald ratio, and represented in a forest plot. The MR estimates using all SNPs from the MR Egger and IVW methods are also shown. Formal estimates of heterogeneity are shown in the tables below.
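For a single SNP, the Wald ratio is the SNP-outcome effect divided by the SNP-exposure effect. The sketch below uses the simplest standard error, which ignores uncertainty in the SNP-exposure effect; that simplification and the function name are assumptions for illustration.

```python
# Minimal sketch of the Wald ratio for one SNP.
# The standard error is a first-order approximation that treats the
# SNP-exposure effect as known without error.

def wald_ratio(beta_exp, beta_out, se_out):
    """Return (causal estimate, approximate standard error)."""
    return beta_out / beta_exp, se_out / abs(beta_exp)


estimate, se = wald_ratio(0.2, 0.1, 0.02)
print(estimate, se)
```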

Download PDF of this graph

SNP effects on the outcome are plotted against SNP effects on the exposure (all SNPs with negative effects on the exposure are shown to be positive, with the sign of the effect on the outcome flipped). The slope of the line represents the causal association, and each method has a different line. The Egger estimate is the only line which doesn't automatically pass through the origin.

Download PDF of this graph

Leave-one-out sensitivity analysis is performed to ascertain if an association is being disproportionately influenced by a single SNP. Each black point in the forest plot represents the MR analysis (using IVW) excluding that particular SNP. The overall analysis including all SNPs is also shown for comparison.
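The procedure can be sketched by recomputing the IVW estimate once per SNP, each time with that SNP removed. The sketch uses a fixed-effect IVW point estimate and hypothetical inputs; it is not the web app's code.

```python
# Minimal sketch of leave-one-out analysis using a fixed-effect IVW estimate.

def ivw_estimate(beta_exp, beta_out, se_out):
    num = sum(bx * by / s**2 for bx, by, s in zip(beta_exp, beta_out, se_out))
    den = sum(bx**2 / s**2 for bx, s in zip(beta_exp, se_out))
    return num / den


def leave_one_out(snps, beta_exp, beta_out, se_out):
    """Return {excluded SNP: IVW estimate from the remaining SNPs}."""
    results = {}
    for i, snp in enumerate(snps):
        keep = [j for j in range(len(snps)) if j != i]
        results[snp] = ivw_estimate(
            [beta_exp[j] for j in keep],
            [beta_out[j] for j in keep],
            [se_out[j] for j in keep],
        )
    return results


# rs3 is an outlier: excluding it changes the estimate markedly.
loo = leave_one_out(
    ["rs1", "rs2", "rs3"], [0.1, 0.2, 0.3], [0.05, 0.1, 0.6], [0.01, 0.01, 0.01]
)
for snp, est in loo.items():
    print(snp, est)
```

An estimate that shifts sharply when one SNP is excluded, as with rs3 here, is the pattern this analysis is designed to reveal.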

Download PDF of this graph

Funnel plot to assess heterogeneity. Each SNP's causal estimate is plotted against its precision, so estimates 'funnel' in as precision increases (higher values on the y-axis). Larger spread suggests higher heterogeneity, which may be due to horizontal pleiotropy. Asymmetry in the funnel plot indicates directional horizontal pleiotropy, which can bias many MR methods. MR Egger regression should guard against this. See the pleiotropy test results below for a formal estimate of directional horizontal pleiotropy.

Analysis log

Analysis R code

Citations

SNP lookup results