Multitrait genome-wide analyses identify new susceptibility loci and candidate drugs to primary sclerosing cholangitis

Han, Younghun; Byun, Jinyoung; Zhu, Catherine; Sun, Ryan; Roh, Julia Y.; Cordell, Heather J.; Lee, Hyun-Sung; Shaw, Vikram R.; Kang, Sung Wook; Razjouyan, Javad; Cooley, Matthew A.; Hassan, Manal M.; Siminovitch, Katherine A.; Folseraas, Trine; Ellinghaus, David; Bergquist, Annika; Rushbrook, Simon M.; Franke, Andre; Karlsen, Tom H.; Lazaridis, Konstantinos N.; McGlynn, Katherine A.; Roberts, Lewis R.; Amos, Christopher I.

doi:10.1038/s41467-023-36678-8

Article
Open access
Published: 24 February 2023

Nature Communications volume 14, Article number: 1069 (2023)

Abstract

Primary sclerosing cholangitis (PSC) is a rare autoimmune bile duct disease that is strongly associated with immune-mediated disorders. In this study, we implemented multitrait joint analyses to genome-wide association summary statistics of PSC and numerous clinical and epidemiological traits to estimate the genetic contribution of each trait and genetic correlations between traits and to identify new lead PSC risk-associated loci. We identified seven new loci that have not been previously reported and one new independent lead variant in the previously reported locus. Functional annotation and fine-mapping nominated several potential susceptibility genes such as MANBA and IRF5. Network-based in silico drug efficacy screening provided candidate agents for further study of pharmacological effect in PSC.

Introduction

Primary sclerosing cholangitis (PSC) is a chronic, progressive autoimmune disorder of the bile duct^1,2,3. Individuals with PSC are at risk of severe liver problems including a lifetime risk of cholangiocarcinoma of between 5 and 20%⁴. PSC is often associated with inflammatory bowel disease (IBD). Approximately 75% of individuals with PSC have IBD², most commonly ulcerative colitis (UC). Individuals with PSC are also more likely than those without PSC to have other autoimmune diseases, including type 1 diabetes, celiac disease, and thyroid disease. The shared etiology and underlying characteristics of these immune-mediated disorders remain incompletely understood.

Recent genome-wide association studies (GWAS) have identified ~19 loci associated with PSC among individuals of European ancestry^2,5. Association analysis using the Immunochip genotype array data that specifically targeted known autoimmune-related disease regions identified three additional loci influencing PSC risk⁶. The development of PSC can be attributed to a combination of genetic and environmental factors⁷. Individuals with a family history of PSC have an increased risk of developing PSC suggesting that genetic influences play a critical role in susceptibility, which may act in concert with exposure to specific environmental factors. However, the genetic and environmental risk factors are not fully elucidated. As PSC is strongly associated with IBD², examining two traits together may provide better genetic insight into a common genetic etiology^8,9,10,11. Few studies have been conducted to understand the shared genetic underpinning between PSC and other associated medical conditions.

Leveraging publicly available GWAS summary-level data^12,13,14 (Supplementary Data 1, “Methods”), we conducted cross-trait linkage disequilibrium (LD) score regression (LDSR) analysis^15,16 to determine whether there was a shared genetic contribution between polygenic phenotypes for multiple diseases and traits. We explored the directionality and degree of these relationships, and whether the genetic architecture between two traits is correlated or inversely correlated¹⁷. We took advantage of the genetic overlap between traits to identify additional independent genetic variants for PSC alongside five immune-mediated disorders (Supplementary Data 2), highly correlated with PSC: Crohn’s disease¹⁸ (CD), UC¹⁸, IBD¹⁸, lupus¹⁹, and primary biliary cirrhosis²⁰ (PBC) using multitrait analysis of GWAS²¹ (MTAG). Although IBD is the umbrella term that includes CD and UC, we also surveyed the pairwise genetic correlation of PSC for CD and UC, respectively. We then performed functional fine-mapping analyses on the newly identified loci to elucidate potential functional characterization and biological mechanisms affecting PSC susceptibility. Since there is no medication proven to be effective for PSC treatment, we conducted network-based drug–disease proximity analysis to identify potential agents suitable for repurposing to PSC from the previously reported¹³ and newly identified candidate genes in this study.

Results

PSC shows the shared genetic contributions among numerous clinical and epidemiological traits

We investigated the proportion of phenotypic variance explained by all common single-nucleotide polymorphisms (SNPs) for 134 clinical and epidemiological traits to identify potential comorbid conditions and to uncover traits that are causally involved in clinical course and epidemiologic associations using LDSR (“Methods”). We identified numerous traits showing moderate SNP-heritability in the observed scale (h2). The study workflow shown in Fig. 1 summarizes the steps from data preparation to subsequent analyses in the present study. We estimated the SNP-heritability of PSC to be 0.23. Among serologic biomarkers, an increased alkaline phosphatase (ALP) level and conditions such as a blocked bile duct had an estimated SNP-heritability of 0.25. We also examined the magnitude and direction of shared genetic contribution between PSC and 134 polygenic traits of clinical and epidemiological parameters based on the cross-trait genetic correlation (r_g). We identified several polygenic traits showing moderate to strong genetic correlation with PSC at a Bonferroni-corrected significance level of P = 0.05/134 = 3.73 × 10⁻⁴. Since this is hypothesis-based research, we also considered P < 0.05 to identify nominally significant associations that could be examined in future studies. We considered P-values less than the Bonferroni-corrected significance level to be robustly associated in this study and the highlighted traits are displayed in Fig. 2. Our findings reported in Supplementary Data 3 demonstrated that the genetic architecture of PSC susceptibility was positively correlated with that of several immune-related diseases including IBD (r_g = 0.46; P = 4.41 × 10⁻¹³), UC (r_g = 0.62; P = 5.18 × 10⁻¹⁵), CD (r_g = 0.24; P = 4.16 × 10⁻⁴), lupus (r_g = 0.21; P = 0.04), and PBC (r_g = 0.31; P = 3.95 × 10⁻⁴). 
Overall shared genetic contribution between PSC and a behavior parameter, general risk tolerance defined as the willingness to take risks²², showed a significant negative correlation (r_g = −0.20; P = 1.41 × 10⁻⁴). Increased body mass index (BMI) had a significant negative genetic correlation with PSC susceptibility (r_g = −0.13; P = 1.16 × 10⁻⁴). In epidemiological studies^7,23,24, the association between PSC and cigarette smoking has been inconsistent. Among traits related to smoking behaviors in this study, smoking status²⁵ modeled in previous smokers versus current smokers showed a strong negative genetic correlation with PSC susceptibility (r_g = −0.27; P = 9.17 × 10⁻¹⁰) while smoking initiation²⁶, which is a binary phenotype indicating whether an individual had ever smoked regularly (i.e., never-smokers versus ever-smokers), reported a significant negative genetic correlation with PSC (r_g = −0.20; P = 2.05 × 10⁻⁶).
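The significance thresholds described above can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of the original analysis: it recomputes the Bonferroni threshold 0.05/134 and labels the (r_g, P) pairs quoted in the text. The function name `classify` and the label strings are illustrative assumptions.

```python
# Bonferroni threshold for 134 traits, applied to correlations quoted in the text.
N_TRAITS = 134
ALPHA = 0.05
BONFERRONI = ALPHA / N_TRAITS  # about 3.73e-4

# (r_g, P) pairs copied from the paragraph above.
reported = {
    "IBD": (0.46, 4.41e-13),
    "UC": (0.62, 5.18e-15),
    "CD": (0.24, 4.16e-4),
    "lupus": (0.21, 0.04),
    "PBC": (0.31, 3.95e-4),
    "risk tolerance": (-0.20, 1.41e-4),
    "BMI": (-0.13, 1.16e-4),
    "smoking status": (-0.27, 9.17e-10),
    "smoking initiation": (-0.20, 2.05e-6),
}


def classify(p, bonferroni=BONFERRONI, nominal=ALPHA):
    """Label a P-value as 'bonferroni', 'nominal', or 'ns' (hypothetical labels)."""
    if p < bonferroni:
        return "bonferroni"
    if p < nominal:
        return "nominal"
    return "ns"


labels = {trait: classify(p) for trait, (_, p) in reported.items()}
for trait, (rg, p) in reported.items():
    print(f"{trait:20s} r_g={rg:+.2f}  P={p:.2e}  {labels[trait]}")
```

With these inputs, CD, PBC, and lupus are labelled nominal because their P-values sit above 0.05/134, while the remaining traits pass the Bonferroni-corrected level.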

**Fig. 1: Flow chart of the analytical workflow in this study.**

**Fig. 2: The shared heritability and genetic correlation of PSC among clinical and epidemiological traits.**

MTAG with immune-mediated diseases identifies new PSC-associated loci with evidence of replication

Based on findings from the genome-wide SNP-heritability and pairwise genetic correlation, we restricted our MTAG to the traits for which LDSR has suggested strong associations with PSC susceptibility, showing h2 > 0.20 and |r_g| > 0.20 (“Methods”, Supplementary Information). Five autoimmune-related disorders, CD (r_g = 0.24), UC (0.62), IBD (0.46), lupus (0.20), and PBC (0.31), were selected to identify new PSC risk loci using MTAG (Table 1). Compared to the conventional univariate GWAS, we detected more significant and stronger PSC-specific association signals when implementing MTAG. From MTAG combining PSC with five immune-related diseases (CD, UC, IBD, lupus, and PBC), we discovered seven loci (2p16.1, 4q24, 6p21.2, 6q23.3, 7q32.1, 10q24.2, and 16q22.1) that have not been previously reported or failed to reach the genome-wide significance level and one new independent significant variant of the reported locus (3p21.31) at the genome-wide significance level of 5.0 × 10⁻⁸ (Table 2 and Fig. 3). In addition, our MTAG-identified PSC-specific results confirmed 11 PSC-specific risk-associated variants that have been previously reported in a single-disease GWAS of PSC susceptibility. These include genetic variants from well-established risk loci at 1p36.32, 2q33.2, and 6p21.33-p21.32 that are strongly associated with autoimmune-related diseases^2,20,27,28. We displayed a Manhattan plot for the MTAG-identified PSC-specific GWAS (MTAG_PSC, Fig. 3b) along with that from the previously published single-disease GWAS of PSC² (GWAS_PSC, Fig. 3a). There was no substantial evidence for inflation of either set of GWAS test statistics (λ_{GWAS_PSC} = 1.06; λ_{MTAG_PSC} = 1.08), as shown in Fig. 3c, d, respectively. MTAG-identified genomic risk variants associated with PSC susceptibility with a P < 5.0 × 10⁻⁸ are reported in Supplementary Data 4.
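The inflation factors λ quoted above are conventionally defined as the median observed chi-square statistic divided by the median of a chi-square distribution with one degree of freedom (about 0.455). The sketch below illustrates that conventional definition only; the text does not state how λ was computed in this study, and the function name `genomic_inflation` and the toy P-values are assumptions.

```python
from statistics import NormalDist, median

_STD_NORMAL = NormalDist()
# Median of a chi-square distribution with 1 degree of freedom (about 0.455).
EXPECTED_MEDIAN_CHI2 = _STD_NORMAL.inv_cdf(0.75) ** 2


def genomic_inflation(pvalues):
    """Return lambda = median(observed chi2) / median(expected chi2, 1 df)."""
    chi2 = []
    for p in pvalues:
        if not 0.0 < p <= 1.0:
            raise ValueError(f"P-value out of range: {p}")
        # Two-sided P-value -> z-score; the sign is irrelevant once squared.
        z = _STD_NORMAL.inv_cdf(p / 2.0)
        chi2.append(z * z)
    return median(chi2) / EXPECTED_MEDIAN_CHI2


# Toy input (not study data): evenly spaced P-values mimic an uninflated scan.
null_like = [i / 1000 for i in range(1, 1000)]
lam = genomic_inflation(null_like)
print(f"lambda = {lam:.3f}")
```

Values close to 1, such as the 1.06 and 1.08 reported here, are generally read as little evidence of systematic inflation.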

Table 1 Estimate of genetic correlation among autoimmune-related diseases

Table 2 The MTAG-identified new associations of PSC

**Fig. 3: Manhattan plots and quantile-quantile plots for the single-trait GWAS and the multitrait GWAS of PSC.**

A newly identified association of an intronic variant, rs228614, was detected in MANBA on 4q24 (P_{MTAG_PSC} = 1.71 × 10⁻⁹). Associations at MANBA have been previously reported for multiple sclerosis²⁹, primary biliary cirrhosis³⁰, psoriasis³¹, numerous hematologic traits^32,33,34,35, asthma^36,37, and major depressive disorders³⁸. Another association at rs17780429 between TNFAIP3 and LINC02528 on 6q23.3 showed a strong genetic signal (P_{MTAG_PSC} = 2.24 × 10⁻¹⁰), and many associations at TNFAIP3 have been observed in autoimmune-related diseases^39,40,41,42 and multiple blood-cell traits^34,43. We found a new intergenic variant, rs3757387, between KCP and IRF5 on 7q32.1 (P_{MTAG_PSC} = 2.19 × 10⁻¹⁴). rs3757387 has previously shown significant associations with systemic lupus erythematosus among diverse populations⁴⁴ and in a single population^19,45, rheumatoid arthritis in multiple populations^46,47, and Sjögren’s syndrome⁴⁸. An NKX2-3 intronic variant, rs791168 on 10q24.2, was associated with PSC susceptibility (P_{MTAG_PSC} = 1.33 × 10⁻⁸) and has been reported in many autoimmune-related and blood-cell traits¹³. LocusZoom regional plots of genome-wide associations for these newly identified loci are provided in Supplementary Fig. 1.

To assess whether our MTAG results were robust to strong genetic correlation and clinical relevance among IBD, UC, and CD, we repeated our MTAG analysis only including PSC, CD, UC, lupus, and PBC (MTAG_PSC⊥IBD) as a sensitivity analysis. The results from the MTAG-identified PSC-specific model excluding IBD were very similar to those from the inclusion model (MTAG_PSC) (Table 2 and Supplementary Fig. 2).

To replicate the new MTAG-identified PSC-specific associations, we downloaded GWAS summary statistics from FinnGen¹⁴ and GWAS Catalog¹³, which are independent GWAS from the discovery phase (Supplementary Data 2). Since we were interested in replicating eight new associations (seven newly identified loci and one independent significant variant in the reported locus), we did not apply multiple testing corrections. We replicated four PSC-specific associations (MTAG_PSC_R), rs6787808 in QRICH1 (P_{MTAG_PSC_R} = 1.79 × 10⁻²), rs228614 in MANBA (P_{MTAG_PSC_R} = 2.05 × 10⁻²), rs3757387 between KCP and IRF5 (P_{MTAG_PSC_R} = 1.39 × 10⁻⁸), and rs791168 in NKX2-3 (P_{MTAG_PSC_R} = 1.20 × 10⁻³) at the nominal significance level of 0.05 (Table 2 and Supplementary Fig. 2).

Fine-mapping and functional annotation nominates candidate variants within MTAG-identified loci

To pinpoint genomic risk loci and prioritize susceptibility variants underlying the MTAG-identified PSC-specific GWAS associations by functional annotation and by positional, expression quantitative trait loci (eQTL), and chromatin interaction mappings, we used Functional Mapping and Annotation of GWAS (FUMA GWAS)⁴⁹ with the LD structure of the European-ancestry samples of the 1000 Genomes Project phase 3 (“Methods”). We prioritized 406 unique genes from 20 PSC susceptibility loci reported in Supplementary Data 5 that were functionally mapped and annotated using MTAG-identified GWAS, of which 109 genes were identified by position mapping of deleterious coding variants with the combined annotation-dependent depletion (CADD) score (posMapMaxCADD ≥ 12.37)⁵⁰ (Supplementary Data 6). Out of 406 prioritized genes, 48 genes (12%) were detected by eQTL associated with the expression of 14 immune cell types⁵¹. In the chromatin interaction mapping, 278 genes (69%) were mapped to regions interacting with the promoter of the listed gene, of which 90 genes (32%) were found in liver tissue in which the chromatin interaction is observed (Supplementary Data 6). Chromatin interactions or eQTLs within PSC risk loci (Supplementary Data 5) were observed on chromosomes 2, 3, 4, 6, 7, 11, 16, 19, and 21 (Supplementary Fig. 3). In total, 158 genes were mapped by both eQTLs and chromatin interactions, including the IRF5 and TNPO3 genes (in red in Supplementary Fig. 3e) on 7q32.1. In addition, we explored immune-related genes among the 406 PSC-specific susceptibility genes prioritized by position, eQTL, or chromatin interaction mapping using InnateDB⁵² (“Methods”). We found five immune-related genes, IRF5 and SMO (7q32.1) and HAS3, SNTB2, and VPS4A (16q22.1), within newly identified loci that have not been previously reported (Supplementary Data 7).

To functionally characterize the 329 independent significant variants within 20 genomic risk loci generated from FUMA, we performed an integrated variant functional annotation approach using the Functional Annotation of Variants Online Resource (FAVOR) platform^53,54,55 and the multidimensional annotation class integrative estimator^56,57 (MACIE). Out of 168 noncoding genes, we observed 14 more likely deleterious genes (CADD PHRED ≥ 12.37) and 8 and 6 genes on promoter and permissive enhancer sites, respectively (Supplementary Data 8 and 9). Of the SNPs investigated with MACIE, we found 80 variants with a regulatory class prediction greater than 95%. That is, these variants are highly likely to affect the expression of certain genes, most often nearby genes. We found four variants with a conserved class prediction greater than 95%, and three of these variants also possess a regulatory prediction greater than 95%. That is, the four variants are highly likely to belong to the class of evolutionarily conserved variants that are found in many living beings. The full predictions for each SNP can be found in Supplementary Data 10.

To nominate the candidate causal variants from each locus for further functional analysis, we implemented fine-mapping of MTAG-identified loci using FINEMAP⁵⁸ and surveyed credible sets of plausible causal variants based on posterior inclusion probability (PIP). We then applied Conditional and Joint Analysis (COJO) using GCTA⁵⁹ to refine independent associations with prioritized risk loci. Based on the single-SNP PIP within each locus, we identified 32 variants falling into the 95% credible set across eight MTAG-identified GWAS loci (Supplementary Data 11). We found that each of the eight MTAG-identified PSC risk loci explained at least two independent association signals: the 2p16.1 locus harboring PUS10 with five independent variants, 3p21.31 (QRICH1) and 4q24 (MANBA) with five variants each, 6p21.2 (KCNK17) with two variants, 6q23.3 (TNFAIP3) and 7q32.1 (IRF5) with five variants each, 10q24.2 (NKX2-3) with three variants, and 16q22.1 (TANGO6) with two variants. There was no additional genome-wide significant association from the GCTA-COJO analysis at the genome-wide significance level of 5 × 10⁻⁸.
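A 95% credible set of the kind described above is commonly built by ranking variants on PIP and adding them until the cumulative probability reaches the coverage level. The following sketch shows that generic construction under a single-signal assumption; it is not FINEMAP's own procedure, and the function name `credible_set` and the toy PIPs are hypothetical.

```python
def credible_set(pips, coverage=0.95):
    """Smallest set of variants whose PIPs, in descending order, reach `coverage`.

    Assumes the PIPs describe one association signal and sum to roughly 1.
    """
    ranked = sorted(pips.items(), key=lambda item: item[1], reverse=True)
    chosen, total = [], 0.0
    for variant, pip in ranked:
        chosen.append(variant)
        total += pip
        if total >= coverage:
            break
    return chosen, total


# Hypothetical PIPs for illustration only (not values from this study).
toy_pips = {"snpA": 0.60, "snpB": 0.25, "snpC": 0.11, "snpD": 0.03, "snpE": 0.01}
members, covered = credible_set(toy_pips)
print(members, round(covered, 2))
```

A smaller credible set indicates that the association signal is concentrated on fewer candidate variants.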

eQTL-based colocalization prioritizes PSC susceptibility genes from the MTAG-identified new loci

We carried out eQTL-based colocalization analysis to identify allele-specific effects on gene expression and to examine colocalization of association signals from new MTAG-identified PSC risk-associated findings using eQTL summary statistics of 49 tissue types from GTEx v8. Among seven MTAG-identified new risk loci (2p16.1, 4q24, 6p21.2, 6q23.3, 7q32.1, 10q24.2, 16q22.1), colocalization nominated three candidate genes, MANBA at 4q24, IRF5 at 7q32.1, and NKX2-3 at 10q24.2, contributing to PSC risk (Supplementary Data 12). Notably, a newly MTAG-identified locus, IRF5, displayed the highest posterior probability scores, indicating that PSC and gene expression in each of 30 tissues are associated and share a single functional variant (PP4 > 0.80), using the coloc⁶⁰ package (Fig. 4, Supplementary Fig. 4, Supplementary Data 12).
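The PP4 statistic used above is the posterior probability that two traits share a single causal variant. The sketch below shows how the five colocalization hypotheses (H0 to H4) can be combined from per-SNP log approximate Bayes factors. It follows the standard single-causal-variant formulation; the priors p1 = p2 = 1e-4 and p12 = 1e-5 are commonly used defaults and are an assumption here, since the text does not state which priors were used. The function names and toy inputs are hypothetical.

```python
import math


def logsum(xs):
    """log(sum(exp(x))) computed stably."""
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))


def logdiff(x, y):
    """log(exp(x) - exp(y)) for x >= y; returns -inf when the two are equal."""
    if y >= x:
        return float("-inf")
    return x + math.log(-math.expm1(y - x))


def coloc_posteriors(labf1, labf2, p1=1e-4, p2=1e-4, p12=1e-5):
    """Posterior probabilities of H0..H4 from per-SNP log approximate Bayes factors."""
    if len(labf1) != len(labf2):
        raise ValueError("both traits need log ABFs for the same SNPs")
    both = [a + b for a, b in zip(labf1, labf2)]
    h3 = math.log(p1) + math.log(p2) + logdiff(
        logsum(labf1) + logsum(labf2), logsum(both)
    )
    log_h = [
        0.0,                            # H0: no association with either trait
        math.log(p1) + logsum(labf1),   # H1: association with trait 1 only
        math.log(p2) + logsum(labf2),   # H2: association with trait 2 only
        h3,                             # H3: two distinct causal variants
        math.log(p12) + logsum(both),   # H4: one shared causal variant
    ]
    denom = logsum(log_h)
    return [math.exp(value - denom) for value in log_h]


# Toy log ABFs (not study data): signals peak at the same SNP, then at different SNPs.
shared = coloc_posteriors([0.0, 10.0, 0.0], [0.0, 10.0, 0.0])
distinct = coloc_posteriors([10.0, 0.0, 0.0], [0.0, 0.0, 10.0])
print("PP4 when peaks coincide:", round(shared[4], 3))
print("PP3 vs PP4 when peaks differ:", round(distinct[3], 3), round(distinct[4], 3))
```

In the toy example, coinciding peaks push most of the posterior mass onto H4, whereas peaks at different SNPs favor H3 over H4.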

**Fig. 4: Functional validation of the MTAG-identified PSC-specific candidate genes.**

We selected 406 prioritized genes to detect relevant groups of related genes involved in the regulation of specific biological pathways. Using STRING Protein–Protein Interaction (PPI) networks⁶¹, these candidate genes are highly enriched for protein–protein interactions (P < 1.00 × 10⁻¹⁶), with enrichment at false discovery rate (FDR) < 0.05 of the following pathways: immune receptor activity (FDR = 3.84 × 10⁻²), beta-2-microglobulin binding (1.10 × 10⁻²), cytokine-mediated signaling pathway (1.58 × 10⁻¹³), interferon-gamma-mediated signaling pathway (1.13 × 10⁻¹¹), T-cell receptor signaling pathway (2.21 × 10⁻¹¹), immune response-activating cell surface receptor signaling pathway (2.65 × 10⁻⁹), interleukin-7-mediated signaling pathway (9.21 × 10⁻⁹), TNFR2 noncanonical NF-kB pathway (7.90 × 10⁻³), Th17 cell differentiation (2.63 × 10⁻⁶), and Th1 and Th2 cell differentiation (1.94 × 10⁻⁵) (Supplementary Data 13, Supplementary Fig. 5). For comparison, we implemented enrichment analysis using the Database for Annotation, Visualization, and Integrated Discovery (DAVID) Bioinformatics Resources^62,63 on the same 406 candidate genes. We observed enrichment of the T-cell receptor signaling pathway (FDR = 5.82 × 10⁻⁷), antigen processing and presentation (8.18 × 10⁻¹⁵), immunoglobulin production involved in immunoglobulin mediated immune response (6.30 × 10⁻¹⁴), cytokine signaling in immune system (3.48 × 10⁻⁵), interferon signaling (2.62 × 10⁻⁹), and interferon alpha/beta signaling (6.60 × 10⁻⁴) (Supplementary Data 14).

In addition, we scrutinized the PPI network associated with each gene prioritized from newly MTAG-identified loci and found three genes (MANBA, IRF5, and NKX2-3) to be highly enriched for PPI at FDR < 0.05. MANBA, IRF5, and NKX2-3 showed PPI enrichment P-values of 5.16 × 10⁻¹⁴, 1.00 × 10⁻¹⁶, and 1.13 × 10⁻⁹, respectively. We observed B and T-cell receptors, chemokine, C-type lectin receptor, cytosolic DNA-sensing, HIF-1, IL-17, JAK-STAT, MAPK, metabolic, NF-kappa B, NOD-like receptor, PD-L1 expression and PD-1 checkpoint in cancer, RIG-I-like receptor, Th1 and Th2 cell differentiation, Th17 cell differentiation, thyroid hormone, TNF, and toll-like receptor signaling pathways in the KEGG pathways at FDR < 0.05 using STRING PPI networks (Supplementary Data 15, Supplementary Fig. 6).

Network-based proximity predicts drug-PSC associations for drug repurposing

Although there is no medication proven to treat PSC, ursodeoxycholic acid (UDCA) is a recommended treatment that increases bile flow and prevents damage to liver cells. While UDCA is used to treat PBC and radiolucent gallstones with a functioning gall bladder, it does not appear to improve survival or reduce the need for liver transplant in PSC patients. From in silico network-based proximity analysis⁶⁴, we estimated the shortest distance (d) between drug targets and PSC candidate genes (Supplementary Data 16, “Methods”) and the relative proximity measure (z), which captures the statistical significance of the distance between drug and disease proteins and is derived from a permutation test (Table 3, Supplementary Data 17, Supplementary Information). The more negative the relative proximity between drug and disease, the closer the genetic relationship between them⁶⁴. We identified many agents at the relative proximity threshold of −0.15, implying potential therapeutic effects on PSC. The top-ranked drugs suggestive for PSC included denileukin diftitox, an interleukin-2-alpha binder used for cutaneous T-cell lymphoma (z = −5.443); vitamin E (z = −1.918); and MLN0415, a small molecule IKK2 inhibitor downregulating the expression of a number of inflammatory proteins (z = −1.648). UDCA showed a relative proximity of 0.170 to PSC, indicating that it may not be a genetically promising candidate drug for PSC. The FUMA platform facilitates gene mapping to the DrugBank database via GENE2FUNC, as reported in Supplementary Data 18. While network-based proximity predicts drug association based on the distance between drug targets and candidate genes, FUMA provides the gene table mapped to the drug database based on the genes prioritized by different mapping methods such as position, eQTL, and chromatin interaction.
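The distance d and relative proximity z described above can be illustrated on a toy interaction network. The sketch below is a simplified reading of the approach, not the implementation used in this study: d is taken as the average, over drug targets, of the shortest-path distance to the nearest disease gene, and z compares d with a null distribution from randomly drawn node sets. Uniform random sampling is used here for brevity, whereas the cited method matches random sets on node degree. The graph, node names, and function names are all hypothetical.

```python
import random
from collections import deque
from statistics import mean, stdev


def shortest_paths(graph, source):
    """Breadth-first search distances from `source` in an unweighted graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def closest_distance(graph, targets, genes):
    """Average, over drug targets, of the distance to the nearest disease gene.

    Assumes a connected graph so that every distance is defined.
    """
    per_target = []
    for target in targets:
        dist = shortest_paths(graph, target)
        per_target.append(min(dist[gene] for gene in genes))
    return mean(per_target)


def relative_proximity(graph, targets, genes, n_perm=500, seed=0):
    """Return (d, z) with z = (d - mean(d_random)) / sd(d_random)."""
    rng = random.Random(seed)
    nodes = sorted(graph)
    d = closest_distance(graph, targets, genes)
    null = [
        closest_distance(
            graph, rng.sample(nodes, len(targets)), rng.sample(nodes, len(genes))
        )
        for _ in range(n_perm)
    ]
    sd = stdev(null)
    z = (d - mean(null)) / sd if sd > 0 else float("nan")
    return d, z


# Toy network (hypothetical): T* are drug targets, G* are disease genes.
edges = [("T1", "A"), ("A", "G1"), ("G1", "G2"), ("T2", "G2"),
         ("G2", "B"), ("B", "C"), ("C", "D"), ("D", "E")]
graph = {}
for u, v in edges:
    graph.setdefault(u, set()).add(v)
    graph.setdefault(v, set()).add(u)

d, z = relative_proximity(graph, ["T1", "T2"], ["G1", "G2"])
print(f"d = {d}, z = {z:.2f}")
```

Under this definition, a drug whose targets sit closer to the disease genes than randomly chosen node sets receives a negative z, which is the direction the text interprets as a closer drug-disease relationship.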

Table 3 Network-based in silico drug repurposing on PSC

Discussion

We leveraged publicly available GWAS summary statistics to investigate the shared genetic architecture of PSC with a variety of clinical and epidemiological traits and to identify additional PSC-risk loci. We first scrutinized the patterns of genomic overlap between PSC and numerous phenotypes using LDSR. Cross-trait LDSR estimated the genetic correlation between traits to gain insights into common etiologies^15,16. We identified significant phenotypic associations between different polygenic traits and PSC. The findings of this study enabled us to confirm previously well-established comorbid conditions and to identify polygenic traits for further study. Complementary approaches such as MTAG, which is a joint association analysis of genetically correlated traits, helped us to discover new susceptibility variants influencing PSC. In addition, LDSR-identified polygenic traits indicating a high correlation with PSC can be applied in Mendelian randomization analysis to unveil further causal relationships between PSC and the traits of interest.

We observed a significant positive correlation between the genomic architecture of each autoimmune-related disease and that of PSC using LDSR. Several genetic studies indicate that PSC is driven by both shared and distinct genetic determinants compared to immune-mediated diseases^{7,19,27,65,66}. The genetic susceptibility to PSC notably overlaps with that of immune-mediated disorders such as CD, IBD, lupus, PBC, and UC²⁷, which have well-established associations with PSC⁶⁷. In addition, these immune-mediated disorders showed large proportions of phenotypic variance explained by all common SNPs in this study.

Several epidemiological studies have reported inverse associations between smoking and PSC risk^{7,23,24,68,69}. Our study found a strongly protective genetic correlation between the genomic architecture of smoking status modeled in former smokers versus current smokers and that of PSC, suggesting that the genetic contribution of current smoking is associated with a decreased risk of PSC compared to that of former smoking. Although it failed to meet the Bonferroni-corrected significance level of 3.73 × 10⁻⁴, the smoking cessation trait modeled in former smokers versus current smokers²⁶ showed a consistent association with PSC implying that the genetic contribution of current smoking is associated with a decreased risk of PSC compared to that of former smoking²³. The smoking initiation trait modeled in never-smokers versus ever-smokers²⁶ showed a significant negative association with PSC suggesting that PSC risk among current and former smokers is significantly lower than that among never-smokers²³. Smoking promotes chronic epithelial and tissue injury through chronic airway inflammation^70,71 and the most common causes of chronic inflammation include immune-mediated disorders which could potentially contribute to PSC development. Therefore, the shared association of PSC with smoking behaviors makes disentangling such effects challenging.

Applying an orthogonal genomics-driven method complementing clinical epidemiologic research of PSC, we confirmed a link between PSC risk and elevated BMI and diabetes^{7,72,73,74,75}. However, clinical studies have shown inconsistent associations between cardiovascular disease and PSC^75,76. Pairwise genetic correlation between PSC and cardiovascular risk demonstrated a negative association at the nominal significance level of 0.05. We also identified several suggestive polygenic traits for which the pairwise genetic correlations were nominally significant at P < 0.05. We observed a nominally significant inverse genetic correlation between PSC and several serologic biomarkers including C-reactive protein, glucose, HbA1c, red blood cell distribution width, reticulocyte count, and triglycerides while alkaline phosphatase and sex hormone binding globulin were positively correlated with PSC risk. These findings through LDSR show good concordance with previous clinical and genetic epidemiologic studies^7,75,77.

Implementing MTAG, we discovered seven new susceptibility loci that have not been previously reported in GWAS_PSC and, of these, we replicated three lead associations in other GWAS independent from the discovery phase. Two of the new MTAG PSC loci, MANBA on 4q24 and IRF5 on 7q32.1 were previously shown to be associated with several hematology-related traits and immune-mediated disorders^{20,44,45,46,47,48}. The previously identified phenotypes have also been reported in PSC. In addition, we prioritized candidate genes for PSC susceptibility through MTAG and inferred biological pathways identified through eQTL-colocalization analyses. PPI networks showed that candidate genes were often part of biological pathways involving metabolic processes and immune response.

Recently, the identification of targets for drug repurposing (repositioning) using genome-wide approaches has become popular²⁰. In this study, we implemented network-based in silico drug efficacy screening to predict agents potentially suitable for repurposing to PSC. Generally, UDCA is recommended for the treatment of cholestatic liver diseases including PSC, but it does not show any effect on the progression and survival of PSC patients⁷⁸. Interestingly, the proximity of UDCA shows that it may not be a genetically promising candidate drug for PSC. In clinical trials in the U.S., UDCA did not improve the management of PSC⁷⁹ and its use has been discouraged among U.S. providers⁸⁰, which is consistent with the prediction of our drug screening analysis. The identified candidate drugs are relevant to lymphoma (Denileukin diftitox, Galiximab), various cancers (Keyhole limpet hemocyanin, TG4010, Girentuximab, Amonafide), psoriasis and psoriatic disorders (Tapinarof), vitamin E, IBD (Declopramide), metabolic disorders (Girentuximab), rheumatoid arthritis, liver cancer (Becatecarin), and chronic hepatitis C virus (HCV) (Sofosbuvir, ANA971, Isatoribine). Poch et al. reported a single-cell atlas of the intrahepatic T-cell landscape in PSC⁸¹. The top-ranked drug, Denileukin diftitox, which is involved in the regulation of immune tolerance by controlling regulatory T-cell activity, could be a candidate agent for further study of pharmacological effect.

Integration, harmonization, and optimization of the existing large-scale GWAS datasets have become a popular analytical strategy to identify new genetic associations. However, access to individual-level GWAS datasets remains limited due to data use restrictions. Although LDSR can quantify the shared genetic architecture of traits having undergone GWAS analysis without requiring GWAS individual-level data, it assumes an absence of population stratification in the underlying summary statistics of the tested traits and necessitates the incorporation of GWAS data from populations expected to have homogeneous genetic structure. Furthermore, GWAS summary statistics with small sample sizes or low SNP-heritability are not amenable to LDSR. One caveat of implementing LDSR is that nonsignificant associations could be due to limited statistical power, rather than a lack of shared heritability, as cross-trait LDSR requires larger sample sizes of GWAS summary-level data to achieve equivalent standard error compared to methods that use individual-level data¹⁵. Another limitation of LDSR is that the analysis includes only common genetic variants with MAF >0.01 and therefore fails to capture shared heritability due to underlying rare variants between PSC and multiple polygenic traits.

MTAG²¹ can substantially improve statistical power for detecting susceptibility loci relative to separate GWAS for the traits tested and allows potential sample overlap in numerous trait-specific summary statistics from large-scale cohort GWAS. However, replication or validation analysis is recommended to assess the credibility of each SNP association when MTAG is applied to low-powered GWAS or to GWAS that are considerably heterogeneous in statistical power. Since MTAG uses overlapping SNPs across all GWAS summary statistics, combining summary statistics with a smaller number of SNPs with those with a larger number of SNPs can reduce statistical power.

In conclusion, our findings from LDSR confirm the associations between immune-mediated disorders and PSC, and epidemiological parameters associated with PSC susceptibility. We also identified and replicated the newly MTAG-identified PSC risk loci and, through eQTL-colocalization analysis, helped to prioritize candidate genes for PSC susceptibility. This study emphasizes the strong evidence that exists for the shared genetic underpinning among immune-mediated diseases. While PSC GWAS have identified a few risk-associated variants, the function and identity of the causal variants are not fully explored. To address the impact of PSC risk-associated variants in the immune system and within less-well-established noncoding regions, we highlighted several in silico functional approaches to map and prioritize the variants identified. Furthermore, we exploited an immune-related gene database for deciphering how PSC risk-associated variants may alter immune networks. We also utilized the integrative functional annotation platform to functionally characterize the prioritized genes, including both coding and noncoding genes; the platform provides extensive information on variant and indel functional annotations. Since there is no medication proven to treat PSC, we predicted many potential agents using the relative proximity, which captures the statistically significant relationship between a potential drug and putative disease-associated proteins. We further carried out gene mapping to the drug database with the broad range of genes prioritized by position, eQTL, and chromatin interaction mapping. These analytical pipelines, which utilize activity maps of noncoding regions, help us pinpoint their role in specific cell types. These findings can provide better functional insight into the genetic etiology of PSC susceptibility and improve our understanding of how PSC risk-associated variants alter the immune system.
Finally, future studies using causal inference approaches such as Mendelian randomization or genetic instrumental variable methods may help to elucidate the causal relationship between the risk of PSC and other potential candidate phenotypes to reveal surrogate biomarkers that may improve the predictive power of polygenic risk scores.

Methods

Ethics statement

All participants for each GWAS were recruited following protocols approved by the local Ethics Committee/Institutional Review Boards. Written informed consent was obtained from each participant included in the study. All methods were performed in accordance with the ethical guidelines of the 1975 Declaration of Helsinki.

GWAS summary statistics and imputation

We obtained the GWAS summary statistics for PSC² and 134 clinical and epidemiological traits from existing data resources^12,13. More details are shown in Supplementary Data 1 and Supplementary Information. We restricted the study populations to individuals of European ancestry to align with the homogeneous ancestry background of participants in GWAS of the traits tested in our downstream analyses. To ensure adequate statistical power in this study, GWAS summary statistics were imputed using the SSimp software⁸² (v.0.5.6; https://github.com/zkutalik/ssimp_software) when the number of SNPs available for a trait was considerably smaller than that for other traits, which made the trait less informative. Detailed methods are provided in Supplementary Information.

Analyses of multitrait GWAS

We estimated SNP-heritability (h²) on the observed scale and pairwise genetic correlation (r_g) between multiple polygenic traits using LDSR^{8,9,10,11,15,16} (v1.0.1; https://github.com/bulik/ldsc). We conservatively set the test-wise significance level using Bonferroni correction to be 0.05/134, adjusting for the analysis of 134 polygenic traits in total (Supplementary Information).
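As a small worked illustration of the Bonferroni correction described above, the following Python sketch computes the per-test significance level for 134 traits. The function and variable names are hypothetical and are not part of the LDSR software.

```python
# Minimal sketch of the Bonferroni-corrected significance level.
# ALPHA and N_TRAITS come from the text; all names are illustrative.
ALPHA = 0.05
N_TRAITS = 134  # number of polygenic traits tested


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Return the per-test significance level alpha / n_tests."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


def is_significant(p_value: float, alpha: float = ALPHA, n_tests: int = N_TRAITS) -> bool:
    """True when a p-value passes the Bonferroni-corrected level."""
    return p_value < bonferroni_threshold(alpha, n_tests)


threshold = bonferroni_threshold(ALPHA, N_TRAITS)
print(f"{threshold:.2e}")  # 3.73e-04
```

Under this correction, a genetic correlation is declared significant only when its p-value falls below roughly 3.7 × 10⁻⁴.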

The conventional GWAS approach is a univariate association test for a single trait/phenotype. This does not permit leveraging of genetic information from other polygenic traits. Integrating associations from other traits highly correlated with PSC can improve the statistical power to identify new polygenic variants^21,83,84,85. We conducted MTAG (v1.0.8; https://github.com/JonJala/mtag) combining PSC with immune-mediated disorders selected by h² > 0.20 and |r_g| > 0.20. MTAG was modeled for PSC together with five polygenic autoimmune-related traits: CD, UC, IBD, lupus, and PBC (MTAG_PSC). Additionally, we performed a sensitivity analysis excluding IBD (⊥IBD) from the MTAG analysis (MTAG_PSC⊥IBD), since IBD is an umbrella term for conditions that include both CD and UC⁸⁶. The sensitivity analysis included only five autoimmune-related diseases: PSC, CD, UC, lupus, and PBC.
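The trait-selection rule above (heritability above 0.20 and absolute genetic correlation with PSC above 0.20) can be sketched as a simple filter. The record type, function name, trait labels, and numeric values below are hypothetical placeholders for illustration only; they are not estimates from this study.

```python
from typing import List, NamedTuple


class LdscResult(NamedTuple):
    trait: str
    h2: float  # SNP-heritability estimate
    rg: float  # genetic correlation with PSC


def select_traits(results: List[LdscResult], h2_min: float = 0.20, rg_min: float = 0.20) -> List[str]:
    """Keep traits with h2 > h2_min and |r_g| > rg_min."""
    return [r.trait for r in results if r.h2 > h2_min and abs(r.rg) > rg_min]


# Placeholder inputs (made up for the example, not study results).
example = [
    LdscResult("trait_A", h2=0.35, rg=0.60),   # passes both filters
    LdscResult("trait_B", h2=0.10, rg=0.70),   # fails the h2 filter
    LdscResult("trait_C", h2=0.25, rg=-0.30),  # passes: the sign of r_g is ignored
    LdscResult("trait_D", h2=0.40, rg=0.05),   # fails the |r_g| filter
]
selected = select_traits(example)
print(selected)  # ['trait_A', 'trait_C']
```

Using the absolute value of r_g keeps negatively correlated traits, which can still contribute information in a multitrait analysis.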

To replicate the new MTAG-identified PSC risk-associated loci, we implemented MTAG (MTAG_PSC_R) using PSC (FinnGen phenocode: K11_CHOLANGI), CD (K11_CD_NOUC), UC (K11_UC_NOCD), IBD (K11_IBD), and lupus (M13_SLE) from the FinnGen repository¹⁴, and PBC⁸⁷ from the GWAS Catalog, which are independent of those in the discovery phase. Details are reported in Supplementary Data 2.

Characterization of genomic risk loci using FUMA

We mapped the genomic regions of association around the most significant variants using the FUMA GWAS⁴⁹ (v1.4.1; https://fuma.ctglab.nl/) platform, which computes LD structure, annotates functions to SNPs, and prioritizes candidate genes from MTAG-derived summary statistics. To define genomic risk loci for MTAG-identified PSC susceptibility, we used linkage disequilibrium structure based on the European ancestry panel of the 1000 Genomes Project phase 3. Genomic risk loci and the subsets of significant SNPs within each locus were identified using the SNP2GENE function applying the default thresholds: (1) independent significant SNPs, defined as SNPs with P < 5 × 10⁻⁸ that are independent from each other at r² < 0.6; (2) lead SNPs, defined as independent significant SNPs that are independent from each other at r² < 0.1; (3) genomic risk loci, defined by merging lead SNPs within physically overlapping LD blocks and including all SNPs in linkage disequilibrium of r² ≥ 0.6 with one of the independent significant SNPs. Prioritized susceptibility variants from MTAG GWAS were mapped by positional, eQTL, and chromatin interaction mappings using the FUMA SNP2GENE function with default settings. Finally, FUMA maps the prioritized genes given by the SNP2GENE function to the drug database (DrugBank⁸⁸) via the GENE2FUNC function in the FUMA platform. The gene table mapped to the DrugBank database provides gene information and the relevant DrugBank IDs, which can be looked up in detail at https://go.drugbank.com/drugs.
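The two-stage selection of independent significant SNPs and lead SNPs can be sketched as a greedy LD-based filter, treating two SNPs as independent when their pairwise r² falls below the threshold. This is a simplified illustration under stated assumptions, not FUMA's implementation: the function name, SNP labels, p-values, and r² values are hypothetical, and FUMA itself derives LD from a reference panel.

```python
def select_independent(snps, p_values, r2, p_max=5e-8, r2_max=0.6):
    """Greedy selection: visit SNPs by ascending p-value and keep a SNP
    only if its r2 with every already-kept SNP is below r2_max."""
    candidates = sorted(
        (s for s in snps if p_values[s] < p_max),
        key=lambda s: p_values[s],
    )
    kept = []
    for snp in candidates:
        # Pairs missing from the r2 table are treated as unlinked (r2 = 0).
        if all(r2.get(frozenset((snp, k)), 0.0) < r2_max for k in kept):
            kept.append(snp)
    return kept


# Toy inputs (made up for the example).
p_values = {"snpA": 1e-12, "snpB": 1e-10, "snpC": 1e-9, "snpD": 1e-3}
r2 = {
    frozenset(("snpA", "snpB")): 0.8,
    frozenset(("snpA", "snpC")): 0.3,
    frozenset(("snpB", "snpC")): 0.2,
}

# Stage 1: independent significant SNPs (r2 threshold 0.6).
independent = select_independent(p_values, p_values, r2, r2_max=0.6)
# Stage 2: lead SNPs chosen among them (r2 threshold 0.1).
lead = select_independent(independent, p_values, r2, r2_max=0.1)
print(independent, lead)  # ['snpA', 'snpC'] ['snpA']
```

In this toy example snpB is dropped because it is in high LD with the stronger snpA, and snpC remains an independent significant SNP but is not a lead SNP because its r² with snpA is at least 0.1.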

Functional annotation within immune-related genes using InnateDB Innate Immunity Genes

We examined 406 prioritized genes to nominate innate immune genes associated with PSC using 7476 genes involved in innate immune responses from the InnateDB⁵² portal. InnateDB provides a manually curated list of genes and signaling responses involved in human innate immunity from publicly available databases including the Immunology Database and Analysis Portal (ImmPort) system, Immunogenetic Related Information Source (IRIS), MAPK/NFKB Network, and Immunome Database. Details are available at https://www.innatedb.com/redirect.do?go=resourcesGeneLists.

Integrative multi-omic annotation analysis

We annotated the 406 prioritized genes using the FAVOR platform^53,54,55 (v2.0; https://favor.genohub.org/), an open-access variant functional annotation portal for whole-genome and whole-exome sequencing (WGS/WES) data. FAVOR provides functional annotation information for 8,812,917,339 SNVs across the human genome and 79,997,898 indels from the Trans-Omics for Precision Medicine (TOPMed) BRAVO variant set (Build GRCh38), based on a collection of annotations such as variant category, evidence of chromatin, protein function, conservation, and ClinVar information. The details have been described elsewhere⁵⁵.

Annotation-informed function prediction

We utilized the multidimensional annotation class integrative estimator^56,57 (MACIE, https://github.com/ryanrsun/lungCancerMACIE/tree/master/MACIE_pipeline) to analyze functional annotation data and understand the possible mechanistic roles of individual SNPs. For each variant, MACIE utilizes a generalized linear mixed model that specifies annotation values as outcomes and unobserved latent functional classes as predictors. The posterior probabilities of these unobserved classes are then calculated for each SNP to estimate the probabilities of possessing certain functions. The calculation proceeds through an expectation-maximization (EM) algorithm until convergence. The final posterior expected value of a class is taken as the MACIE prediction. Specifically, we applied MACIE with two latent classes, (1) regulatory class informed by 28 annotations such as H3K27Ac levels and (2) conserved class informed by eight phylogenetic conservational algorithms. Predictions were only made for noncoding variants.

Fine-mapping and gene-based enrichment analyses

We used FINEMAP⁵⁸ (v1.4.1; http://www.christianbenner.com) to survey credible sets of plausible causal variants based on the posterior inclusion probability (PIP). We ran FINEMAP with the options “--sss” to specify fine-mapping with shotgun stochastic search and “--n-causal-snps 5” to set the maximum number of causal variants allowed within a locus to 5. We performed conditional and joint analysis using GCTA⁵⁹ (v1.9.4; https://cnsgenomics.com/software/gcta/) with the option “--cojo-cond” to select independent association signals within the prioritized risk loci.
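To illustrate how a credible set follows from per-variant posterior probabilities, the sketch below ranks variants and accumulates probability until a target coverage is reached. It assumes a single causal signal whose posterior probabilities sum to one and a 95% coverage level; both are assumptions made for the example. FINEMAP with several causal variants reports credible sets per signal, which this sketch does not reproduce, and the SNP labels and probabilities are made up.

```python
def credible_set(posteriors, coverage=0.95):
    """Return the smallest set of top-ranked variants whose cumulative
    posterior probability reaches the requested coverage."""
    ranked = sorted(posteriors.items(), key=lambda kv: kv[1], reverse=True)
    chosen, cumulative = [], 0.0
    for snp, prob in ranked:
        chosen.append(snp)
        cumulative += prob
        if cumulative >= coverage:
            break
    return chosen


# Toy posterior probabilities for one signal (illustrative values only).
toy_posteriors = {"snpA": 0.60, "snpB": 0.30, "snpC": 0.06, "snpD": 0.04}
cs95 = credible_set(toy_posteriors)
print(cs95)  # ['snpA', 'snpB', 'snpC']
```

A smaller credible set indicates that the association signal is resolved to fewer candidate variants.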

The Genotype-Tissue Expression (GTEx_v8)⁸⁹ database consists of data from 49 normal tissues from 838 donors (Supplementary Data 5, Supplementary Information). Colocalization between the seven MTAG_PSC associations within the newly identified loci and eQTL signals was calculated using the coloc package (v5.1.0; https://cran.r-project.org/web/packages/coloc/)⁶⁰. We focused on colocalizations for which coloc suggested a plausible posterior probability that both PSC and gene expression in a GTEx_v8 tissue are associated and share a single functional variant (PP4 > 0.80).
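For intuition about the PP4 criterion, the following sketch combines per-SNP log approximate Bayes factors from two traits into posterior probabilities for the five colocalization hypotheses (H0: no association; H1 and H2: association with one trait only; H3: two distinct causal variants; H4: one shared causal variant). It is a simplified Python illustration of the single-causal-variant calculation, not the coloc R package itself. The function names, the prior values (p1 = p2 = 1e-4, p12 = 1e-5), and the toy Bayes factors are assumptions for the example and are not necessarily the settings used in this study.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v)))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def logdiffexp(a, b):
    """Numerically stable log(exp(a) - exp(b)); requires a > b."""
    return a + math.log1p(-math.exp(b - a))


def coloc_posteriors(labf1, labf2, p1=1e-4, p2=1e-4, p12=1e-5):
    """Posterior probabilities [PP0, PP1, PP2, PP3, PP4] from per-SNP log
    Bayes factors of two traits over the same SNPs (at least two SNPs)."""
    l1 = logsumexp(labf1)
    l2 = logsumexp(labf2)
    l12 = logsumexp([a + b for a, b in zip(labf1, labf2)])
    log_h = [
        0.0,                                                      # H0
        math.log(p1) + l1,                                        # H1
        math.log(p2) + l2,                                        # H2
        math.log(p1) + math.log(p2) + logdiffexp(l1 + l2, l12),   # H3
        math.log(p12) + l12,                                      # H4
    ]
    norm = logsumexp(log_h)
    return [math.exp(h - norm) for h in log_h]


# Toy log Bayes factors over three SNPs (illustrative values only).
shared = coloc_posteriors([10.0, 0.0, 0.0], [10.0, 0.0, 0.0])    # same top SNP
distinct = coloc_posteriors([10.0, 0.0, 0.0], [0.0, 10.0, 0.0])  # different top SNPs
print(shared[4] > 0.80, distinct[4] > 0.80)  # True False
```

When both traits peak at the same SNP, most posterior mass falls on H4 and the locus passes the PP4 > 0.80 filter; when the peaks differ, the mass shifts toward the other hypotheses.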

We utilized the STRING Database⁶¹ (v11.5; https://string-db.org/cgi/input?sessionId=bmwWOuutn8ZR) to explore the functional enrichment of protein–protein interaction (PPI) networks and to scrutinize the enrichment of various pathways among the prioritized genes (proteins). In addition, we surveyed the DAVID Bioinformatics Resources^62,90 (v6.8; https://david.ncifcrf.gov/) to look for enrichment of various functional annotations on the 416 prioritized genes after excluding 9 overlapped genes from 19 newly MTAG-identified and previously reported PSC risk-associated genes and 406 genes mapped from position mapping, eQTL mapping, and chromatin interaction mapping provided from FUMA.

Network-based proximity between drugs and disease-identified proteins for drug repurposing

Drug–disease proximity measures, distance (d), and the corresponding relative proximity (z), quantifying the network-based relationship between drugs and proteins encoded by genes associated with the disease while correcting for the known biases of the interactome⁶⁴, were estimated (Supplementary Information). To use proximity as an unbiased measure of drug–disease relatedness, we defined a drug as proximal to a disease when its relative proximity based on the closest distance satisfied z ≤ −0.15, and as not proximal otherwise⁶⁴. We downloaded detailed drug data with comprehensive drug target information from the DrugBank database (v5.1.9, released 2022-01-04)⁸⁸.
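The closest-distance measure and its z-score can be sketched on a toy network as follows. The closest distance d is the mean, over drug targets, of the shortest-path length to the nearest disease protein, and z compares d with a reference distribution from randomly drawn node sets. This is a minimal sketch under simplifying assumptions: the published method draws degree-matched random sets, whereas uniform sampling is used here for brevity, and the toy graph, node labels, function names, and number of randomizations are hypothetical.

```python
import random
import statistics
from collections import deque


def bfs_distances(graph, source):
    """Unweighted shortest-path lengths from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def closest_distance(graph, targets, disease):
    """Mean over drug targets of the distance to the nearest disease protein.
    Assumes every target can reach at least one disease protein."""
    total = 0
    for target in targets:
        dist = bfs_distances(graph, target)
        total += min(dist[p] for p in disease if p in dist)
    return total / len(targets)


def relative_proximity(graph, targets, disease, n_random=1000, seed=0):
    """z = (d - mean) / sd against uniformly sampled node sets of equal size
    (a simplification of degree-preserving randomization)."""
    rng = random.Random(seed)
    nodes = sorted(graph)
    observed = closest_distance(graph, targets, disease)
    null = [
        closest_distance(
            graph,
            rng.sample(nodes, len(targets)),
            rng.sample(nodes, len(disease)),
        )
        for _ in range(n_random)
    ]
    return (observed - statistics.mean(null)) / statistics.stdev(null)


# Toy interactome: a path graph 0-1-2-...-7 (illustrative only).
toy_graph = {i: [j for j in (i - 1, i + 1) if 0 <= j < 8] for i in range(8)}
d = closest_distance(toy_graph, targets=[1, 2], disease=[0])
z = relative_proximity(toy_graph, targets=[1, 2], disease=[0])
is_proximal = z <= -0.15  # threshold from the text
print(d, is_proximal)
```

Here the two targets sit one and two steps from the disease node, so d = 1.5, which is shorter than expected for random node sets on this graph and therefore gives a negative z.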

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Data availability

The summary statistics of PSC from MTAG are publicly available at https://github.com/biomedicaldatascience/PSC_MTAG. The GWAS summary-level data analyzed in this study are available in the NHGRI-EBI GWAS Catalog [https://www.ebi.ac.uk/gwas/] and the MRC IEU OpenGWAS database [https://gwas.mrcieu.ac.uk/] for previously published GWAS summary statistics, Neale’s lab repository for UK Biobank GWAS summary statistics [https://github.com/Nealelab/UK_Biobank_GWAS], and FinnGen repository for Finnish Biobank GWAS summary statistics r6 [https://finngen.gitbook.io/documentation/v/r6/data-download]. The accessible links and reference information for the GWAS summary-level data (mapped to Genome Assembly GRCh37) used in this study can be found in Supplementary Data 1 and 2. Non-commercial DrugBank datasets (v5.1.9) are available and access can be obtained by the academic license [https://go.drugbank.com/releases/latest]. The data including all variant-gene cis-eQTL associations tested in each tissue (GTEx v8) are available in a requester pays bucket on Google Cloud Platform (GCP) [https://gtexportal.org/home/datasets; https://console.cloud.google.com/storage/browser/gtex-resources]. The immune-related genes can be obtained in the InnateDB portal [https://www.innatedb.com/redirect.do?go=resourcesGeneLists].

References

Karlsen, T. H., Schrumpf, E. & Boberg, K. M. Primary sclerosing cholangitis. Best. Pr. Res. Clin. Gastroenterol. 24, 655–666 (2010).
Ji, S. G. et al. Genome-wide association study of primary sclerosing cholangitis identifies new risk loci and quantifies the genetic relationship with inflammatory bowel disease. Nat. Genet. 49, 269–273 (2017).
Chung, B. K. & Hirschfield, G. M. Immunogenetics in primary sclerosing cholangitis. Curr. Opin. Gastroenterol. 33, 93–98 (2017).
Blechacz, B. Cholangiocarcinoma: current knowledge and new developments. Gut Liver 11, 13–26 (2017).
Melum, E. et al. Genome-wide association analysis in primary sclerosing cholangitis identifies two non-HLA susceptibility loci. Nat. Genet. 43, 17–19 (2011).
Liu, J. Z. et al. Dense genotyping of immune-related disease regions identifies nine new risk loci for primary sclerosing cholangitis. Nat. Genet. 45, 670–675 (2013).
Andersen, I. M. et al. Effects of coffee consumption, smoking, and hormones on risk for primary sclerosing cholangitis. Clin. Gastroenterol. Hepatol. 12, 1019–1028 (2014).
Byun, J. et al. The shared genetic architectures between lung cancer and multiple polygenic phenotypes in genome-wide association studies. Cancer Epidemiol. Biomark. Prev. 30, 1156–1164 (2021).
Pettit, R. W. et al. The shared genetic architecture between epidemiological and behavioral traits with lung cancer. Sci. Rep. 11, 17559 (2021).
Ostrom, Q. T. et al. Partitioned glioma heritability shows subtype-specific enrichment in immune cells. Neuro Oncol. 23, 1304–1314 (2021).
Byun, J. et al. Shared genomic architecture between COVID-19 severity and numerous clinical and physiologic parameters revealed by LD score regression analysis. Sci. Rep. 12, 1891 (2022).
Elsworth, B. et al. The MRC IEU OpenGWAS data infrastructure. Preprint at bioRxiv. https://doi.org/10.1101/2020.08.10.244293 (2020).
Buniello, A. et al. The NHGRI-EBI GWAS Catalog of published genome-wide association studies, targeted arrays and summary statistics 2019. Nucleic Acids Res. 47, D1005–D1012 (2019).
FinnGen. Documentation of R6 release, vol. 2022 (2022).
Bulik-Sullivan, B. et al. An atlas of genetic correlations across human diseases and traits. Nat. Genet. 47, 1236–1241 (2015).
Bulik-Sullivan, B. K. et al. LD score regression distinguishes confounding from polygenicity in genome-wide association studies. Nat. Genet. 47, 291–295 (2015).
van Rheenen, W., Peyrot, W. J., Schork, A. J., Lee, S. H. & Wray, N. R. Genetic correlations of polygenic disease traits: from theory to practice. Nat. Rev. Genet. 20, 567–581 (2019).
de Lange, K. M. et al. Genome-wide association study implicates immune activation of multiple integrin genes in inflammatory bowel disease. Nat. Genet. 49, 256–261 (2017).
Bentham, J. et al. Genetic association analyses implicate aberrant regulation of innate and adaptive immunity genes in the pathogenesis of systemic lupus erythematosus. Nat. Genet. 47, 1457–1464 (2015).
Cordell, H. J. et al. An international genome-wide meta-analysis of primary biliary cholangitis: novel risk loci and candidate drugs. J. Hepatol. 75, 572–581 (2021).
Turley, P. et al. Multi-trait analysis of genome-wide association summary statistics using MTAG. Nat. Genet. 50, 229–237 (2018).
Karlsson Linner, R. et al. Genome-wide association analyses of risk tolerance and risky behaviors in over 1 million individuals identify hundreds of loci and shared genetic influences. Nat. Genet. 51, 245–257 (2019).
Wijarnpreecha, K. et al. Association between smoking and risk of primary sclerosing cholangitis: a systematic review and meta-analysis. U. Eur. Gastroenterol. J. 6, 500–508 (2018).
Mitchell, S. A. et al. Cigarette smoking, appendectomy, and tonsillectomy as risk factors for the development of primary sclerosing cholangitis: a case control study. Gut 51, 567–573 (2002).
Loh, P. R., Kichaev, G., Gazal, S., Schoech, A. P. & Price, A. L. Mixed-model association for biobank-scale datasets. Nat. Genet. 50, 906–908 (2018).
Liu, M. et al. Association studies of up to 1.2 million individuals yield new insights into the genetic etiology of tobacco and alcohol use. Nat. Genet. 51, 237–244 (2019).
Ellinghaus, D. et al. Analysis of five chronic inflammatory diseases identifies 27 new associations and highlights disease-specific patterns at shared loci. Nat. Genet. 48, 510–518 (2016).
Qiu, F. et al. A genome-wide association study identifies six novel risk loci for primary biliary cholangitis. Nat. Commun. 8, 14828 (2017).
International Multiple Sclerosis Genetics Consortium et al. Genetic risk and a primary role for cell-mediated immune mechanisms in multiple sclerosis. Nature 476, 214–219 (2011).
Cordell, H. J. et al. International genome-wide meta-analysis identifies new primary biliary cirrhosis risk loci and targetable pathogenic pathways. Nat. Commun. 6, 8019 (2015).
Zuo, X. et al. Whole-exome SNP array identifies 15 new susceptibility loci for psoriasis. Nat. Commun. 6, 6793 (2015).
Chen, V. L. et al. Genome-wide association study of serum liver enzymes implicates diverse metabolic and liver pathology. Nat. Commun. 12, 816 (2021).
Emilsson, V. et al. Co-regulatory networks of human serum proteins link genetics to disease. Science 361, 769–773 (2018).
Sinnott-Armstrong, N. et al. Genetics of 35 blood and urine biomarkers in the UK Biobank. Nat. Genet. 53, 185–194 (2021).
Kachuri, L. et al. Genetic determinants of blood-cell traits influence susceptibility to childhood acute lymphoblastic leukemia. Am. J. Hum. Genet. 108, 1823–1835 (2021).
Zhu, Z. et al. Shared genetics of asthma and mental health disorders: a large-scale genome-wide cross-trait analysis. Eur. Respir. J. 54, 1901507 (2019).
Johansson, A., Rask-Andersen, M., Karlsson, T. & Ek, W. E. Genome-wide association analysis of 350 000 Caucasians from the UK Biobank identifies novel loci for asthma, hay fever and eczema. Hum. Mol. Genet. 28, 4022–4041 (2019).
Peyrot, W. J. & Price, A. L. Identifying loci with different allele frequencies among cases of eight psychiatric disorders using CC-GWAS. Nat. Genet. 53, 445–454 (2021).
Morris, D. L. et al. Genome-wide association meta-analysis in Chinese and European individuals identifies ten new loci associated with systemic lupus erythematosus. Nat. Genet. 48, 940–946 (2016).
Baurecht, H. et al. Genome-wide comparative analysis of atopic dermatitis and psoriasis gives insight into opposing genetic mechanisms. Am. J. Hum. Genet. 96, 104–120 (2015).
Patrick, M. T. et al. Causal relationship and shared genetic loci between psoriasis and type 2 diabetes through trans-disease meta-analysis. J. Invest Dermatol. 141, 1493–1502 (2021).
Laufer, V. A. et al. Genetic influences on susceptibility to rheumatoid arthritis in African-Americans. Hum. Mol. Genet. 28, 858–874 (2019).
Chen, M. H. et al. Trans-ethnic and ancestry-specific blood-cell genetics in 746,667 individuals from 5 global populations. Cell 182, 1198–1213.e14 (2020).
Langefeld, C. D. et al. Transancestral mapping and genetic load in systemic lupus erythematosus. Nat. Commun. 8, 16021 (2017).
Yin, X. et al. Meta-analysis of 208370 East Asians identifies 113 susceptibility loci for systemic lupus erythematosus. Ann. Rheum. Dis. 80, 632–640 (2021).
Ishigaki, K. et al. Large-scale genome-wide association study in a Japanese population identifies novel susceptibility loci across different diseases. Nat. Genet. 52, 669–679 (2020).
Ha, E., Bae, S. C. & Kim, K. Large-scale meta-analysis across East Asian and European populations updated genetic architecture and variant-driven biology of rheumatoid arthritis, identifying 11 novel susceptibility loci. Ann. Rheum. Dis. 80, 558–565 (2021).
Lessard, C. J. et al. Variants at multiple loci implicated in both innate and adaptive immune responses are associated with Sjogren’s syndrome. Nat. Genet. 45, 1284–1292 (2013).
Watanabe, K., Taskesen, E., van Bochoven, A. & Posthuma, D. Functional mapping and annotation of genetic associations with FUMA. Nat. Commun. 8, 1826 (2017).
Kircher, M. et al. A general framework for estimating the relative pathogenicity of human genetic variants. Nat. Genet. 46, 310–315 (2014).
Schmiedel, B. J. et al. Impact of genetic polymorphisms on human immune cell gene expression. Cell 175, 1701–1715.e16 (2018).
Breuer, K. et al. InnateDB: systems biology of innate immunity and beyond-recent updates and continuing curation. Nucleic Acids Res. 41, D1228–D1233 (2013).
Li, X. et al. Dynamic incorporation of multiple in silico functional annotations empowers rare variant association analysis of large whole-genome sequencing studies at scale. Nat. Genet. 52, 969–983 (2020).
Li, Z. et al. A framework for detecting noncoding rare variant associations of large-scale whole-genome sequencing studies. Nat. Methods 19, 1599–1611 (2021).
Zhou, H. et al. FAVOR: functional annotation of variants online resource and annotator for variation across the human genome. Nucleic Acids Res. 6, D1300–D1311 (2022).
Sun, R. et al. Integration of multiomic annotation data to prioritize and characterize inflammation and immune-related risk variants in squamous cell lung cancer. Genet. Epidemiol. 45, 99–114 (2021).
Li, X. et al. A multi-dimensional integrative scoring framework for predicting functional variants in the human genome. Am. J. Hum. Genet. 109, 446–456 (2022).
Benner, C. et al. FINEMAP: efficient variable selection using summary data from genome-wide association studies. Bioinformatics 32, 1493–1501 (2016).
Yang, J., Lee, S. H., Goddard, M. E. & Visscher, P. M. GCTA: a tool for genome-wide complex trait analysis. Am. J. Hum. Genet. 88, 76–82 (2011).
Wallace, C. Eliciting priors and relaxing the single causal variant assumption in colocalisation analyses. PLoS Genet. 16, e1008720 (2020).
Szklarczyk, D. et al. STRING v11: protein–protein association networks with increased coverage, supporting functional discovery in genome-wide experimental datasets. Nucleic Acids Res. 47, D607–D613 (2019).
Huang da, W., Sherman, B. T. & Lempicki, R. A. Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources. Nat. Protoc. 4, 44–57 (2009).
Sherman, B. T. et al. DAVID: a web server for functional enrichment analysis and functional annotation of gene lists (2021 update). Nucleic Acids Res. 50, W216–W221 (2022).
Guney, E., Menche, J., Vidal, M. & Barabasi, A. L. Network-based in silico drug efficacy screening. Nat. Commun. 7, 10331 (2016).
Denoth, L. et al. Modulation of the mucosa-associated microbiome linked to the PTPN2 risk gene in patients with primary sclerosing cholangitis and ulcerative colitis. Microorganisms 9, 1752 (2021).
Ellinghaus, D. et al. Genome-wide association analysis in primary sclerosing cholangitis and ulcerative colitis identifies risk loci at GPR35 and TCF4. Hepatology 58, 1074–1083 (2013).
Aranake-Chrisinger, J., Dassopoulos, T., Yan, Y. & Nalbantoglu, I. Primary sclerosing cholangitis associated colitis: characterization of clinical, histologic features, and their associations with liver transplantation. World J. Gastroenterol. 26, 4126–4139 (2020).
Bastida, G. & Beltrán, B. Ulcerative colitis in smokers, non-smokers and ex-smokers. World J. Gastroenterol. 17, 2740–2747 (2011).
Aune, D., Sen, A., Norat, T., Riboli, E. & Folseraas, T. Primary sclerosing cholangitis and the risk of cancer, cardiovascular disease, and all-cause mortality: a systematic review and meta-analysis of cohort studies. Sci. Rep. 11, 10646 (2021).
Lee, J., Taneja, V. & Vassallo, R. Cigarette smoking and inflammation: cellular and molecular mechanisms. J. Dent. Res. 91, 142–149 (2012).
Rodríguez, É. G. & Morán, G. A. G. in Autoimmunity: From Bench to Bedside (eds Anaya J. M. et al.) Ch. 8 (El Rosario University Press, 2013). https://www.ncbi.nlm.nih.gov/books/NBK459469/.
Poonawala, A., Nair, S. P. & Thuluvath, P. J. Prevalence of obesity and diabetes in patients with cryptogenic cirrhosis: a case-control study. Hepatology 32, 689–692 (2000).
Tana, M. M. et al. The significance of autoantibody changes over time in primary biliary cirrhosis. Am. J. Clin. Pathol. 144, 601–606 (2015).
Reyes, J. L. et al. Neutralization of IL-15 abrogates experimental immune-mediated cholangitis in diet-induced obese mice. Sci. Rep. 8, 3127 (2018).
Ludvigsson, J. F., Bergquist, A., Montgomery, S. M. & Bahmanyar, S. Risk of diabetes and cardiovascular disease in patients with primary sclerosing cholangitis. J. Hepatol. 60, 802–808 (2014).
Suraweera, D., Fanous, C., Jimenez, M., Tong, M. J. & Saab, S. Risk of cardiovascular events in patients with primary biliary cholangitis—systematic review. J. Clin. Transl. Hepatol. 6, 119–126 (2018).
de Vries, E. M. et al. Alkaline phosphatase at diagnosis of primary sclerosing cholangitis and 1 year later: evaluation of prognostic value. Liver Int. 36, 1867–1875 (2016).
Iravani, S. et al. An update on treatment options for primary sclerosing cholangitis. Gastroenterol. Hepatol. Bed Bench 13, 115–124 (2020).
Rahimpour, S. et al. A triple blinded, randomized, placebo-controlled clinical trial to evaluate the efficacy and safety of oral vancomycin in primary sclerosing cholangitis: a pilot study. J. Gastrointestin Liver Dis. 25, 457–464 (2016).
Chapman, R. et al. Diagnosis and management of primary sclerosing cholangitis. Hepatology 51, 660–678 (2010).
Poch, T. et al. Single-cell atlas of hepatic T cells reveals expansion of liver-resident naive-like CD4(+) T cells in primary sclerosing cholangitis. J. Hepatol. 75, 414–423 (2021).
Rueger, S., McDaid, A. & Kutalik, Z. Evaluation and application of summary statistic imputation to discover new height-associated loci. PLoS Genet. 14, e1007371 (2018).
Emdin, C. A. et al. Association of genetic variation with cirrhosis: a multi-trait genome-wide association and gene-environment interaction study. Gastroenterology 160, 1620–1633.e13 (2021).
Ong, J. S. et al. Multitrait genetic association analysis identifies 50 new risk loci for gastro-oesophageal reflux, seven new loci for Barrett’s oesophagus and provides insights into clinical heterogeneity in reflux diagnosis. Gut 71, 1053–1061 (2022).
Liu, L. et al. Twelve new genomic loci associated with bone mineral density. Front Endocrinol. (Lausanne) 11, 243 (2020).
Chandan, J. S. & Thomas, T. The impact of inflammatory bowel disease on oral health. Br. Dent. J. 222, 549–553 (2017).
Jiang, L., Zheng, Z., Fang, H. & Yang, J. A generalized linear mixed model association tool for biobank-scale data. Nat. Genet. 53, 1616–1621 (2021).
Wishart, D. S. et al. DrugBank 5.0: a major update to the DrugBank database for 2018. Nucleic Acids Res. 46, D1074–D1082 (2018).
Consortium, G. The Genotype-Tissue Expression (GTEx) project. Nat. Genet. 45, 580–585 (2013).
Dennis, G. Jr. et al. DAVID: Database for Annotation, Visualization, and Integrated Discovery. Genome Biol. 4, P3 (2003).

Acknowledgements

We thank all individuals who have contributed their samples and clinical data for the PSC study, and we also thank the international PSC study group for sharing GWAS summary statistics of PSC. We want to acknowledge the participants and investigators of the FinnGen study. The Genotype-Tissue Expression (GTEx) Project was supported by the Common Fund of the Office of the Director of the National Institutes of Health, and by NCI, NHGRI, NHLBI, NIDA, NIMH, and NINDS. The data used for the analyses described in this manuscript were obtained from the GTEx Portal on 10/25/2021. Our study was supported by NIH/NCI under award P50 CA210964, by the Cholangiocarcinoma Foundation, and by PSC Partners Seeking a Cure to L.R.R. C.I.A. is a Research Scholar of the Cancer Prevention and Research Institute of Texas (CPRIT) under award RR170048. J.Ra. was partially supported by NHLBI under award K25 HL152006 and by Artificial Intelligence/Machine Learning Consortium to Advance Health Equity and Researcher Diversity (AIM-AHEAD) award OD032581-01S1.

Author information

These authors contributed equally: Younghun Han, Jinyoung Byun.
A full list of members and their affiliations appears in the Supplementary Information.

Authors and Affiliations

Institute for Clinical and Translational Research, Baylor College of Medicine, Houston, TX, USA
Younghun Han, Jinyoung Byun, Catherine Zhu, Vikram R. Shaw & Christopher I. Amos
Section of Epidemiology and Population Sciences, Department of Medicine, Baylor College of Medicine, Houston, TX, USA
Younghun Han, Jinyoung Byun & Christopher I. Amos
Dan L Duncan Comprehensive Cancer Center, Baylor College of Medicine, Houston, TX, USA
Jinyoung Byun & Christopher I. Amos
Department of Biostatistics, University of Texas, M.D. Anderson Cancer Center, Houston, TX, USA
Ryan Sun
Department of Pharmacy, Ochsner Health, New Orleans, LA, USA
Julia Y. Roh
Population Health Sciences Institute, Faculty of Medical Sciences, Newcastle University, Newcastle upon Tyne, United Kingdom
Heather J. Cordell
David J. Sugarbaker Division of Thoracic Surgery, Michael E. DeBakey Department of Surgery, Baylor College of Medicine, Houston, TX, USA
Hyun-Sung Lee & Sung Wook Kang
VA HSR&D, Center for Innovations in Quality, Effectiveness and Safety, Michael E. DeBakey VA Medical Center, Houston, TX, USA
Javad Razjouyan
Big Data Scientist Training Enhancement Program (BD-STEP), VA Office of Research and Development, Washington, DC, USA
Javad Razjouyan
Department of Medicine, Baylor College of Medicine, Houston, TX, USA
Javad Razjouyan
VA Quality Scholars Coordinating Center, IQuESt, Michael E. DeBakey VA Medical Center, Houston, TX, USA
Javad Razjouyan
Mayo Clinic Graduate School of Biomedical Sciences, Mayo Clinic, Rochester, MN, USA
Matthew A. Cooley
Department of Epidemiology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
Manal M. Hassan
Departments of Medicine, Immunology and Medical Sciences, University of Toronto, Toronto, Ontario, Canada
Katherine A. Siminovitch
Mount Sinai Hospital, Lunenfeld-Tanenbaum Research Institute and Toronto General Research Institute, Toronto, Ontario, Canada
Katherine A. Siminovitch
Norwegian PSC Research Center, Oslo University Hospital Rikshospitalet, Oslo, Norway
Trine Folseraas
Institute of Clinical Molecular Biology, Christian-Albrechts-University of Kiel, Kiel, Germany
David Ellinghaus & Andre Franke
Department of Medicine Huddinge, Unit of Gastroenterology and Rheumatology, Karolinska Institutet, Karolinska University Hospital, Stockholm, Sweden
Annika Bergquist
Department of Gastroenterology, Norfolk and Norwich University Hospital, Norwich, United Kingdom
Simon M. Rushbrook
Norwich Medical School, University of East Anglia, Norfolk, United Kingdom
Simon M. Rushbrook
Oslo University Hospital Rikshospitalet and University of Oslo, Oslo, Norway
Tom H. Karlsen & Lewis R. Roberts
Division of Gastroenterology and Hepatology, Department of Internal Medicine, Mayo Clinic, Rochester, MN, USA
Konstantinos N. Lazaridis
Division of Cancer Epidemiology and Genetics, National Cancer Institute, Rockville, MD, USA
Katherine A. McGlynn
1st Department of Medicine, University Medical Center Hamburg-Eppendorf, Hamburg, Germany
Christoph Schramm
4350 La Jolla Village Drive Suite 960, San Diego, CA, USA
David Shapiro
Academic Department of Medical Genetics, University of Cambridge, Cambridge, UK
Elizabeth Goode

Authors

Younghun Han
Jinyoung Byun
Catherine Zhu
Ryan Sun
Julia Y. Roh
Heather J. Cordell
Hyun-Sung Lee
Vikram R. Shaw
Sung Wook Kang
Javad Razjouyan
Matthew A. Cooley
Manal M. Hassan
Katherine A. Siminovitch
Trine Folseraas
David Ellinghaus
Annika Bergquist
Simon M. Rushbrook
Andre Franke
Tom H. Karlsen
Konstantinos N. Lazaridis
Katherine A. McGlynn
Lewis R. Roberts
Christopher I. Amos

Consortia

The International PSC Study Group

Christoph Schramm, David Shapiro & Elizabeth Goode

Contributions

Y.H., J.B., and C.I.A. conceived and designed the study; Y.H. prepared and curated data; Y.H. and J.B. carried out the analyses and wrote the first draft of the manuscript; R.S. performed multi-omic annotation analysis; J.Y.R. assisted with the description of results from the drug repositioning analysis; C.Z., H.J.C., H.L., S.W.K., J.Ra., V.R.S., M.A.C., M.M.H., K.A.M., L.R.R., and C.I.A. contributed to interpretation of the results; T.F., D.E., A.B., S.M.R., A.F., T.H.K., K.N.L., and IPSCSG provided the summary statistics of PSC GWAS; H.J.C. and K.A.S. provided the summary statistics of PBC GWAS; Y.H., J.B., and C.I.A. supervised the study; all authors provided critical feedback and revised the manuscript for important intellectual content.

Corresponding author

Correspondence to Christopher I. Amos.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Communications thanks the anonymous reviewers for their contribution to the peer review of this work. Peer reviewer reports are available.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Peer Review File

Description of Additional Supplementary Files

Supplementary Data 1–18

Reporting Summary

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Han, Y., Byun, J., Zhu, C. et al. Multitrait genome-wide analyses identify new susceptibility loci and candidate drugs to primary sclerosing cholangitis. Nat Commun 14, 1069 (2023). https://doi.org/10.1038/s41467-023-36678-8

Received: 25 May 2022
Accepted: 10 February 2023
Published: 24 February 2023
DOI: https://doi.org/10.1038/s41467-023-36678-8
