Sunday, September 20, 2020

GWAS Meta-analysis for Bipolar Disorder Gives Glowing Analysis, but Is Impossible to Interpret (Again)

A brief review of this GWAS for bipolar disorder:

Genome-wide association study of over 40,000 bipolar disorder cases provides novel biological insights (Mullins et al.)

Like almost all behavioral genetics GWAS, this one is a meta-analysis: new data were added to previously published data, and the new data were never assessed independently (at least in print). That makes it difficult to judge statistically what counts as success and what counts as failure, although the paper is filled with the usual accolades:

This GWAS provides the best-powered BD polygenic scores to date, when applied in both European and diverse ancestry samples. Together, these results advance our understanding of the biological etiology of BD, identify novel therapeutic leads and prioritize genes for functional follow-up studies.

Well, the best and the only, really. But, of course, I have a lot of questions. The first concerns their count of significant loci, for which I needed partial clarification from one of the authors, as I discuss below.

First, here's their result:

Bipolar disorder (BD) is a heritable mental illness with complex etiology. We performed a genome-wide association study (GWAS) of 41,917 BD cases and 371,549 controls, which identified 64 associated genomic loci.

Of these 64 loci, 33 were novel (not previously meeting genome-wide significance). This, of course, raises a few questions. A table in the paper lists all the loci, along with previous studies that reported the same loci. This might suggest some kind of replication of previous studies, but that is not the case: this was a meta-analysis, and the data used to identify those loci were also used here. I wanted to find out whether any loci from the previous study failed to meet significance despite that study's data being included in this one. With some Twitter prodding, I got a partial answer from the lead author:

Of the 30 loci from PGC2, 26 were GW sig in PGC3 and 2 reached suggestive sig. There are also some loci from other GWAS that replicated (as shown in the table), but I didn't do a comprehensive look up of all loci that were GW sig in previous studies. Hope that helps!

Seems maybe a tiny bit boastful, so do we have something to boast about here? Well, it's hard to tell, because we have no independent sample for comparison (again). If half the data in the meta-analysis came from the study that found 30 significant loci, how many loci would we expect to lose significance? This might be mathematically solvable, but it is not addressed in the study. And regarding the point that "2 reached suggestive sig.," we have what I think is a statistical misunderstanding. Leaving aside whether you can even say "suggestive significance" (presumably meaning a p-value that is not quite low enough), the phrase would only make sense if we were looking at a new, independent sample. A locus that meets significance and then, with data added, no longer meets significance is something far different from a locus that "almost" reaches significance in an independent GWAS. The question, of course, is why not run an independent GWAS on the new data for comparison's sake? I have been asking that question about every major behavioral genetics GWAS that has come out in the past couple of years. I think the answer is obvious.
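The "mathematically solvable" question can at least be sketched. What follows is a rough illustration, not anything taken from the paper. It assumes a sample-size-weighted z-score (Stouffer) combination, which is one common meta-analysis scheme, a new sample that is independent of the old one, and a hypothetical old z-score of 6.0, since I do not have the per-locus statistics. The function name and every input are my own inventions.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def prob_still_significant(z_old, mean_z_new, n_old, n_new, alpha=5e-8):
    """Chance that a locus stays genome-wide significant after new data are added.

    Assumes the combined statistic is the sample-size-weighted z-score
        (sqrt(n_old) * z_old + sqrt(n_new) * z_new) / sqrt(n_old + n_new)
    and that z_new ~ Normal(mean_z_new, 1), independent of z_old.
    Only the same-direction tail is counted.
    """
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)  # about 5.45 for alpha = 5e-8
    # Smallest z_new that keeps the combined z at or above the threshold.
    z_new_needed = (z_crit * sqrt(n_old + n_new) - sqrt(n_old) * z_old) / sqrt(n_new)
    return 1 - NormalDist(mean_z_new, 1).cdf(z_new_needed)


# Hypothetical locus: z = 6.0 in the old data, and the new data double the sample.
n_old = n_new = 1.0  # only the ratio of the sample sizes matters here

# Old hit was noise: the new data have an expected z of 0.
p_if_false_positive = prob_still_significant(6.0, 0.0, n_old, n_new)
# Old hit was real, assuming the true expected z equals the observed 6.0.
p_if_real_effect = prob_still_significant(6.0, 6.0, n_old, n_new)

print(f"stays significant if the old hit was noise: {p_if_false_positive:.3f}")
print(f"stays significant if the effect is real:    {p_if_real_effect:.3f}")
```

Summing one minus this probability over the 30 previously significant loci, using their actual z-scores and the actual sample sizes, would give the expected number of loci lost under each assumption. That is the comparison the paper does not report.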

Moreover, one might want to look at the change in p-values for the loci that remained significant. If you are really finding actual genetic associations, adding data should make your p-values smaller (better). I have yet to see what happened in this case, and will add an addendum when I do, but I suspect that most p-values will be a little watered down from their previous values, suggesting that the original association was possibly a false positive that will fade as more data are added.
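To put rough numbers on that expectation, here is a minimal sketch under stated assumptions: a sample-size-weighted z-score combination (not necessarily the method used in the paper), old and new samples of equal size, and a hypothetical locus sitting exactly at the genome-wide threshold in the old data. The helper names are mine.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def two_sided_p(z):
    """Two-sided p-value for a z-score, computed from the lower tail for accuracy."""
    return 2 * STD_NORMAL.cdf(-abs(z))


def combined_z(z_old, z_new, n_old, n_new):
    """Sample-size-weighted (Stouffer) combination of two independent z-scores."""
    return (sqrt(n_old) * z_old + sqrt(n_new) * z_new) / sqrt(n_old + n_new)


# Hypothetical locus sitting exactly at p = 5e-8 in the old data.
z_old = STD_NORMAL.inv_cdf(1 - 5e-8 / 2)
n_old = n_new = 1.0  # equal-sized old and new samples (an assumption)

# False positive: the new data contribute z = 0 on average.
z_if_noise = combined_z(z_old, 0.0, n_old, n_new)
# Real effect: the new data are expected to show the same z as the old data.
z_if_real = combined_z(z_old, z_old, n_old, n_new)

print(f"old p-value:                     {two_sided_p(z_old):.1e}")
print(f"p at the expected z, if noise:   {two_sided_p(z_if_noise):.1e}")
print(f"p at the expected z, if real:    {two_sided_p(z_if_real):.1e}")
```

With equal-sized samples the expected z shrinks by a factor of sqrt(2) if the original hit was noise and grows by a factor of sqrt(2) if the effect is real, so the direction in which the published p-values moved is informative.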

As Dr. Mullins notes, then, there are other loci from earlier studies that may or may not have reached significance. However, if 33 of the 64 loci were novel and 26 came from PGC2, then the remaining 5 (64 - 33 - 26) were loci reported in other previous studies whose data were used in this meta-analysis. It is trickier to determine how many loci from these previous studies did not reach significance, and I can only guess, because we are often dealing with overlapping datasets. By my rough count there were at least 10 to 15, which means at least half (I assume more) did not reach significance in this study despite presumably being included in the meta-analysis, which one would think should bolster their significance.

Another problem with this kind of overlap is that you can get different loci depending on which datasets you put together. For example, what Dr. Mullins refers to above as PGC2 is actually a combined analysis that produced the 30 loci. Initially, they ran a discovery GWAS and then combined it with additional data. This is an interesting point from the abstract of that paper:

Eight of the 19 variants that were genome-wide significant (P < 5 × 10^-8) in the discovery GWAS were not genome-wide significant in the combined analysis, consistent with small effect sizes and limited power but also with genetic heterogeneity. In the combined analysis, 30 loci were genome-wide significant, including 20 newly identified loci.

Thus, you can generate variants that then don't replicate when you expand your dataset. It really leaves the situation muddled. I imagine you could produce hundreds of "significant" loci just by repeatedly recombining these datasets and running a GWAS on each combination. Therein lies the problem with overlapping datasets. It's worth noting that this pattern is more consistent with false positives than with the presumptive "limited power" and "genetic heterogeneity."

It's also worthwhile to point out the claims of practical applications for the results. In particular:

BD risk alleles were enriched in genes in synaptic and calcium signaling pathways and brain-expressed genes, particularly those with high specificity of expression in neurons of the prefrontal cortex and hippocampus. Significant signal enrichment was found in genes encoding targets of antipsychotics, calcium channel blockers and antiepileptics. Integrating eQTL data implicated 15 genes robustly linked to BD via gene expression, including druggable genes such as HTR6, MCHR1, DCLK3 and FURIN.

If you already know, from previous studies of this nature, that many of the loci you find are not going to hold up, then how wise is it to put resources into drug research on something that is very likely a false positive? Moreover, the enrichment findings, which are presented as implicit "evidence" that the loci are valid, seem to differ from study to study of the same trait, focusing on different parts of the brain and different functions. Is anyone notified that they have been wasting their time?

Let me conclude. This study is not useful without an independent assessment of the new data for comparison purposes. Loci should not "come and go" with each new study, particularly when some of the same data are reused. If we had real loci (SNPs) that were related to bipolar disorder, they would become more and more significant with the addition of more data. What is probably happening here is that adding data produces more false positives, which in turn fail to hold up as still more data are added, and so on, even though they most surely benefit from some population stratification. This failure of behavioral genetics GWAS to reproduce their results has created a kind of maze of assumptions about complex traits, like a Tower of Babel.


 

 

 

