Thursday, April 26, 2018

Another Depression GWA/Meta-Analysis claims 44 risk variants for Major Depressive Disorder


Genome-wide association analyses identify 44 risk variants and refine the genetic architecture of major depressive disorder

After only a couple of weeks and, now, my 6th critique of a genetic study, I once again have the same issue before I even get started on the study:  There is no randomized control.
I have previously discussed why I believe GWAS are very prone to false positives, which you can read here, and I have also proposed my own test for determining the false positive rate for any particular GWAS, which you can find here, so it might be worthwhile to check those out before reading the rest of this.  I have contacted one of the authors and will try to contact others to address this issue but, as yet, no one has responded to me regarding this or any of my other critiques.  I am not dissuaded...

This study uses 7 different cohorts in what is described as a meta-analysis related to Major Depressive Disorder (MDD).  For the purposes of this critique, I am not going to quibble about differences between the study protocols and diagnostic methods of each of these cohorts, as they are not central to my specific criticisms.  My primary criticism, once again, is that their study protocol offers no way to reasonably assess whether any of the 44 risk variants they describe are anything more than random false positives.  Let's look at their numbers a bit:
1. "We conducted a genome-wide association (GWA) meta-analysis in 130,664 MDD cases and 330,470 controls, and identified 44 independent loci that met criteria for statistical significance."
2. "We completed a GWA meta-analysis of 9.6 million imputed SNPs in seven cohorts..."

So, those are some big numbers.  By comparison, maybe 44 loci are not all that many.  While it's true that these met statistical significance, might we expect some outlier, low p-values just by chance?
Are they suggesting that all 44 of these loci are definitively correlated with MDD simply because they had a low p-value, and that a random sample (using the method I link to above) would provide ZERO significant loci?  Does that ring true for anyone?
I am going to suggest that, if they randomly selected 130,664 people from the full study sample and designated them as CASES, designated the remaining 330,470 as CONTROLS, and then performed the exact same GWAS protocol, we might find a similar number of "significant" correlations by chance alone (one could even repeat this a few times to get a more exact idea of the expected number of random hits).
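The randomized control described above can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions, not the study's pipeline: the function names (`allelic_p_value`, `count_hits`, `permuted_hit_counts`), the simulated genotypes, and the simple allelic z-test are all hypothetical stand-ins. A real GWAS would typically use its own association test and count independent loci rather than raw SNPs.

```python
import math
import random


def allelic_p_value(genotypes, is_case):
    """Two-sided z-test comparing allele frequency in cases vs. controls.

    genotypes: list of 0/1/2 allele counts, one per person.
    is_case:   list of booleans, True for cases, same order as genotypes.
    """
    case_alleles = sum(g for g, c in zip(genotypes, is_case) if c)
    ctrl_alleles = sum(g for g, c in zip(genotypes, is_case) if not c)
    n_case = 2 * sum(is_case)                    # two alleles per person
    n_ctrl = 2 * (len(is_case) - sum(is_case))
    pooled = (case_alleles + ctrl_alleles) / (n_case + n_ctrl)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_case + 1 / n_ctrl))
    if se == 0:                                  # SNP does not vary at all
        return 1.0
    z = (case_alleles / n_case - ctrl_alleles / n_ctrl) / se
    return math.erfc(abs(z) / math.sqrt(2))      # two-sided normal p-value


def count_hits(snps, is_case, threshold):
    """Number of SNPs whose p-value falls below the significance threshold."""
    return sum(allelic_p_value(g, is_case) < threshold for g in snps)


def permuted_hit_counts(snps, n_cases, threshold, n_permutations, seed=0):
    """Randomly relabel people as case/control and count 'significant' SNPs.

    The case/control totals stay fixed; only who carries which label changes.
    """
    rng = random.Random(seed)
    n_people = len(snps[0])
    labels = [True] * n_cases + [False] * (n_people - n_cases)
    counts = []
    for _ in range(n_permutations):
        rng.shuffle(labels)
        counts.append(count_hits(snps, labels, threshold))
    return counts


# Toy data (ASSUMPTION: simulated, far smaller than any real GWAS).
rng = random.Random(1)
n_people, n_snps, n_cases = 200, 50, 60
snps = [[rng.choice((0, 1, 2)) for _ in range(n_people)] for _ in range(n_snps)]
null_counts = permuted_hit_counts(
    snps, n_cases, threshold=0.05 / n_snps, n_permutations=5
)
print(null_counts)  # one "significant" count per randomized run
```

On real data one would swap `allelic_p_value` for the study's actual association test, apply the same locus-counting rules the authors used, and then compare the resulting counts with the 44 reported.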
If the randomized method showed only, say, 10 or fewer, then we would at least be confident that, whatever else, we had something more than random with our original 44 variants.  However, as that number increases past, say, 25, we should start to assume that our original GWAS largely (if not entirely) captured random, false positive variants.
Like many of these studies, the authors do attempt to confirm that their GWAS is actually flagging loci related to MDD, using a few methods that I will address shortly, but this is very approximate in nature.  Moreover, if our random sample produced, say, 20, then at best we are dealing with 20 to 25 valid loci, so any such follow-up analysis is limited because it mixes a lot of false positives in with valid data.
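The arithmetic behind that comparison can be written out as a small helper. This is a rough sketch, assuming we already have hit counts from several randomized runs; the function name `summarize_against_null` and the example counts are hypothetical and chosen only to echo the "say, 20" scenario, not taken from any real analysis.

```python
def summarize_against_null(observed_hits, null_counts):
    """Compare an observed hit count with hit counts from randomized runs."""
    mean_null = sum(null_counts) / len(null_counts)
    # Hits left over after subtracting the average number seen by chance.
    estimated_non_random = max(0.0, observed_hits - mean_null)
    # Share of the observed hits that chance alone could account for.
    chance_fraction = min(1.0, mean_null / observed_hits) if observed_hits else 0.0
    # Empirical p-value: how often a randomized run matches or beats the
    # observed count (the +1 terms avoid reporting exactly zero).
    at_least_as_many = sum(c >= observed_hits for c in null_counts)
    empirical_p = (at_least_as_many + 1) / (len(null_counts) + 1)
    return {
        "mean_null": mean_null,
        "estimated_non_random": estimated_non_random,
        "chance_fraction": chance_fraction,
        "empirical_p": empirical_p,
    }


# HYPOTHETICAL counts from five randomized runs; not results from real data.
hypothetical_null_counts = [18, 22, 20, 19, 21]
summary = summarize_against_null(44, hypothetical_null_counts)
print(summary)
```

With more randomized runs the average becomes more stable, which is why repeating the control a few times is worthwhile.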
So the question I have at this point is whether I should even bother discussing the rest of the study, or wait for them to do a random control and take it from there. [I know - it ain't going to happen].   I'm tempted to just sign off here, but I will try to address some things related to their analysis, under the assumption that we are looking at a (largely) random dataset, since that is what I believe we have here.

So let's continue with their stated successful results:
"Our meta-analysis of seven MDD cohorts identified 44 independent loci that were statistically significant (P<5x10^-8), statistically independent of any other signal [25], supported by multiple SNPs, and showed consistent effects across cohorts. This number is consistent with our prediction that MDD GWA discovery would require about five times more cases than for schizophrenia (lifetime risk ~1% and h²~0.8) to achieve approximately similar power [26]. Of these 44 loci, 30 are novel and 14 were significant in a prior study of MDD or depressive symptoms."

Another way to say this is that 30 of the 44 loci found had not been previously noted in Major Depressive Disorder (MDD) studies.  There seems to be a tendency to point to the increased size of these studies as the basis for finding more loci.  There are, as I've pointed out previously in other study critiques, limits to this line of thinking.  One should expect that there is a limited total number of SNPs (single nucleotide polymorphisms) related to a trait such as MDD.  If they keep finding new loci with each bigger study, the assumption is that they just haven't found them all yet and need bigger and bigger databases to draw from.  There is, however, another possibility, which is that the increases in data size just keep increasing the number of false positives.  Perhaps we are not uncovering new SNPs, but simply cranking out new false positives, which we will continue to do as long as the databases keep growing, and that seems likely for the foreseeable future.
The second part of the above paragraph from the study was, at first glance, quite impressive from my perspective.  Although I would expect more loci to match from study to study, I hadn't seen 14 out of 44 being "replicated" before, and that would suggest something more than random in my view.  And, it appears, it is far from random, because the bulk of these "replicated" loci came from a 23andMe database that was also used in the meta-analysis.  The copy I have of the paper does not appear to provide numbers for each of the studies, but I am going to guess that the 23andMe database they used in the meta-analysis is the largest and, therefore, is going to influence the final results of this study to a greater extent.  I welcome some response from the authors.  The other couple of "replicated" loci also come from studies that are included in this meta-analysis.  To be fair, I am not suggesting that the authors were trying to pull a fast one.  I'm just pointing out that, statistically, it would not be surprising for some of the loci found in each of these individual studies to remain significant even after being watered down by the other studies.  That is, in fact, consistent with my assertion that most or all of these loci are random false positives.
Next, the authors make a bit of a case for some of the loci they found being related to neurological functions, implying that this is more evidence that they are related to depression, which they assume has a strong neurological basis.  Let's look at what they say:
 "In nine of the 44 loci, the lead SNP is within a gene, there is no other gene within 200 kb, and the gene is known to play a role in neuronal development, synaptic function, transmembrane adhesion complexes, and/or regulation of gene expression in brain."
If I understand all the points, they are saying that some of the correlated loci (9 out of 44) contain a gene that they think is the most likely culprit in the association with depression, and that these genes are, at least in part, involved in the neurological functions named above.  Is that an impressive finding?  I honestly can't say for sure.  It doesn't really sound that impressive to me.  They aren't even sure about the 9, the genes they note aren't exclusively related to neurological function, and that still leaves 35 loci that they can't link to any neurological function in this way.
I will again suggest that such a finding could be made more (or less) impressive if they had done a random control as I suggested above and checked whether the false positives found in that control were related to neurological functions.
The study then spends a few paragraphs discussing the various presumed SNPs and their functions.  This is actually interesting in its own right, but it feels like, whatever their stated function, the authors try too hard to connect it to MDD, and some of the connections are a stretch.  I'll give an example:
"LRFN5 also limits Tcell response and neuroinflammation (CNS “immune privilege”) by binding to herpes virus entry mediator; a LRFN5-specific monoclonal antibody increases activation of microglia and macrophages by lipopolysaccharide and exacerbates mouse experimental acquired encephalitis; "
I find this highly speculative and impossible to rebut or agree with.
The next part of the study is a comparison to loci for schizophrenia.  I'm not sure why they decided to do such a comparison specifically for this diagnosis, or whether they compared other diagnoses that didn't make the final paper.  I'll quote here:

"Finally, comparison of the MDD loci with 108 loci for schizophrenia identified six shared loci. Many SNPs in the extended MHC region are strongly associated with schizophrenia, but implication of the MHC region is novel for MDD. Another example is TCF4 (transcription factor 4) which is strongly associated with schizophrenia but not previously with MDD."
I could say a lot about this, but for now will just point out that they seem to match more loci between an MDD study and a schizophrenia study than they do for any two MDD studies.  That obviously doesn't make sense and I'll just leave it at that for now.

Next is a section entitled:
"Implications for the biology of MDD using functional genomic data"
Most of the GWAS I've looked at thus far have a section of this nature.  They take the presumed function of the correlated SNPs and develop some theories as to the mechanism of, in this case, MDD.  If you follow along from study to study, each will have almost entirely different mechanisms, since they found entirely different SNPs than the previous MDD studies.  Again, I will assume that this is due to these being false positives which, as human beings, we find a way to relate to what we are trying to demonstrate.  That said, kudos to the authors for the level and breadth of information provided.

Also, let me address this quickly:
"Genetic studies can now offer complementary strategies to assess whether a phenotypic association between MDD and a risk factor or a comorbidity is mirrored by a non-zero r_g (common variant genetic correlation) and, for some of these, evaluate the potential causality of the association given that exposure to genetic risk factors begins at conception."

For this to have any validity (at least in my mind), we would need to know whether the genes we are looking at are actually related to MDD in any real way.  At that point, we can start trying to assess causality.  The idea that multiple genes contribute to MDD and other traits is speculative.  It is based on the assumption that the disorders are genetic because of high heritability, but with a failure to find any specific genes that confer a significant effect.  No specific mechanism of this sort has been mapped out for any mental disorder and I question the entire premise, especially when it would be difficult to explain how such a mechanism could confer high heritability in the first place.

There is more speculation in this study and if the authors want to comment on it in the comment section in response to what I have written, I will be happy to hear what they have to say.  I do take issue with one particular conclusion, however:

"The nature of severe depression has been discussed for millennia.  This GWA meta-analysis is among the largest ever conducted for a psychiatric disorder, and provides a body of results that help refine and define the fundamental basis of MDD. First, MDD is a brain disorder. Although this is not unexpected, some past models of MDD have had little or no place for heredity or biology. Our results indicate that genetics and biology are definite pieces in the puzzle of MDD. "
Few would claim that the brain has no role in depression or that biology plays no role in major depression.  The question is causality.  It is a chicken-and-egg question.  For example, if I have a female patient who has been depressed for years and I come to find out that the patient's husband had molested their daughter for years and she is estranged from the family, then I believe I might have a specific cause for her depression.  No genes needed.  In this depressed state, her brain might be throwing off her hypothalamic-pituitary response and creating physical problems, etc.  However, for the reasons noted above, I do not think that the authors have effectively demonstrated that the genes they flag in this study have a role in depression, and I encourage them to prove me wrong by comparing their results to a randomly generated result.
