This critique is for the following study:
A combined analysis of genetically correlated traits identifies 187 loci and a role for neurogenesis and myelination in intelligence (Hill et al., 2017)
It is another GWAS/meta-analysis study. Anyone who is following my critiques on this blog will probably assume, correctly, that I am going to harp on the lack of a random control. I'll get to that, but wanted to make a few comments, first.
First, this meta-analysis draws heavily from two other meta-analyses that I have critiqued previously: one for intelligence (Sniekers) and one for educational attainment (Okbay). In fact, the Sniekers study also used the Okbay study for what it unironically referred to as a "proxy-replication." This is all getting a bit incestuous, for lack of a better word. I understand the desire to use large datasets (which I will also address shortly), but reusing so much of the same data creates what I would call a false correlation between studies and rules out any reasonable claim of replication. The authors do remove the old data from the BioBank portion of the Sniekers study and use a new dataset for that, so there is at least some new data in this study.
I'll also say up front that I object to the use of studies of different traits, in this case "educational attainment", to bolster the dataset size. I think this creates an unnecessary convolution, regardless of whether these traits are considered closely correlated...
Before delving into the meat of this study in more detail, I also wanted to address a couple of assertions made, starting with this one:
"Intelligence, also known as general cognitive function or simply g, describes the shared variance that exists between diverse measures of cognitive ability."I don't mean to get semantically critical, but the word intelligence can mean a whole lot of things, depending on the context, one of which might be general cognitive function (which also could be interpreted in many ways, although I concede that scientists have established certain specific areas that they are referring to in the literature). So, I think it would be better to say this: For the purpose of this study, intelligence is defined as general cognitive function and is being quantified by g. If it seems like I'm splitting hairs, it's more important to me than it may be to others.
Next, I'd like to comment on the stated push in a lot of these GWAS studies, including this one, to use larger and larger datasets:
"Relatively few genetic variants have reliably been associated with intelligence differences. The sparsity of genome-wide significant SNPs discovered so far, combined with the substantial heritability estimate, suggests a phenotype with a highly polygenic architecture, where the total effect of all associated variants is substantial, but in which each individual variant exerts only a small influence. This is compelling evidence that the number of uncovered genome-wide significant loci associated with intelligence can be increased by raising the sample size—and thus the statistical power—of GWASs, as has been the case for other phenotypes such as height and schizophrenia."There is a lot I want to take on from this paragraph, starting with the first sentence, because I am seeing scientists, authors, and leaders in this field making claims as if there is good scientific evidence for the genetic basis of intelligence, although we know that isn't due to finding specific genes related to intelligence (I would suggest that "Relatively few" could be replaced with Zero). This leads to the second sentence, that generally follows in these arguments: There is no actual evidence of a highly polygenic architecture, but since it's heritable and we can find a gene, that must be the case. As I have pointed out previously in another post, a polygenic mechanism that conveys high heritability strikes me as mathematically implausible and I would be happy if anyone could provide even a theoretical model of how that could work.
Okay, I got a little sidetracked. The next part of the paragraph is the crux of the problem with these GWAS studies, in my view. There seems to be an assumption that larger and larger datasets should find more and more loci or SNPs related to the trait. I think there should be a law of diminishing returns: if a finite number of genes is somehow related to intelligence, then at a certain point the total should level off. That is not what we are seeing, though. We are seeing ever-increasing numbers of significant loci or SNPs as the datasets expand. To me, this suggests a different explanation: false positives. Whether a locus reaches significance for a trait comes down to the p value being low for that particular locus, and one can assume that some of these p values will be low (appearing highly significant) just by chance. The number of significant p values that are due to chance will clearly grow with the total number of loci you are testing and the size of your case/control database. So if you find 50 loci with low p values that suggest significance, how many of those are random false positives? Is it one or two, which wouldn't be such a big deal, or is it closer to all 50, which is what the null hypothesis would predict (and I am taking the null for this study)?
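To make that multiple-testing arithmetic concrete, here is a minimal sketch, with a hypothetical test count and threshold that are not taken from the study, of how many chance hits a pure null would nominally produce:

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical numbers for illustration only -- not taken from the study.
n_tests = 1_000_000  # independent loci tested
alpha = 5e-8         # conventional genome-wide significance threshold

# Under a pure null, p-values are uniform on [0, 1], so the nominal
# expected number of chance hits is simply n_tests * alpha.
print(f"Nominal expected chance hits: {n_tests * alpha:.3f}")

# Quick check by simulation: draw null p-values and count the hits.
p_values = rng.uniform(size=n_tests)
print(f"Simulated chance hits: {(p_values < alpha).sum()}")
```

The nominal expectation is tiny, but it leans on independent tests and perfectly calibrated p values; correlated SNPs and population stratification can violate both, which is why an empirical measurement of the false-positive count is worth having.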
As I have suggested, there is a method by which someone can get a good idea of the percentage of false positives: randomize the initial datasets, keeping the same case/control ratios, and repeat your GWAS. Any positive correlations you see in that instance are going to be false positives (I talk more specifically about my suggested approach here). If you repeated this a few times, you would have a fairly precise indication of how many false positives to expect. I first suggested this approach in a letter to the British Journal of Psychiatry in 2002 and I hope someone will finally take me up on it.
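A minimal sketch of that permutation control, using toy data and a simple per-SNP correlation test as a stand-in for a full GWAS pipeline (every name and number below is hypothetical):

```python
import numpy as np
from scipy import stats

def count_significant(genotypes, phenotype, alpha):
    """Count SNPs whose association with the phenotype passes alpha,
    using a per-SNP Pearson correlation test as a stand-in for a
    full GWAS regression."""
    hits = 0
    for j in range(genotypes.shape[1]):
        _, p = stats.pearsonr(genotypes[:, j], phenotype)
        if p < alpha:
            hits += 1
    return hits

rng = np.random.default_rng(0)

# Toy data: 2,000 individuals, 5,000 SNPs; the phenotype is unrelated
# to the genotypes by construction.
G = rng.binomial(2, 0.3, size=(2000, 5000)).astype(float)
y = rng.normal(size=2000)

# The proposed control: shuffle the phenotype labels (a permutation
# automatically preserves any case/control ratio), rerun the scan, and
# record the hit count. Every hit here is a false positive by construction.
null_hits = [
    count_significant(G, rng.permutation(y), alpha=1e-4)  # looser alpha for the toy scale
    for _ in range(10)  # a real run would use far more permutations
]
print("False-positive counts across permutations:", null_hits)
```

Repeating the shuffle-and-scan loop many times yields an empirical distribution of false-positive counts to hold up against the headline number of significant loci.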
Moving on to the study procedure: I've already mentioned that I object to including a study of another trait (educational attainment) in the meta-analysis, but I will include the authors' rationale here:
"In the present study, we combined these two approaches by using MTAG [26], a newly-developed technique that allows the meta-analysis of summary statistics from genetically-related traits. This enabled us effectively to increase the sample size (to add power) to GWASs of intelligence by adding in the genetic variance that is shared with proxy phenotypes. "I will discuss MTAG to the best of my ability, shortly, but, for the most part, my overall point is the same. I will cut to the chase: They got their dataset up in the 250,000 range (the previous study was 70,000) including adding in the educational attainment dataset. Oddly, the Okbay study had almost 300,000 by itself, so I will assume that the MTAG procedure in question lowers the effective n when combined with the current dataset (or that I am misreading this, or that they made an error). They came up with 187 significant loci. The previous study had 16 (or 18, depending on how you slice it). So they quadrupled the n and got 10 times the significant loci.
Is this what one would expect for a study looking at intelligence genes when you increase the n in part by throwing in educational attainment data? Are these "significant" loci really significant? Are any of these 187 loci false positives? If educational attainment and intelligence are so closely linked, why do we see only 74 loci for educational attainment, when that study had a larger dataset? I don't think many of these questions are answerable beyond speculation, but we would have a much better idea if the results had been tested against a randomized case/control dataset, as I suggested above. Perhaps this is still possible and, if so, I would encourage the authors to do so.
What seemed, at first glance, like an interesting finding was the large number of loci found for intelligence in comparison to the number found in the Okbay educational attainment study. Since the stated datasets are similar in size, I would have expected a more similar number of significant loci, whether these were random false positives or closely related genetic mechanisms. On the surface, that suggests something more than randomness, which would itself be odd when you are making the case that educational attainment and intelligence are so closely related. Then I looked more into this "MTAG" process. Let me quote a paper related to it:
"We introduce Multi-Trait Analysis of GWAS (MTAG), a method for the joint analysis of summary statistics from GWASs of different traits, possibly from overlapping samples. We demonstrate MTAG using data on depressive symptoms (Neff = 354,862), neuroticism (N = 168,105), and subjective well-being (N = 388,538). Compared to 32, 9, and 13 genome-wide significant loci in the single-trait GWASs (most of which are novel), MTAG increases the number of loci to 74, 66, and 60, respectively."You have got to be kidding. This strikes me as some serious mathematical manipulation. I notice they are still working out the kinks, though, since the quote above was from a March, 2017 version and then in July of 2017, they provided a slight correction...
"...Compared to 32, 9, and 13 genome-wide significant loci in the single-trait GWASs (most of which are themselves novel), MTAG increases the number of loci to 64, 37, and 49, respectively."
That makes it only slightly less ridiculous. Going back to our study at hand, I have several objections to the use of MTAG, this new and unproven mathematical manipulation:
1. The validity of MTAG should be assessed independently of performing new studies. In other words, one should apply the method to old studies and assess whether the new loci it discovers are valid. It appears that this has never been done. The study introducing MTAG (which then required a correction) simply assumed that these new loci are valid. In the current study, there was a mixture of old and new datasets, making it impossible to compare this method to standard methods. There is simply no way to assess the validity of the results garnered from using MTAG in this study.
2. The authors never do a simple GWAS of the new BioBank data for reference.
3. It is difficult enough to assess the validity of studies like this on their own. Introducing new, largely unproven, mathematical concepts leaves more room for error.
4. We have no random control. I'm not sure if it would even be possible, but perform this same MTAG analysis on a randomized dataset using the same case/control ratios. I have already pointed out that two of the studies used in the meta-analysis (Sniekers and Okbay) also suffer from a lack of random controls. So you are taking data from studies that have not adequately proven that their results were better than random, performing an unproven mathematical construct (MTAG) on the combined data, and claiming that the results (187 significant loci) are not random false positives, all without comparing them to a randomized dataset.
5. Appealing to common sense: if you are taking old studies and manipulating them to find more positive results, perhaps you are engaging in some wishful thinking.
I am quite comfortable taking the null hypothesis here, as follows: the original results from the primary studies used in this meta-analysis (Okbay and Sniekers) are composed of random false-positive loci, and the combined use of these and other datasets, along with the MTAG manipulation, has created a much larger set of false-positive loci.
Therefore, I challenge the authors of all three of these studies to compare their results to randomized controls (preferably starting with Okbay and Sniekers). I believe the authors of this study could perform all three of these randomized controls and end this madness.
I'd stop here, but I'll make a couple of other points, so that it doesn't seem like I'm ducking potentially positive findings. The authors suggest a bit of replication:
"Comparing the genomic loci identified using FUMA in the current study to the 16 loci identified from Sniekers et al. using FUMA, only one locus on chromosome 15 was found in Sniekers that was not present in the current study."
Well, that would sure be an interesting finding for two independent datasets, but they are using the Sniekers study data in this study, so you should expect many of the loci to remain significant. Moreover, I suspect that the MTAG manipulation increases this carryover, although I wouldn't know how to calculate that likelihood (can anyone?).
The next part of the study involves another mathematical manipulation, referred to as MAGMA, which effectively takes the 187 loci and produces 538 specific genes. I am not going to go after MAGMA specifically, because this is getting exhausting, but I will address the stated purpose for this step, which is to show that neurologically based genes are more prominent in this collection of genes. Once again, this is difficult to demonstrate without a random control, which might also point at neurologically based genes. It would require two blinded groups assessing the respective results, and I suspect that the group examining the random control would find just as many neurological correlates among the random false positives they were handed.
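That control could be sketched cheaply: count how many of the implicated genes carry "neuronal" annotations, then compare against random gene sets of the same size. Everything below (the gene counts, the 15% annotation rate, and the randomly drawn "hit" set) is hypothetical:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical setup -- all numbers are for illustration only.
n_genes = 19_000                               # roughly the protein-coding genome
neuronal = rng.random(n_genes) < 0.15          # pretend 15% of genes carry a "neuronal" tag
hit_set = rng.choice(n_genes, size=538, replace=False)  # stand-in for the study's 538 genes
observed = neuronal[hit_set].sum()

# The control: how much "neuronal" overlap do random gene sets of the
# same size produce? An empirical p-value falls out directly.
null_overlaps = np.array([
    neuronal[rng.choice(n_genes, size=538, replace=False)].sum()
    for _ in range(5_000)
])
p_empirical = (null_overlaps >= observed).mean()
print(f"Observed: {observed}, null mean: {null_overlaps.mean():.1f}, "
      f"empirical p: {p_empirical:.3f}")
```

Plug in the study's actual gene list and a real annotation database, and this would show directly whether 538 arbitrary genes tend to look just as "neurological" under the same scrutiny.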
The next part of the study discusses bipolar disorder and schizophrenia in relation to intelligence:
"For the mental health variables, our meta-analytic intelligence dataset showed a pattern of genetic correlations more similar to Sniekers GWAS on intelligence than the Okbay GWAS on education. For bipolar disorder, no genetic correlation was found using our meta-analytic dataset or with the Sniekers dataset; however, a genetic correlation was found with education (r g = 0.28, SE = 0.04). For bipolar disorder, previous results have indicated a negative genetic correlation using established measures of intelligence, although after correcting for multiple tests this estimate was not statistically significant . Similar results were also found when examining schizophrenia, where a positive genetic correlation was found with education (r g = 0.10, SE = 0.02), and a negative genetic correlation was found with both intelligence datasets (Sniekers, r g = −0.20, SE = 0.03, current study r g = −0.14, SE = 0.02)."
Reading this from my perspective, it looks like randomness. Educational attainment and intelligence are supposed to be closely genetically linked, but here the two diverge sharply. Moreover, the result is counterintuitive: people with serious mental disorders carry genetic predispositions toward high educational attainment but away from high intelligence?
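And the divergence is not subtle. A rough back-of-the-envelope z for the difference between the two schizophrenia correlations quoted above, under the (unrealistic, given the overlapping samples) assumption that the two estimates are independent:

```python
import math

# Estimates quoted above for schizophrenia.
rg_edu, se_edu = 0.10, 0.02   # genetic correlation with education
rg_iq, se_iq = -0.14, 0.02    # genetic correlation with the meta-analytic intelligence measure

# Crude two-sample z for the difference; independence is assumed here,
# which the overlapping samples would in reality violate.
z = (rg_edu - rg_iq) / math.sqrt(se_edu**2 + se_iq**2)
print(f"z for the difference: {z:.1f}")  # roughly 8.5
```

However the standard errors are adjusted for sample overlap, a gap that large between two supposedly near-identical traits deserves an explanation.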
Here's another curious passage:
"Differences between the previous GWAS on intelligence and our meta-analysis were also evident for tiredness, anorexia nervosa, and type 2 diabetes. For these phenotypes, the point estimate of the genetic correlation is indistinguishable from zero for the intelligence GWAS but significant and in the same direction for both education and intelligence in our meta-analytic sample."
When each new GWAS seems to yield different mechanisms and correlations, I think it is again worth considering whether we are just looking at random correlations and drawing misguided conclusions that will be negated by the next GWAS that comes along.
In Conclusion:
1. Two of the original studies used in this meta-analysis did not demonstrate that their results are anything more than random false positives.
2. The use of MTAG is dubious. It is unproven and appears to be an attempt to artificially inflate the n and get more positive loci.
3. The large number of positive loci garnered by this meta-analysis, aided by MTAG, is not demonstrated to be anything more than random false positives, and no control is provided for comparison.
4. The attempt to correlate these unproven genes with various neurological functions is not demonstrated to be any better than what one could achieve with a random set of genes and a mandate to correlate them with neurological functions.
5. The comparisons with diagnoses and other traits did not replicate previous findings, in some cases appear counterintuitive, and again suggest that we may be looking at random results.