Friday, November 13, 2020

Phrenology is Alive and Well in 2020

 When the Human Genome Project began in 1990, there were visions of identifying genes for psychiatric disorders, personality traits, intelligence, character, etc. The dawning of a new age. So let's check in and see how that's going 30 years later in this exciting new study:

Genome-wide meta-analysis of brain volume identifies genomic loci and genes shared with intelligence

This, friends, sounds a bit like phrenology.  


If you are not familiar with phrenology, it was the study of the various protrusions of the skull to determine a person's intelligence, character, personality and, well, whatever you wanted to read into these lumps and bumps, measured with a pair of calipers. It was discredited a long time ago and had some very racist undertones driving it as well. Anyone interested in this subject might start with Stephen Jay Gould's "The Mismeasure of Man."

The idea behind measuring bumps on the skull was that they told you something about the size or shape of the brain underneath. The premise, in and of itself, that bigger brain regions make a person "more" of something, whether that's intelligence, violence or madness, is absurd enough, but the idea that you could measure it via skull protrusions adds another layer of absurdity.

Well, this study tries to get around that, at least partially, by using a better set of calipers. In this case, MRI (magnetic resonance imaging).

They also keep things rather vague by simply stopping at total "brain volume," which is actually far less specific than our old-school phrenology friends were. So the claim is simply that having a bigger brain makes you more intelligent, which is an elementary-school level of understanding.

They measured the brain volume in a very crude fashion as follows:

...BV estimated from structural MRI by summing total gray and white matter volume, and ventricular cerebrospinal fluid volume. 

Thus, no real distinction is even made between three very different things in the brain. Then they took this measurement and did a meta-analysis, in other words, combined it with other studies for which they already knew the results. I'll say more about that in a bit, but first I should point out that the brain estimate obtained by MRI was not used for all the studies in the meta-analysis. Some of the other studies used different methodology:

We conduct a GWAS of BV in the UK Biobank (UKB) sample, and meta-analyze the results with two additional cohorts for which data on intracranial volume (ICV) and head circumference (HC), a proxy measure for BV

Well, if you thought the brain volume measure they used for the UKB sample was crude, using head circumference makes phrenology seem state of the art by comparison. It also leads me to ask an uncomfortable question, and the authors, were they to come across my musings, are welcome to respond: if you have a UK Biobank study of N = 17,000 that uses one measure, why add another 30,000 participants whose measurements used entirely different methods that are, at best, roughly correlated with what should be a superior method (MRI)? I suspect the answer is that a higher N gives more results.

Returning to the meta-analysis, I will make a couple of points. First, to their primary result linking "brain volume" to various genetic loci:

We identify 18 genomic loci (14 not previously associated), implicating 343 genes (270 not previously associated) and 18 biological pathways for BV.

Such a result should lead a skeptic to ask a few questions. The first is why 14 of these loci were not previously associated. My understanding is that they started with 30,000 from the previous cohorts, added the 17,000 from the UK Biobank, and suddenly found all kinds of new genomic loci associated with brain volume that hadn't been seen before. Why would that be? How would one know that these weren't simply false positives? Well, one wouldn't, unless one was able to replicate this, and I suspect (very strongly) that that will not happen in any real sense.

You might think, reading the above result, that at least there were 4 previously identified loci found in this study, and hang your hat on that as some kind of replication. Unfortunately, that is not the case here. These 4 were from the previous cohorts used in this meta-analysis! Thus, they were found significant in previous studies and retained significance when the UK Biobank data was added.

This should suggest an obvious follow-up question: how many loci previously found to reach significance in cohorts used in this meta-analysis did NOT meet significance here? Interestingly, the authors fail to mention this in their study. So I looked at the previous studies and counted about 20 loci that met significance. The number is a bit of an estimate, since a few different measures were used (pediatric vs. adult, and some specific brain regions rather than total brain volume). That said, I am quite sure that if any of these loci had replicated in this study, they would have been noted. Therefore, of the ~20 loci found significant in the previous studies, only 4 were significant when the UK Biobank cohort was added, even though about 60% of the data used in this meta-analysis was from the previous studies.

If these loci were in fact legitimately correlated to brain volume and not just false positives, one should expect that adding more data would simply drive the point home that the correlation is real. Therefore, it appears likely that they were false positives. Moreover, how many of the 20 loci would you expect to retain significance when you add more data if the initial results were in fact false positives? I think it's fair to say that only four retaining significance is an indication that we are dealing only with false positives, in the way that a higher number of coin flips will move us inevitably toward 50-50 randomness.
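That coin-flip intuition is easy to demonstrate with a toy simulation. This is purely illustrative: the loci below have no real effect by construction, and the sample sizes and significance threshold are made up, so nothing here models the actual GWAS:

```python
# Toy illustration (not the study's analysis): simulate loci with NO real
# effect, flag the ones that look "significant" in a small initial sample,
# then check how many survive when more null data is added.
import random
import statistics

random.seed(42)

def z_stat(values):
    """One-sample z-like statistic: mean divided by its standard error."""
    n = len(values)
    se = statistics.stdev(values) / n ** 0.5
    return statistics.mean(values) / se

N_LOCI = 2000          # loci tested, all with a true effect of zero
N1, N2 = 100, 400      # initial sample size, then the added data
THRESH = 2.5           # |z| above this counts as "significant" here

initial_hits = 0
survivors = 0
for _ in range(N_LOCI):
    first = [random.gauss(0, 1) for _ in range(N1)]
    if abs(z_stat(first)) > THRESH:                 # false positive in batch 1
        initial_hits += 1
        combined = first + [random.gauss(0, 1) for _ in range(N2)]
        if abs(z_stat(combined)) > THRESH:          # still "significant"?
            survivors += 1

print(f"initial false positives: {initial_hits}; "
      f"still significant with more data: {survivors}")
```

Run it a few times without the fixed seed and the pattern holds: most loci that clear the threshold in a small sample fall back below it once null data is added, which is exactly the kind of decay we see going from ~20 loci down to 4.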

Thus, 16 out of 20 loci couldn't even replicate when the data that originally found them significant was included, and it is likely the other 4 will fade as more data are added. The study, in effect, found nothing but a bunch of new, unreplicated loci, with no explanation for why these loci were not found previously.

This is usually the point where I admonish the authors of a study like this for not doing an independent GWAS of the new data (in this case, the UK BioBank data) and challenge them to do so, with the assumption that none of the loci that were found in the previous studies would independently replicate. Interestingly, though it isn't given much mention, the authors did do that.  

Within the UKB data, we identified 3,610 genome-wide significant (GWS; P < 5 × 10−8) variants, tagging 9 independent genomic loci. 
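As an aside, the P &lt; 5 × 10−8 threshold quoted there is not an arbitrary number; it is the standard Bonferroni correction of a 0.05 family-wise significance level for roughly one million independent common variants (the one-million figure is the conventional estimate, not something taken from this paper):

```python
# Genome-wide significance threshold as a Bonferroni correction:
# family-wise alpha divided by the assumed number of independent tests.
alpha = 0.05
independent_tests = 1_000_000  # conventional estimate for common SNPs
threshold = alpha / independent_tests
print(threshold)
```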

So independently performing a GWAS of the UK Biobank data resulted in 9 loci reaching genome-wide significance. This brings up a couple of questions. The first is whether any of these 9 loci were also found in the other cohorts. This is difficult to determine, as they don't lay out the actual results and they don't combine the two previous studies. They do, however, have unlabeled Manhattan plots for all three cohorts (graphs showing which loci are above the significance threshold).


Panel a) is the UK BioBank Manhattan plot. I won't make too much of this, and the authors are welcome to clarify, but what I'm seeing is three different Manhattan plots without any obvious symmetry, so I will assume, unless shown otherwise, that there was no independent replication amongst any of these.

Another question related to the 9 loci found in the independent GWAS of the UK Biobank is how many of these match the 18 found in the combined GWAS. This information is also not immediately available, but going through the lead SNPs that met significance in the meta-analysis and comparing them to the listed result for the same SNP in the UK Biobank study yielded 6 hits, so I will assume that 6 of the 9 from the UK Biobank GWAS retained significance when the other cohorts were added. Again, we see a diminishing result, suggestive of false positives losing significance as more data is added. Interestingly, the 3 other loci found significant in the UK Biobank GWAS seem to get no mention in the paper. One thing this highlights is that the order of data collection and GWAS use can yield more and different loci, as I facetiously pointed out in a recent post.
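For what it's worth, the hand comparison described above amounts to nothing more than a set intersection over lead-SNP identifiers. The rs IDs below are placeholders I made up, not the study's actual SNPs:

```python
# Hypothetical lead-SNP lists; intersecting them shows which loci
# remain significant in both the meta-analysis and the UKB-only GWAS.
meta_hits = {"rs0001", "rs0002", "rs0003", "rs0004"}
ukb_hits = {"rs0002", "rs0003", "rs0005"}
shared = sorted(meta_hits & ukb_hits)
print(shared)  # ['rs0002', 'rs0003']
```

Had the paper published its full lead-SNP tables, this comparison would take one line rather than a manual trawl through supplementary results.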

So when you have 3 studies using disparate measures of the brain, each with different significant correlations, then combine them and create mostly new correlations, none of which has been independently replicated, where do you go next? Well, you compare your results to a GWAS of another trait, apparently. In this case they compare it to a GWAS of "intelligence," measured by g. Leaving aside the fact that g is controversial in its own right, why would you compare these particular traits? Well, because you are obsessed with a century-and-a-half-old idea that a big brain is going to make you smarter. This is just childish stuff.

In this case, as best I can tell, they don't even directly compare the significant loci from the brain volume GWAS with the loci from the intelligence GWAS, presumably because they also don't match (I welcome the authors to clarify). Instead they compare "gene sets" in what I think is a rather convoluted way.

We found a significant positive genetic correlation with intelligence using previously published GWAS results [53], confirming the genetic overlap estimated from twin studies [46]. We then explored whether specific genes or gene-sets drive this genetic correlation, and identified 92 genes that are associated with both traits. Of these 92, 32 genes were indicated as the most plausible ones explaining the shared genetic etiology. 

So, lacking any direct replication, it appears that they put this all into a computer program, using MAGMA, which is now known for inflated correlations. This is not easy to critique, since it is hidden behind a lot of computer-generated results, but let's ask the obvious question of why they would find 92 such genes and then deem 60 of them not plausible enough to explain the shared genetic etiology of brain volume and intelligence. What is the null here? Can you pick any two traits for such a comparison and come up with 92 genes associated with both? Well, who knows? If you did, could you then sift through the 92 and find 32 that you find plausible? Well, who knows that, either? The authors then also tried to take the correlations they found and correlate them more specifically to specific brain regions:
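To make the "what is the null" question concrete, here is a back-of-the-envelope expectation: if each trait were associated, even spuriously, with an independently drawn set of genes, some overlap would occur by chance alone. The per-trait gene counts below are my own illustrative assumptions, not numbers from the paper:

```python
# Expected chance overlap of two independent gene sets drawn from the genome.
genome_genes = 20_000    # rough count of human protein-coding genes
trait_a_genes = 1_000    # assumed genes linked to brain volume (illustrative)
trait_b_genes = 1_000    # assumed genes linked to intelligence (illustrative)
expected_overlap = trait_a_genes * trait_b_genes / genome_genes
print(expected_overlap)  # 50.0 genes shared purely by chance
```

Under these assumptions, dozens of shared genes are expected with no shared etiology at all, which is why a raw count of 92 means little without a stated null.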

Specifically, using permutation analyses we compared the expression of these 92 overlapping genes across 57 brain regions to that of 10,000 randomly selected, equally large, sets of genes drawn from the 1900 genes related to either or both of the traits. Although we observed overexpression of the 92 overlapping genes in the anterior part of the fusiform (two-sided permutation test N = 10,000 permutations P = 0.015) and the parahippocampal gyrus (P = 0.005), none of the associations survived a conservative Bonferroni correction (P = 8.77 × 10−4; 0.05/57);
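The quoted correction is easy to verify; with 57 regions, neither nominally significant P value comes anywhere near the Bonferroni threshold:

```python
# Bonferroni threshold for 57 brain regions, and the two regions the
# paper reports as nominally significant.
regions = 57
alpha = 0.05
threshold = alpha / regions
print(f"{threshold:.2e}")  # 8.77e-04, matching the quoted value

for name, p in [("anterior fusiform", 0.015),
                ("parahippocampal gyrus", 0.005)]:
    print(name, "survives" if p < threshold else "does not survive")
```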

I think this should ring some alarm bells. If you were really going to find a genuine correlation between brain size and intelligence, why would you only find it for overall brain size, which includes likely superfluous material (spinal fluid and white matter) and is otherwise non-specific, but not find size correlations for the specific areas of the brain associated with specific cognitive functions? Even phrenologists pushed in that direction. This should suggest that your correlations are likely false positives, in my view.

Let me close by pointing out something the authors admit in the study:

The conservative correlation of ~0.20, implies an explained variance of just 4%. This is low and therefore BV is not a good predictor of intelligence, and vice versa, intelligence is not a good predictor of BV. 
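The arithmetic behind that admission is just the square of the correlation coefficient, which gives the fraction of variance in one trait accounted for by the other:

```python
# Explained variance is r squared.
r = 0.20            # the genetic correlation the authors report
explained = r ** 2
print(f"{explained:.0%}")  # 4%
```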

In addition to the likelihood that even this 4% is noise, what is the value of focusing on such studies? It is my view that this is ideologically driven and has no real value other than to promote the idea that some groups are genetically smarter than others. This might be unconscious bias on the part of those doing the study, but I stand by it.



