Friday, December 18, 2020

Another Schizophrenia Twin Study That Only Looks Good When You Skim The Results

In medical school, my Psychiatry Residency, and even the Psychiatry Board Exams, the concordance rate noted for Schizophrenia was 50%. The assumption here is that if one identical twin was diagnosed with schizophrenia, then the other one had a 50% chance of also being diagnosed with schizophrenia (I am told this is still the conventional wisdom). This is an impressive number, even if it doesn't explain why the other 50%, also genetically identical to their schizophrenic siblings, are not also diagnosed with schizophrenia. Well, it appears this is far from accurate, as I've discovered when looking at the actual studies, which we admittedly rarely did in our training, as we filled our heads with the "facts" we needed to pass our training and board certification. It appears to be a bit of statistical sleight of hand. [Hat tip to Jay Joseph (blog linked in my blog roll) for looking at this study a bit past the abstract]:


Heritability of Schizophrenia and Schizophrenia Spectrum Based on the Nationwide Danish Twin Register

If one peruses the abstract of this study, you are met with this:

The probandwise concordance rate of SZ is 33% in monozygotic twins and 7% in dizygotic twins. We estimated the heritability of SZ to be 79%. 

Does that mean that if your identical twin has schizophrenia, you have a 33% chance of having schizophrenia? No, I don't think it does. Does it mean that you have a 79% chance with that stated heritability? No, it doesn't, either.

In fact, based on this study, if one identical twin is diagnosed with schizophrenia, the other was diagnosed with schizophrenia only 14.8% of the time. While that's higher than you would expect at random, it is a far cry from what you might think if you skim the study and feels a bit deceptive, really. So let's see where they come up with their figures.
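As a rough illustration of why a "probandwise" rate comes out higher than the plain "pairwise" rate for the same set of twins, here is a minimal sketch. The pair counts are hypothetical (they are not taken from the Danish study), and the probandwise formula assumes the simplest case, in which every affected twin is counted as a proband. Whatever further adjustments the paper applies are not modeled here.

```python
def pairwise_concordance(concordant_pairs, discordant_pairs):
    """Share of affected pairs in which BOTH twins are diagnosed."""
    return concordant_pairs / (concordant_pairs + discordant_pairs)


def probandwise_concordance(concordant_pairs, discordant_pairs):
    """Share of affected twins (probands) whose co-twin is also diagnosed.

    Assumes every affected twin is counted as a proband, so each
    concordant pair contributes two probands.
    """
    probands_with_affected_cotwin = 2 * concordant_pairs
    all_probands = 2 * concordant_pairs + discordant_pairs
    return probands_with_affected_cotwin / all_probands


# Hypothetical counts, chosen only for illustration.
hyp_concordant = 10
hyp_discordant = 90

pairwise = pairwise_concordance(hyp_concordant, hyp_discordant)
probandwise = probandwise_concordance(hyp_concordant, hyp_discordant)

print(f"pairwise:    {pairwise:.1%}")
print(f"probandwise: {probandwise:.1%}")
```

With the same pairs, the probandwise figure is always at least as large as the pairwise one, because concordant pairs are counted twice. That is worth keeping in mind when an abstract reports only one of the two.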

Tuesday, December 15, 2020

Old Schizophrenia Twin Study That Tells a Different Story

 This study of Finnish Twins is originally from 1984:

Psychiatric Hospitalization in Twins

I think it makes some interesting points and I'm surprised I hadn't seen it before (Hat tip to Jay Joseph). Throughout my residency, I was told that there was a 50% concordance rate for schizophrenia among identical twins. I don't recall this study ever being referenced. It used hospitalization records and seems to have found a much lower concordance rate:
Pairwise concordance rates for schizophrenia (11.0% for MZ and 1.8% for DZ) seem to indicate great environmental influence (high proportion of discordant pairs) with apparent genetic liability (6.1-fold ratio in concordance between MZ and DZ pairs).

That's a surprisingly low figure. Perhaps because they used hospitalization records rather than interviews there was less bias or perhaps one twin wasn't hospitalized when the other was. 

Of course, one might jump on the fact that at least the concordance rate is significantly higher for MZ than DZ, even if not impressive. It's worth pointing out, though, that since doctors are regularly trained to take a family history and are more likely to diagnose someone with schizophrenia if they have a close relative with that diagnosis, that there is potential for inflation. 

I think such inflation would favor MZ twins in particular and this is an impressive point in the article:

Of the MZ pairs concordant for psychiatric hospitalization, 47% had lived together for their whole life time; of those discordant, 16% lived together. The corresponding figures for DZ pairs were 18% and 15%.

It is interesting that the MZ twins who lived together were more frequently diagnosed concordantly with schizophrenia, while not true of DZ twins. I am extrapolating, here, but it also appears that MZ twins are more likely to live together than DZ twins, which suggests some bonding that again brings into question the idea that MZ twins and DZ twins can be compared in this way (for more, see Jay Joseph's work on the EEA). 

Wednesday, December 9, 2020

Interesting Study Related to Cognitive Decline from Schizophrenia

 This study assessed whether cognitive decline from Schizophrenia has a genetic component. 

Schizophrenia polygenic risk predicts general cognitive deficit, but not cognitive decline in healthy older adults

In the early years of psychiatry, Schizophrenia was called "Dementia Praecox," a term coined by Emil Kraepelin, that described the deterioration of cognition associated with schizophrenia more so than the symptoms we normally associate with the disorder. From Wiki:

Dementia praecox (a "premature dementia" or "precocious madness") is a disused psychiatric diagnosis that originally designated a chronic, deteriorating psychotic disorder characterized by rapid cognitive disintegration, usually beginning in the late teens or early adulthood.

This term is no longer used, but the concept behind it is still accepted, that there is a progressive dementia among schizophrenic patients. The idea behind this study is that, assuming the polygenic model of schizophrenia holds true, if someone is not schizophrenic, but has a high polygenic score for schizophrenia (has a lot of the identified variants), then one might expect them to have some cognitive decline. That was not the case, as the study points out:

These results do not support the neo-Kraepelinian notion of schizophrenia as a genetically determined progressively deteriorating brain disease.

I think what this suggests is that schizophrenia, itself, is the cause of the cognitive deterioration, rather than the other way around. Moreover, it challenges the polygenic model of schizophrenia and the idea of a "continuum" related to the number of susceptibility genetic variants, as Robert Plomin suggested in the book, "Blueprint."

Wake Up Call for Insomnia GWAS

Here is another GWAS, this time for insomnia, that I think buries the lead:

Genome-wide meta-analysis of insomnia in over 2.3 million individuals implicates involvement of specific biological pathways through gene-prioritization

Here's an alternate title:

Based on 1.3 million GWAS, the maximum variance explained was 2.6% and based on 2.3 million individuals the maximum variance explained seems to be only 2% !

- (Hat tip to Veera M. Rajagopal, twitter handle: @doctorveera, who might not really appreciate the hat tip)

Obviously, there is a problem here when, even after finding novel loci by expanding your dataset, you are getting worse "variance explained" from your PRS. I think this suggests that they have already reached their peak, which seems to run in the 2 to 3% range for most behavioral traits. I will once again point out that even this number is suspect, since it is not compared to any null trait. They try to rationalize it by suggesting that the added data (from 23andMe) was less stringently phenotyped, but you can't have it both ways. Expanding datasets does not appear to give us any more real insight. It just bumps up the number of loci meeting significance, which is arguably just a collection of false positives.

As the datasets expand beyond just white Europeans, I suspect this will only further water down the success of these studies, since they will not be able to rely as much on pop strat to get correlations.

Bipolar Genetics Makes No Progress

Nice critique by Peter Simons of a genetic study for bipolar disorder among Han Chinese with the diagnosis  of Bipolar Disorder (original study here). A couple of excerpts:

The researchers analyzed thousands of Han Chinese people and found that genetics explained just 2.3% of whether they received a diagnosis of bipolar disorder (BD) or not...

However, it is unclear how tiny correlations like this—which affect but a tiny sample of the population studied and explain less than 3% of the risk for a diagnosis—could help researchers understand the supposed “biological etiology” of bipolar disorder. In fact, they rather show that more than 97% of the reason that someone gets a diagnosis is explained by factors other than biology. 

As I like to point out, even the 3% is quite suspect and is arguably noise and should be tested against a null trait to establish that the 3% is not the null.

Tuesday, December 1, 2020

The Unembarrassed Bot

A shortcut version of the usual GWAS is a bot that simply cranks out a Manhattan plot with no further analysis. While those who do traditional GWAS downplay it, there is really little difference between what they are doing and what the bot is doing, other than some shoddy speculation and perhaps a bit of data cleaning. The real issue is that the bot does GWAS that most would be too embarrassed to publish, and these get a lot of hits. Take this one, for example, that ironically, without embarrassment, finds genetic variants for "worrying too long after embarrassment":


This should be a clear indication that silly false positives can be produced from anything you can ask on a questionnaire. In addition to the likelihood of some massive pop strat dependent on particular cultural backgrounds, what exactly is meant by "too long"? Is this a subjective opinion of the person or is it a specific amount of time? 


Saturday, November 28, 2020

If You Can't Make it Happen for Schizophrenia

 This is an expansion of a previous Schizophrenia GWAS:

Mapping genomic loci prioritises genes and implicates synaptic biology in schizophrenia

This is the PGC schizophrenia study. We hadn't really had an update since 2014. It appears they buried the lead with the usual false optimism. They went from 36,000 cases in the previous study to 69,000 in this one. We have been promised that polygenic risk scores (PRS) would explain more and more of the "missing heritability" as the study sizes increased. Well, in this case, the PRS variance explained went from 3.4% to ... 2.6%. Of course, they also admit that their previous figure of 3.4%, often cited in other papers, was calculated in error and was probably lower. Is that to make it look like the 2.6% is not that bad?

The fact of the matter is that this is a very bad result. This is not even a within family calculation, which one might expect to be very close to 0%. I think at this point, getting 2 or 3 percent of the variance explained is essentially a null finding, and I challenge any authors who claim otherwise to compare it to obvious null traits.

I might have more to say about this related to the loci they say reached significance, but can't find the old PGC data to compare it with directly. In any case, the only thing that increasing N does is bolster the number of "significant" loci and I expect none of these loci will independently meet statistical significance in any other study.

What this study really suggests, when you take away the spin, is that the entire model of a polygenic mechanism for schizophrenia is pie in the sky. This points to a larger problem, which is that if any psychiatric trait should be due to a physical (genetic) cause, one would think schizophrenia would be a sure thing. If you can't make it happen for schizophrenia, good luck making it happen for dubious diagnoses like ADHD or really any other psychiatric trait.

Friday, November 13, 2020

Genetic homogeneity does not reduce individuality (in fish)

This study takes genetically identical fish, puts them in identical environments, and demonstrates that they show a lot of individuality:

we find that (i) substantial individual variation in behaviour emerges among genetically identical individuals isolated directly after birth into highly standardized environments and (ii) increasing levels of social experience during ontogeny do not affect levels of individual behavioural variation. In contrast to the current research paradigm, which focuses on genes and/or environmental drivers, our findings suggest that individuality might be an inevitable and potentially unpredictable outcome of development.

 This, of course, is a fish study, but if even fish have such individuality, I think it is a good bet that the same can be said for humans.

Saturday, October 31, 2020

Genes for Walking at a Brisk Pace (yes, this is real)

 Another absurd GWAS:

Genome-wide association study of self-reported walking pace suggests beneficial effects of brisk walking on health and survival

The fact that I see scientists in the field taking this seriously rather than having some self-reflection is a good indication of the lack of common sense that drives these studies. I am not going to pick through it. I hereby just mock it.

Yet Another Study Showing that Polygenic Scores are Confounded by Population Stratification

 Another study related to height PGS:

Polygenic adaptation on height is overestimated due to uncorrected stratification in genome-wide association studies

It should at some point become clear that GWAS and their cousin, the "polygenic score" are little more than measures of population stratification and other such issues in the population being studied (or the database being used). In this case, here is the money shot:

More generally, our results imply that typical constructions of polygenic scores are sensitive to population stratification and that population-level differences should be interpreted with caution.

I would be grateful if someone could tell me what interpreting with caution would look like? How about stop making these interpretations, instead? 
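To make the stratification point concrete, here is a toy simulation. It is a sketch under made-up assumptions, not a reanalysis of any study. There are two hypothetical subpopulations, a single marker whose allele frequency differs between them (0.2 vs 0.8), and a trait that differs between the groups for purely non-genetic reasons (a 0.5 SD shift). The marker has no effect on the trait at all, yet pooling the groups produces a nonzero "variance explained." All names and parameters are invented for illustration.

```python
import random


def r_squared(xs, ys):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov * cov / (var_x * var_y)


def simulate_group(rng, n, allele_freq, trait_shift):
    """Draw genotypes (0, 1 or 2 copies) and a trait with NO genetic effect."""
    genotypes, traits = [], []
    for _ in range(n):
        copies = int(rng.random() < allele_freq) + int(rng.random() < allele_freq)
        genotypes.append(copies)
        # The trait depends only on group membership plus noise.
        traits.append(trait_shift + rng.gauss(0.0, 1.0))
    return genotypes, traits


rng = random.Random(2020)  # fixed seed so the run is repeatable
geno_a, trait_a = simulate_group(rng, 5000, allele_freq=0.2, trait_shift=0.0)
geno_b, trait_b = simulate_group(rng, 5000, allele_freq=0.8, trait_shift=0.5)

pooled_r2 = r_squared(geno_a + geno_b, trait_a + trait_b)
within_a_r2 = r_squared(geno_a, trait_a)
within_b_r2 = r_squared(geno_b, trait_b)

print(f"pooled R^2:         {pooled_r2:.4f}")
print(f"within group A R^2: {within_a_r2:.4f}")
print(f"within group B R^2: {within_b_r2:.4f}")
```

Under these invented parameters the pooled figure should land somewhere around 3%, while the within-group figures sit near zero. Real studies apply corrections this toy ignores, so it only shows that a few percent of "variance explained" can arise with no genetic effect whatsoever.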

Sunday, September 20, 2020

GWAS Meta-analysis for Bipolar Disorder Gives Glowing Analysis, but is impossible to Interpret (Again)

 A brief review of this GWAS for Bipolar Disorder:

Genome-wide association study of over 40,000 bipolar disorder cases provides novel biological insights (Mullins et al. )

Like almost all behavioral genetic GWAS, this one is a meta-analysis: new data was added to previous data, and the new data was never assessed independently (at least in print). Thus, it is difficult to assess statistically what is success and what is failure, although it is filled with the usual accolades:

This GWAS provides the best-powered BD polygenic scores to date, when applied in both European and diverse ancestry samples. Together, these results advance our understanding of the biological etiology of BD, identify novel therapeutic leads and prioritize genes for functional follow-up studies.

 Well, the best and the only, really. But, of course, I have a lot of questions. The first is related to their significant loci count, and for which I needed partial clarification from one of the authors, as I will discuss after the fold (click "read more" to continue).

Saturday, September 12, 2020

Weekend at Bernie's for Behavioral Genetics

Here is what I think is an attempt by Paige Harden at a behavioral genetics pivot:

“Reports of My Death Were Greatly Exaggerated”: Behavior Genetics in Postgenomic Era

On the contrary, I'd say that this is an attempt to prop up a corpse. The piece starts by basically burying "candidate gene" studies, which were the previous propped up corpse they spent a couple of decades convincing us was proof of genetic correlations for behavior (and personality and intelligence). Well, no self-reflection about the fact that something you were sure about for so long turned out to be nothing. It's easier to throw the past in the dustbin than consider the possibility that we are still working with dust. The candidate genes were largely killed by GWAS, which appears to have been their only useful function. We are now in the second wave of this, with GWAS and pgs largely in a death spiral, which was really not acknowledged by those in the field prior to this piece, to my knowledge. Thus, I am reporting their death, and I don't exaggerate. However, Harden does exaggerate here:

Overall, GWAS results have yielded two general lessons for psychology. First, traits of interest to psychologists are massively polygenic, meaning that they are associated with thousands upon thousands of genetic variants scattered throughout the genome, each of which has a tiny effect. This has been called the fourth law of behavior genetics (Chabris et al. 2015). Second, the aggregate predictive power of measured genetic variants, in some cases, rivals the predictive power of traditional social science variables, such as family socioeconomic status (SES) (Lee et al. 2018). 

Tuesday, August 25, 2020

Parental Wealth is Pop Strat

This paper discusses how parental wealth is strongly prone to assortative mating.

...parental wealth homogamy is high at the very top of the parental wealth distribution, and individuals from wealthy families are relatively unlikely to partner with individuals from families with low wealth. Parental wealth correlations among partners are higher when only parental assets rather than net wealth are examined, implying that the former might be a better measure for studying many social stratification processes. Most specifications indicate that homogamy increased in the 2000s relative to the 1990s, but trends can vary depending on methodological choices. The increasing levels of parental wealth homogamy raise concerns that over time, partnering behavior has become more consequential for wealth inequality between couples.

The reason this is important in terms of genetic studies is that it creates population stratification that will no doubt present itself as genetic correlations, giving the impression that genes exist for income (yes, such studies have been done), as well as other things like educational attainment, when all that is really happening is that the rich are keeping it in the family, so to speak. They noted that it was less common, but still an issue in Denmark, where the study was done, but worse in other countries (US and UK, for example). Thus these GWAS that purport to show such genetic correlations are likely really demonstrating that certain social/ethnic groups have an unfair proportion of wealth. By this token, if you were going to use "polygenic scores" for educational attainment to decide who should get into an elite school, it would be more fair for those with the lowest scores to be given preference...

Sunday, August 23, 2020

The Tarot of Reading Neuroimaging

 I'm linking to this piece discussing the lack of consistency among researchers reading and interpreting Neuroimaging studies, because I think this highlights why it is akin to phrenology or perhaps Tarot card readings. 

 Research Teams Reach Different Results From Same Brain-Scan Data 

 "When 70 independent teams were tasked with analyzing identical brain images, no two teams chose the same approach and their conclusions were highly variable. "

The fact that these are static neuroimages gives the impression of some sort of consistent, definitive interpretation. But Tarot Cards are also the same deck no matter who is laying out the cards. "It's in the cards," they will say. But, in truth, it is in the card reader. 

 


Monday, August 17, 2020

My Four Laws of the Behavioral Genetics Fallacy

I discussed these at more length here, as a response to Eric Turkheimer's Three Laws of Behavior Genetics, but just wanted to lay them out in one short post (credit Turkheimer for the second, which is his third).

My Four Laws of the Behavioral Genetics Fallacy:

1. Any behavioral trait studied within a society will be correlated genetically to specific subpopulations, regardless of whether these genetic correlations are directly related to the trait.

2. A substantial portion of the variation in complex human behavioral traits is not accounted for by the effects of genes or families.

3. Differences in human behavior, intelligence and personality are not accounted for by structural or functional differences in the brain.

4. Advancements in understanding human behavior and psychology require inner exploration from the scientist, the subject or both.

Sunday, August 16, 2020

Some Comments on "The Three Laws of Behavior Genetics" (and the two other laws)

Twenty years ago, Eric Turkheimer wrote an often cited paper titled:
Three Laws of Behavior Genetics and What They Mean
This paper is still often cited today and perhaps has taken on a life of its own, with a more deterministic interpretation than Turkheimer apparently intended; he recently clarified his original intent in a blog post. Nonetheless, much of the criticism of his paper comes from those in the genetic determinism camp, with the extremes being the "race scientist" crowd. So, since I sit on the other end of the see saw from the genetic determinists with Turkheimer poised somewhere in the middle, I will weigh in with my own thoughts about his three laws, as well as the two additional non-Turkheimer laws added into the soup. In the process of this, I will posit my own Four Laws of the Behavioral Genetics Fallacy. First, let's lay out the three laws that Turkheimer posits:
First Law. All human behavioral traits are heritable.

Second Law. The effect of being raised in the same family is smaller than the effect of genes.

Third Law. A substantial portion of the variation in complex human behavioral traits is not accounted for by the effects of genes or families.
I agree with perhaps one and a half of these laws. I’ll start with my half agreement with the First Law. For starters, I take issue with the use of the term “heritable,” because the term predates genetics and has had many different meanings and interpretations over the years, as this article points out:
The term ‘heritability,’ as it is used today in human behavioral genetics, is one of the most misleading in the history of science. Contrary to popular belief, the measurable heritability of a trait does not tell us how ‘genetically inheritable’ that trait is. Further, it does not inform us about what causes a trait, the relative influence of genes in the development of a trait, or the relative influence of the environment in the development of a trait.

Friday, August 14, 2020

Genetic Prediction of Schizophrenia via Polygenic Risk Score Has No Clinical Utility

A new study from Schizophrenia Bulletin tested various risk factors on a group of individuals in the Netherlands. In addition to other risk factors, they used a Polygenic Risk Score (PRS) developed from previous studies. This summarizes the results:

We calculated the relative contribution of each (group of) risk factor(s) to the variance in (change in) mental health. In the combined model, familial and environmental factors explained around 17% of the variance in mental health, of which around 5% was explained by age and sex, 30% by social circumstances, 16% by pain, 22% by environmental risk factors, 24% by family history, and 3% by PRS for schizophrenia (PRS-SZ). Results were similar, but attenuated, for the model of mental health change over time. Childhood trauma and gap between actual and desired social status explained most of the variance.

This is a weak result all around, but particularly bad was the PRS, which had a predictive success of 0.5% (3% of 17%). Thus, just knowing a person's age and sex was almost twice as predictive as the PRS (5% vs 3% of the explained variance). If the person had a history of pain or other medical complaints, that alone was more than 5 times as predictive for schizophrenia (16% vs 3%). This is simply a dismal failure, and the continued hope that this will be good enough to be clinically useful is little more than wishful thinking. Realistically, to be clinically useful, it would have to be 25 to 50 times better than this, and I am guessing it has come close to peaking.
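For anyone who wants to check the arithmetic, here is a small sketch that turns the shares quoted in the abstract into absolute variance explained and compares each factor to the PRS. The only inputs are the percentages quoted above; the variable names are mine.

```python
# Total variance in mental health explained by the combined model (quoted: ~17%).
total_explained = 0.17

# Each factor's share OF THAT 17%, as quoted in the abstract.
shares = {
    "age and sex": 0.05,
    "social circumstances": 0.30,
    "pain": 0.16,
    "environmental risk factors": 0.22,
    "family history": 0.24,
    "PRS-SZ": 0.03,
}

# Absolute variance explained by each factor (share x total).
absolute = {name: share * total_explained for name, share in shares.items()}

# How many times more of the explained variance each factor carries than the PRS.
ratio_to_prs = {name: share / shares["PRS-SZ"] for name, share in shares.items()}

for name in shares:
    print(f"{name:28s} absolute={absolute[name]:.2%}  vs PRS: {ratio_to_prs[name]:.1f}x")
```

The shares add up to 100% of the explained variance, and the PRS line works out to about half a percent of the total variance in the outcome.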

Wednesday, August 12, 2020

Yet More UK BioBank Pop Strat Issues Noted.

 There are so many studies coming out noting population stratification issues that it is hard to keep track. This is an interesting preprint looking at CAD and BMI:

Fine-scale population structure confounds genetic risk scores in the ascertainment population

From the Abstract:

we investigated the accuracy of two different GRS across population strata of the UK Biobank, separated along principal component (PC) axes, considering different approaches to account for social and environmental confounders. We found that these scores did not predict the real differences in phenotypes observed along the first principal component, with evidence of discrepancies on axes as high as PC45. These results demonstrate that the measures currently taken for correcting for population structure are not sufficient, and the need for social and environmental confounders to be factored into the creation of GRS. 

One interesting aspect of this study, I think, is that it highlights how it can be necessary to have a good working knowledge of the population you are studying.  This plot is striking in that respect:

This was confined only to white European descent, but still had this kind of stratification. A larger point here is that more and more pop/strat issues arise, many of which were not accounted for in earlier studies and perhaps should lead to corrections. Moreover, for those doing GWAS in the future, particularly in the UK BioBank, it is worth having a bit of skepticism that at least some of what you are seeing is pop/strat that has yet to be recognized.

 

Wednesday, July 22, 2020

Another Paper Related to Pop Strat issues for GWAS

Another study discussing pop strat issues:

Demographic history impacts stratification in polygenic scores

Points out more issues with population stratification:
We show that when population structure is recent, it cannot be fully corrected using principal components based on common variants—the standard approach—because common variants are uninformative about recent demographic history.
They further note some limitations with sibling based studies:
While sibling-based association tests are immune to stratification, the hybrid approach of ascertaining variants in a standard GWAS and then re-estimating effect sizes in siblings reduces but does not eliminate bias. 
As I've argued previously, the "immune to stratification" point is not necessarily true, owing to factors like the varying ages of the siblings and the selection biases of the databases. Nonetheless, if using sibling studies "reduces but does not eliminate bias," and they are bringing the variance explained down to 2 or 3%, then arguably they are scraping along near the null. So, far from showing that some of the variance explained is retained in sibling studies, it might suggest that there is no real genetic component found.

Finally, it's worth pointing out that despite the growing number of studies showing pop/strat issues in the UK Biobank and other such databases, no one has taken it upon themselves to reevaluate their previous, published GWAS results in light of this. It's as if they are grandfathered in.

Friday, July 17, 2020

Depression genetic study finds nothing.

This study:
Analysis of 50,000 exome-sequenced UK Biobank subjects fails to identify genes influencing probability of psychiatric referral
Speaks for itself. There is, of course, the  usual hope for the future:
  It seems unlikely that depression genetics research will produce findings that might have a substantial clinical impact until far larger samples become available.

There is simply no reason to continue believing at this point that such genetic variants will be found. They simply don't exist. There needs to be a cutoff at which point this would be acknowledged, or this shell game will never end.

Thought Experiment on Genetics and Society

I did something like this on Twitter, but will expand it here:

Let's say we live in a society where all the citizens are genetically identical (1 male and 1 female genetic code) and further that progeny, through laboratory manipulation or the like, retain the same genetic code from one generation to the next:

Will there still be a social hierarchy in such a society? Wouldn't a society require professionals and a working class. If it was along the lines of our current system, some would be doctors and some janitors and some field workers. Would those born to wealthy and well educated families have a leg up in also achieving educational and professional success? Might one also expect some homeless, some people with substance abuse problems, some people who are unhealthy? Some who turn to a life of crime? Some who would be humanitarians? Would there not eventually be wars and some way in which groups would be prejudiced towards other groups? Would people be more alike, or would they go out of their way to differentiate and be even more diverse in personality?

This is all quite obvious, isn't it? Gene hunting is not going to uncover human nature. We are human beings first. For the most part, genes simply display the costume that each of us wears. The exceptions to this fact are simply that: exceptions.  Marking people's personal traits by identifying generally unrelated genetic variations does little more than create meaningless divisions in our society and a perception of humans as genetic automatons.

Thursday, July 16, 2020

"Educational Attainment" and the Wobbly Null

New study related to genetic studies of Educational Attainment:
Avoiding dynastic, assortative mating, and population stratification biases in Mendelian randomization through within-family analyses
Like a previous study, it makes the point that within family analysis significantly "attenuates" the educational attainment, in this case related to correcting for height. Here's the rub, though. In the previous study, the fact that EA was significantly diluted by within family results was somehow lauded as a demonstration along these lines: "At least there is still something, so it proves there is some genetic component to EA." This study, however seems to take the opposite approach:
The Mendelian randomization estimate using the sample of unrelated individuals implied that each 10 cm increase in height caused an increase of 0.17 (95%CI: 0.14–0.20, p-value = 8.5 × 10−26) years of education. After allowing for a family fixed effect, the Mendelian randomization estimate was greatly attenuated suggesting little evidence of a causal effect of height on education
In this case, the attenuation was taken as evidence of a null value, to demonstrate that they were able to get the pop strat out of the picture. However, if something like height has even a small effect on EA, and the results for EA genetic variants are tiny after within family analysis, then it's worth asking whether there are any actual genetic variants related to someone being better or smarter in a way that allows them to get more education (c'mon, this should be obvious), or whether there are just a few physical confounders giving us the slight variance accounted for. You can't have it both ways.

Friday, June 19, 2020

Genes for Substance Abuse Has Made No Progress, but Unjustified Optimism Continues

Yet another genetic study of substance abuse:

Using polygenic scores for identifying individuals at increased risk of substance use disorders in clinical and population samples
Highlights:
These PRSs explain ~2.5–3.5% of the variance in AUD (across FT12 and COGA) when all PRSs are included in the same model. 
...usefulness for identifying those at increased risk in their current form is modest, at best 
This was from an all white European sample, ftr, with the assumption that pop strat is accounted for. One can assume, as has been the case, that such pop strat will be found and water this down to next to nothing. That said, is the null 0% or is 2 or 3 % about as low as you can get? I'd be happy to see an example in which a PRS does worse than this.

So is the conclusion that perhaps we are barking up the wrong tree? Of course not:
 Improvement in predictive ability will likely be dependent on increasing the size of well-phenotyped discovery samples. 

The shell game continues...

Friday, April 17, 2020

Nice piece on genetic correlation vs causality

This piece:
What Causes Genes? A genetic association doesn't necessarily mean a genetic cause.
Gives a good overview of why genetic correlations don't necessarily live up to their billing. (From Jaime Derringer, Ph.D.).

Addendum: As a reader mentions, this piece was apparently inspired by this study:
Population phenomena inflate genetic associations of complex social traits.

From that paper:
In conclusion, our results demonstrate some of the causal structures that may bias univariate and bivariate genetic estimates such as heritability and genetic correlations, particularly when applied to complex social phenotypes. 

Friday, April 10, 2020

Study Showing Weakness of PGS, even within ancestry

I already put up a blog post on the preprint of this new study last year:

Variable prediction accuracy of polygenic scores within an ancestry group
Here is the Abstract:
Fields as diverse as human genetics and sociology are increasingly using polygenic scores based on genome-wide association studies (GWAS) for phenotypic prediction. However, recent work has shown that polygenic scores have limited portability across groups of different genetic ancestries, restricting the contexts in which they can be used reliably and potentially creating serious inequities in future clinical applications. Using the UK Biobank data, we demonstrate that even within a single ancestry group (i.e., when there are negligible differences in linkage disequilibrium or in causal alleles frequencies), the prediction accuracy of polygenic scores can depend on characteristics such as the socio-economic status, age or sex of the individuals in which the GWAS and the prediction were conducted, as well as on the GWAS design. Our findings highlight both the complexities of interpreting polygenic scores and underappreciated obstacles to their broad use. 
Damning on its face, but the authors appear unwilling to give up the ship, and give only a few passing mentions of pop strat and other confounding issues with these large genetic databases. At what point do you reject the model if the studies aren't giving you the expected results? Time will tell...

Tuesday, March 24, 2020

More Bias in DNA databanks.

This study:
Genetic analyses identify widespread sex differential participation bias
is yet another example of the bias problems in these large consumer and other databases. This one looked at several, including 23andMe and the UK Biobank.
With 23andMe, a GWAS just for "male vs. female" had 150 "significant" loci, and many of these loci were previously correlated to complex traits in other GWAS that used the database. This is a problem, because it suggests that many of the previously discovered loci for particular traits might actually just be an indication of bias in the databank and have no causal relationship to the trait, as the authors point out:
Finally, we demonstrate how these biases can potentially lead to incorrect inferences in downstream analyses and propose a conceptual framework for addressing such biases. Our findings highlight a new challenge that genetic studies may face as sample sizes continue to grow.
A broader problem related to this is every GWAS performed to date using the biased databank, since this form of bias was not recognized when those studies were performed. I don't expect it to happen, of course, but this should lead to a reevaluation of any GWAS previously performed using the database, with a correction that will further dwindle the results. Sex difference is an easy-to-recognize bias to test for, but there are no doubt many more that remain unrecognized. The fact of the matter is that you will never be completely sure you have eliminated them all, so you can never say for sure whether you are finding anything but noise in these studies (I think that is the case, for the record, with behavioral genetic phenotypes). So in addition to population stratification issues in these studies, which also never seem to be fully recognized, the databases themselves have their own stratification issues.
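To make the mechanism concrete, here is a toy simulation. It is a sketch with made-up numbers, not the study's method or data: a variant that is independent of sex in the full population nudges a hypothetical trait, and I assume men sign up more often the higher that trait is while women sign up at a flat rate. The effect size is exaggerated so the pattern is visible.

```python
import math
import random

def simulate(n=200_000, seed=1):
    """Return (population, participants) as lists of (sex, genotype) tuples."""
    rng = random.Random(seed)
    population, participants = [], []
    for _ in range(n):
        sex = rng.choice("MF")
        # Autosomal variant: genotype (0, 1 or 2) is independent of sex by construction.
        genotype = (rng.random() < 0.5) + (rng.random() < 0.5)
        # Hypothetical trait, nudged upward by the variant (exaggerated effect).
        trait = 0.5 * genotype + rng.gauss(0.0, 1.0)
        # Assumed participation model: trait-dependent for men, flat for women.
        if sex == "M":
            p_join = 1.0 / (1.0 + math.exp(-trait))
        else:
            p_join = 0.3
        population.append((sex, genotype))
        if rng.random() < p_join:
            participants.append((sex, genotype))
    return population, participants

def sex_gap(sample):
    """Mean genotype in men minus mean genotype in women."""
    men = [g for s, g in sample if s == "M"]
    women = [g for s, g in sample if s == "F"]
    return sum(men) / len(men) - sum(women) / len(women)

population, participants = simulate()
print(f"gap in full population: {sex_gap(population):+.3f}")
print(f"gap among participants: {sex_gap(participants):+.3f}")
```

In the full population the variant shows essentially no difference between men and women, but among those who signed up it does. A GWAS of "male vs. female" run only on participants would flag this variant, even though it has nothing to do with sex.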

Interesting Update: Another study just came out that incidentally looked at the same thing in the UK Biobank. This one found NO hits. I think this is likely a good demonstration of how participation bias created a very large number of false positives in 23andMe, while the UK Biobank, which perhaps didn't have the same participation bias, produced none. It shows that a large number of "significant" hits can be produced simply with noise. Again, we are left with the question of whether any of the findings from these studies are true genetic correlations.



Friday, March 13, 2020

The Trickle down of GWAS to Race Science

I like to point out that many of the genetic studies related to "IQ," "g" and "Educational Attainment," whether or not their intentions were good, tend to attract racists of varying degrees, from the smooth-talking race scientists down to white nationalists and overt racists trying to read the study as saying that whites are smarter than blacks because of their genes, which is a misinterpretation (leaving aside the fact that most of the studies are unreplicable). This study, which examines which people tend to pick up particular studies on social media sites like Twitter, quantifies this and notes:
Our study provides conclusive quantitative evidence that white nationalists and adjacent communities are engaging with the scientific literature on Twitter. Not only are these communities a ubiquitous presence in the social media audience for certain research topics, but they can dominate the discourse around a particular preprint and inflate altmetric indicators.
Often, once this process begins, the scientists involved in the study and other experts in the field attempt to debunk this misappropriation of the science. Unfortunately, this does little more, in my view, than amplify the debate in a "both sides" dichotomy, effectively giving credence, or at least attention, to the racist views. While scientists will try to defend or find a use for such studies to justify their existence, these are often a reach and fall flat, leaving one to ask what purpose they serve other than to energize racists? 

Tuesday, February 4, 2020

Genes for Getting Beaten Up or Mistreated as a Child (Yes, this is a real study)

This is an actual genetic study to which some people are proud to have their names attached:

Genomic influences on self-reported childhood maltreatment

The study failed to replicate (big surprise) and is entirely bogus. I could go through it and pick it apart but, instead, I am just going to say that the implication here is that there are genes for getting beaten up as a kid. I guess that's what they are trying to say, anyway. I find this disgusting and refuse to engage with it further. The authors should be ashamed of themselves for printing it. I'll also point out that this is another example of a UK Biobank study where the application for the use of the database was deliberately vague. If you need to use subterfuge to get your study printed, then you are being doubly unethical. Hopefully, as noted in my previous post, those producing this type of drivel will be prevented from further use of the UK Biobank.