Neil Davies discusses a new paper on a genome-wide association study of almost 180,000 siblings and explains what additional insight siblings bring to such studies.
Thousands of genome-wide association studies (GWAS) have been published; however, the vast majority have used samples of unrelated individuals. We have recently published a sibling GWAS in Nature Genetics. In our study, we used almost 180,000 siblings across 19 studies from around the world. But why are siblings interesting for GWAS?
GWAS have already identified tens of thousands of single nucleotide polymorphisms (SNPs) related to phenotypes – using samples of unrelated individuals. However, correlation is not equal to causation. Increasing evidence suggests these associations can be driven by more than individual-level biological effects.
There can be three key sources of bias. The first potential bias is population stratification. This means that the frequency of genetic variants differs between subpopulations that also differ in the phenotype. For example, Irn-Bru consumption will associate with variants more common in Scotland. These associations are biased evidence of the causal effect of the variant on the phenotype!
The second bias is assortative mating. People don’t mate at random. For example, studies have shown that more educated people tend to have more educated and taller partners. Such trends can result in biased associations between SNPs and phenotypes in the offspring.
The third bias is indirect parental genetic effects (also known as dynastic effects). In these, the genotype is expressed in parents, which in turn affects offspring outcomes. One example of this is that the education of parents may influence educational outcomes in the offspring, again biasing SNP-phenotype associations.
How can data from siblings help overcome these biases? Siblings inherit their genetic variants from their parents at random. They are nature’s randomised controlled trials. If the siblings who share the genotype have more similar trait measures, researchers can be more confident that the genotype is influencing the trait directly.
Looking at the differences between siblings controls for each of the sources of bias above.
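To make the logic concrete, here is a minimal simulation sketch in Python. Everything in it is an assumption chosen for illustration (two hypothetical subgroups with allele frequencies of 0.3 and 0.7, a phenotype shift of 1.0 in one subgroup, a direct effect of 0.2); it is not the analysis pipeline used in the paper. It mimics population stratification and compares a population estimate with an estimate based on sibling differences.

```python
import random


def transmit(parent_genotype, rng):
    """Pass on one allele (0 or 1) from a parent carrying 0, 1 or 2 copies."""
    if parent_genotype == 1:
        return rng.randint(0, 1)  # heterozygous parent: coin flip
    return parent_genotype // 2  # homozygous parent: always the same allele


def slope(x, y):
    """Ordinary least-squares slope of y on x."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var = sum((a - mean_x) ** 2 for a in x)
    return cov / var


def simulate(n_families=20000, direct_effect=0.2, seed=1):
    """Return (population estimate, within-sibling estimate) for one variant."""
    rng = random.Random(seed)
    pop_g, pop_y, diff_g, diff_y = [], [], [], []
    for _ in range(n_families):
        # Hypothetical stratification: group B has a higher allele frequency
        # AND a higher phenotype for reasons unrelated to the variant.
        in_group_b = rng.random() < 0.5
        freq = 0.7 if in_group_b else 0.3
        shift = 1.0 if in_group_b else 0.0
        mother = sum(rng.random() < freq for _ in range(2))
        father = sum(rng.random() < freq for _ in range(2))
        sibs = []
        for _ in range(2):
            g = transmit(mother, rng) + transmit(father, rng)
            y = direct_effect * g + shift + rng.gauss(0, 1)
            sibs.append((g, y))
        (g1, y1), (g2, y2) = sibs
        pop_g.append(g1)  # one sibling per family, as if unrelated
        pop_y.append(y1)
        diff_g.append(g1 - g2)  # the family-level shift cancels out here
        diff_y.append(y1 - y2)
    return slope(pop_g, pop_y), slope(diff_g, diff_y)


population_estimate, within_sibling_estimate = simulate()
print(population_estimate, within_sibling_estimate)
```

In this set-up the population estimate should land well above the true direct effect, while the sibling-difference estimate should sit close to it, because the shared family background cancels when the two siblings are compared.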
Which phenotypes suffer most from these biases? In our Nature Genetics paper, we estimated the shrinkage from the population to sibling estimates for 25 phenotypes, to see which suffered most from these biases. We estimated this by looking at how much the associations shrank from the population estimates (without comparing within siblings) to the within-sibling estimates. The larger shrinkage in the LD-score regression plot below indicates more bias.
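One simple way to express shrinkage is as the proportional drop from the population estimate to the within-sibling estimate. The sketch below assumes that definition and uses made-up numbers; the exact estimator used in the paper may differ.

```python
def shrinkage(population_estimate, within_sibling_estimate):
    """Proportional reduction from the population to the within-sibling estimate."""
    if population_estimate == 0:
        raise ValueError("population estimate must be non-zero")
    return 1 - within_sibling_estimate / population_estimate


# Hypothetical numbers, not results from the paper.
example = shrinkage(population_estimate=0.50, within_sibling_estimate=0.30)
print(f"{example:.0%}")  # 40%
```

A value near zero would suggest little bias in the population estimate, and a value near one would suggest that most of the population association is not a direct effect.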
We found that previously reported genome-wide association study (GWAS) associations, which typically use more widely available population samples of unrelated individuals, tend to overestimate direct effects for many traits including educational attainment, cognitive ability, age when first gave birth, whether someone has ever smoked, depressive symptoms and number of children. We also found that estimates of heritability, genetic correlations and other genetic analysis methods could substantially differ when calculated using estimates from siblings.
Biases do affect genetic correlations
A major finding from our research was that these biases do affect genetic correlations. When we use sibling cohorts, the genetic correlations from LD-score regression between educational attainment and traits such as height and BMI are not detected. Note the change in power and precision in the plot below. This suggests that the correlations that are detected in population samples are unlikely to be due to a causal effect of the genetic variants in the individuals.
Are recent findings on polygenic adaptation robust to these biases? Yes, height is likely to be under polygenic selection. This suggests that selective pressures in the human population have affected the number of height-associated alleles in the population. This could lead to changes in the average height of the population over multiple generations.
Are sibling samples “better” than “population” samples?
Whether sibling samples such as those we use in our study are “better” than population studies depends on the question you want to look at. Large population-based samples of unrelated individuals are great if you want to discover new genetic variants associated with a disease or other outcomes, or are interested purely in prediction.
However, if you are interested in understanding why genetic variants are associated with an outcome like height, BMI, or education, then family studies can provide a powerful source of evidence. In this paper, we only looked at a very small number of phenotypes, but these results suggest that these biases are more likely for social/behavioural phenotypes, and more biological ones are less likely to be biased.
What’s next? The international collaboration established for this study is continuing to work together and explore these issues further. The next steps include using bigger samples of siblings and estimating the relative contribution of these sources of bias using samples of parent-offspring trios.
A massive thanks to all our co-authors – an international group of 100 scientists were involved in this study – and many, many others. Amazing being able to work with you all!
In the 1970s two randomised trials of aspirin led by Professor Peter Elwood from the MRC Epidemiology Unit, South Wales made the headlines for finding that a low dose of aspirin had beneficial effects for patients who had had a heart attack.
This was just one of many important discoveries from over 50 years of epidemiological research carried out in South Wales in MRC units, including the Epidemiology Unit initially directed by Archie Cochrane, and then by Peter Elwood.
Peter joined the unit in South Wales in 1963 and led it from 1974 until it closed in 1995. Over that time he led the conversion of the unit from one which had researched respiratory disease and other issues to one with a focus on cardiovascular disease (which had shown increasing rates since the Second World War, whilst pneumoconiosis and tuberculosis decreased). Since the unit closed in 1995 he has continued working, producing more than many people do who are still employed.
In recognition of Peter’s long and valuable career, IEU’s Professor George Davey-Smith and Professor John Gallagher (Director of the Dementia Platform UK, University of Oxford), who both worked with Peter in the unit, are organising a half-day meeting in Peter’s honour on 19th May, 14:00-17:30 BST, in Oxford and online.
The meeting will feature speakers from Bristol, Oxford and UCL including Professors Nishi Chaturvedi, Andy Ness, Sir Michael Marmot and Sir Richard Peto, as well as Peter himself, discussing important topics in epidemiology. These include the role of alcohol in cardiovascular disease; how diet influences disease risk; potential causal relationships between diabetes and dementia; health inequalities, productive research environments and aspirin.
On Wednesday 7th July 2021, I brought together key stakeholders with an interest in improving opioid substitution treatment (OST) from across the UK. This included people with lived experience, Public Health England staff, local authority public health practitioners, treatment service leads, pharmacists and academics. We discussed the findings of my recently completed PhD, and together we considered the next stages of developing an intervention to improve OST.
A summary of my research
For those not familiar with the topic, OST refers to the treatment of opioid dependency with either methadone or buprenorphine (alongside psychosocial support). Through my research, I wanted to understand what the key facilitators and barriers are to people ‘recovering’ in OST. To do this, I drew on both quantitative and qualitative methodologies. I found that loneliness, isolation and experiences of trauma and stigma were key barriers to recovery; whereas positive social support, discovering a sense of purpose and continuity of care were valuable facilitators.
Importantly, some factors appear to act as both facilitators and barriers to recovery in OST. For instance, I found that some service users used isolation as a form of self-protection (to shield themselves from negative influences), however this often left them feeling lonely and disconnected from the potential benefits offered by developing more positive social support networks.
Undoubtedly, the strongest barrier to recovery was stigma. Service users told me that they experience stigma from a range of sources, including from family and friends, healthcare professionals and members of the wider community. I found similar patterns in the literature review that I carried out (Carlisle et al, 2020). Stigma is like a stain where an individual’s entire identity is defined by a single, negative attribute. In the case of OST, individuals may possess overlapping stigmatised identities of ‘OST service user’, ‘drug user’ and ‘injecting drug user’. Some will be further stigmatised due to experiencing homelessness, being HIV or Hepatitis C positive or through involvement in sex-work.
“I found that loneliness, isolation and experiences of trauma and stigma were key barriers to recovery”
Community pharmacies are one environment where service users report experiencing a great deal of stigma. Unlike customers collecting other prescriptions, many OST service users receive their medications (methadone/buprenorphine) through an arrangement known as ‘supervised consumption’. This means they must be observed taking their medication by a pharmacist to ensure that it is not diverted to others. This is often conducted in full view of other customers, despite guidelines which recommend that this takes place in a private room or screened area. This leaves OST service users open to the scrutiny of the ‘public gaze’.
My findings have several implications in relation to stigma. Firstly, OST service users receive poorer care than other members of society in healthcare settings, which may result in them avoiding seeking help from drug treatment and for other health conditions. Secondly, stigmatising OST service users makes community re-integration extremely challenging and this has been directly linked to individuals returning to drug using networks as it is somewhere they feel a sense of belonging. The ultimate impact of being repeatedly exposed to stigma is an internalisation of these judgements, resulting in feelings of shame and worthlessness – again impacting on individuals’ ability to seek help and develop supportive new relationships with others.
What we discussed during the workshop
Being able to present these findings to key stakeholders was a real highlight of my PhD work; it’s not often that you have the ear of so many invested and engaged individuals in one ‘room’ (albeit a Zoom room!). The findings of my PhD chimed closely with the experiences of those in the room and would be further reflected the next day when Dame Carol Black’s Review of Drugs Part 2 was published, which made specific reference to stigma.
After I presented a short overview of my PhD findings, attendees spent time in small groups discussing how we might address OST stigma at each level of the socioecological system (see figure 1, above). A common thread that ran through each of the groups’ discussions was the importance of embedding interventions within trauma-informed frameworks. Attendees felt that increasing others’ understanding of the impact of trauma and ‘adverse childhood experiences’ (ACEs) may be a key mechanism by which to reduce stigma towards OST service users.
Indeed, a recent study found promising results in relation to this – that increasing the public’s awareness of the role of ACEs in substance use reduced stigmatising attitudes towards people who use drugs (Sumnall et al, 2021). Workshop attendees suggested that this outcome could be achieved through trauma-informed training of all individuals who might work with OST service users, such as pharmacists, the police and medical professionals, as well as those who work in healthcare settings, such as receptionists.
At the individual level there was a discussion about the way that stigma trickles down the socioecological system, resulting in self-stigma or internalised stigma. People felt that the best way to reduce this was to tackle stigma higher upstream first.
When thinking about reducing stigma in everyday inter-personal interactions, people highlighted the importance of using non-stigmatising language. For those who are interested (and I think we all should be!) the Scottish Drug Forum has published an excellent guide here.
Some excellent suggestions were made for reducing stigma that individuals experience in organisations such as pharmacies, hospitals and other settings. This is something that Dr Jenny Scott and I discussed in a recent article for the Pharmaceutical Journal (Scott & Carlisle, 2021). One attendee suggested the introduction of positive role-models within organisations who could be an exemplar of positive behaviour for others (a ‘stigma champion’ perhaps?). Training was identified as a key mechanism through which stigma could be reduced in organisations, including through exposure to people who use drugs (PWUD) and OST service users during training programmes. It was stressed however, that this should be carefully managed to ensure that a range of voices are presented and not just ones supporting dominant discourses around abstinence-based recovery.
Suggestions for improving community integration included increasing access to volunteering opportunities – something that people felt has been impacted by reduced funding to recovery services in recent years. It was also suggested that community and faith leaders could be a potential target for education around reducing stigma and understanding the impact of trauma, as these individuals may be best placed to have conversations about stigma with members of their communities.
Finally, there were some thoughtful discussions around the best way to influence policy to reduce stigma. The importance of showing policymakers the evidence-base from previous successful strategies was highlighted. Something that resulted in a lively debate was the issue of supervised consumption with arguments both for and against (this is also relevant at the organisational level). The group summarised that whilst diversion of medications was a risk for some, a blanket approach to supervised consumption and/or daily collections exposes individuals to stigma in the pharmacy, which leaves individuals vulnerable to dropping out of treatment. It was pointed out that supervised consumption policies were quickly relaxed at the start of Covid-19 restrictions – something that appears to have been done safely and with benefits to service users. It was also highlighted that supervised consumption in OST is inherently stigmatising, as users of other addictive drugs with overdose potential, such as other prescribed opioids and benzodiazepines, are not subjected to the same regulations. This sends a clear message to OST service users that they cannot be trusted. Other key suggestions were:
Communicating with CQCs and Royal Colleges, who may be particularly interested in understanding how people are treated in their services.
Drawing on existing stigma policies from other arenas e.g. mental health.
Highlighting the fiscal benefits of reducing stigma to key decision makers.
Tapping into plans for the new Police and Crime Commissioners, who have a trauma sub-group.
Linking into work with ADDER areas, which may provide the evidence for ‘what works’.
I am now planning to apply for further funding to develop an intervention to reduce organisational stigma towards OST service users. The involvement of service users and other key stakeholders will be crucial in every step of that process, so I will be putting together a steering group as well as seeking out collaborations with academics internationally that have expertise and an interest in this area. I was really pleased to see that Dame Carol Black’s second report makes some concrete recommendations around reducing stigma towards people who use drugs. I hope therefore to be able to work with the current momentum to make OST safer and more attractive to those whose lives depend on it.
I’d like to extend my gratitude to all of the attendees at the workshop and to Bristol’s Drug and Alcohol Health Integration Team (HIT) for supporting this event. If you are an individual with lived experience of OST, an academic, or any other stakeholder working in this area and would like to be involved with future developments, please get in touch with me at firstname.lastname@example.org or find me on Twitter at @Vic_Carlisle.
Carlisle, V., Maynard, O., Padmanathan, P., Hickman, M., Thomas, K. H., & Kesten, J. (2020, September 7). Factors influencing recovery in opioid substitution treatment: a systematic review and thematic synthesis. https://doi.org/10.31234/osf.io/f6c3p
Sumnall, H. R., Hamilton, I., Atkinson, A. M., Montgomery, C., & Gage, S. H. (2021). Representation of adverse childhood experiences is associated with lower public stigma towards people who use drugs: an exploratory experimental study. Drugs: Education, Prevention and Policy, 28(3), 227-239. https://doi.org/10.1080/09687637.2020.1820450
This blog was originally posted on the TARG blog on the 1 October 2021.
Drs Luisa Zuccolo and Cheryl McQuire, Department of Population Health Sciences, Bristol Medical School, University of Bristol.
Soon after the World Health Organisation (WHO) declared COVID-19 a pandemic on March 11th 2020, the UN declared the start of an infodemic, highlighting the danger posed by the fast spreading of unchecked misinformation. Defined as an overabundance of information, including deliberate efforts to disseminate incorrect information, the COVID-19 infodemic has exacerbated public mistrust and jeopardised public health.
Social media platforms remain a leading contributor to the rapid spread of COVID-19 misinformation. Despite urgent calls from the WHO to combat this, public health responses have been severely limited. In this project, we took steps to begin to understand and address this problem.
We believe that it is imperative that public health researchers evolve and develop the skills and collaborations necessary to combat misinformation in the social media landscape. For this reason, in Autumn 2020 we extended our interest in public health messaging, usually around promoting healthy behaviours during pregnancy, to study COVID-19 misinformation on social media.
We wanted to know:
What is the nature, extent and reach of misinformation about face masks on Twitter during the COVID-19 pandemic?
To answer this question we aimed to:
Upskill public health researchers in the data capture and analysis methods required for social media data research;
Work collaboratively with Research IT and Research Software Engineer colleagues to conduct a pilot study harnessing social media data to explore misinformation.
Dr Cheryl McQuire got the project funded and off the ground. Dr Luisa Zuccolo led it through to completion. Dr Maria Sobczyk checked the data and analysed our preliminary data. Research IT colleagues, led by Mr Mike Jones, helped to develop the search strategy and built a data pipeline to retrieve and store Twitter data using customised application programming interfaces (APIs) accessed through an academic Twitter account. Research Software Engineering colleagues, led by Dr Christopher Woods, provided consultancy services and advised on the analysis plan and technical execution of the project.
Cheryl McQuire, Luisa Zuccolo, Maria Sobczyk, Mike Jones, Christopher Woods. (Left to Right)
Too much information?!
Initial testing of the Twitter API showed that keywords, such as ‘mask’ and ‘masks’, returned an unmanageable amount of data, and our queries would often crash due to an overload of Twitter servers (503-type errors). To address this, we sought to reduce the number of results, while maintaining a broad coverage of the first year of the pandemic (March 2020-April 2021).
I) Searched for hashtags rather than keywords, restricting to English language.
II) Requested original tweets only, omitting replies and retweets.
III) Broke each month down into its individual days in our search queries to minimise the risk of overload.
IV) Developed Python scripts to query the Twitter API and process the results into a series of CSV files containing anonymised tweets, metadata and metrics about the tweets (no. of likes, retweets etc.), and details and metrics about the author (no. of followers etc.).
V) Merged data into a single CSV file with all the tweets for each calendar month after removing duplicates.
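Steps III and V above can be sketched in a few lines of Python. This is an illustrative sketch only, not the project's actual scripts: the function names and the `id` field are hypothetical, and the call to the Twitter API itself is deliberately left out.

```python
from datetime import date, timedelta


def daily_windows(year, month):
    """Return (start, end) date pairs, one per day of a calendar month."""
    day = date(year, month, 1)
    windows = []
    while day.month == month:
        windows.append((day, day + timedelta(days=1)))
        day += timedelta(days=1)
    return windows


def merge_unique(batches, key="id"):
    """Merge daily batches of tweet records, keeping the first copy of each."""
    seen = set()
    merged = []
    for batch in batches:
        for row in batch:
            if row[key] not in seen:
                seen.add(row[key])
                merged.append(row)
    return merged


# Each window would be passed to one (smaller) search query.
march_2020 = daily_windows(2020, 3)

# Hypothetical daily results containing one duplicate tweet.
monthly = merge_unique([
    [{"id": "1", "text": "#masks"}],
    [{"id": "1", "text": "#masks"}, {"id": "2", "text": "#facemask"}],
])
```

Querying one day at a time keeps each request small, and removing duplicates on a stable identifier before writing the monthly file avoids counting the same tweet twice.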
What did we find?
Our search strategy delivered over three million tweets. Just under half of these were filtered out by removing commercial URLs and undesired keywords; the remaining 1.7m tweets by ~700k users were analysed using standard and customized R scripts.
First, we used unsupervised methods to describe any and all Twitter activity picked up by our broad searches (whether classified as misinformation or not). The timeline of this activity revealed clear peaks around the UK-enforced mask mandates in June and September 2020.
We further described the entire corpus of tweets on face masks by mapping the network of its most common bigrams and performing sentiment analysis.
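For readers unfamiliar with bigrams (pairs of adjacent words), the counting step can be sketched as follows. Our analysis used R scripts; this Python version is only an illustration, and the tokenisation rule and example tweets are assumptions.

```python
import re
from collections import Counter


def bigrams(text):
    """Split text into lower-case words and return adjacent word pairs."""
    words = re.findall(r"[a-z']+", text.lower())
    return list(zip(words, words[1:]))


def top_bigrams(tweets, n=10):
    """Count bigrams across all tweets and return the n most common."""
    counts = Counter()
    for tweet in tweets:
        counts.update(bigrams(tweet))
    return counts.most_common(n)


# Made-up example tweets.
examples = ["Wear a face mask", "face mask rules", "No face covering"]
print(top_bigrams(examples, n=1))  # [(('face', 'mask'), 2)]
```

The most frequent pairs can then be drawn as a network, with words as nodes and bigram counts as edge weights.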
We then quantified the nature and extent of misinformation through topic modelling, and used simple counts of likes to estimate the reach of misinformation. We used semi-supervised methods including manual keyword searches to look for established types of misinformation such as face masks restricting oxygen supply. These revealed that the risk of bacterial/fungal infection was the most common type of misinformation, followed by restriction of oxygen supply, although the extent of misinformation on the risks of infection decreased as the pandemic unfolded.
Extent of misinformation (no. of tweets), according to its nature: 1- gas exchange/oxygen deprivation, 2- risk of bacterial/fungal infection, 3- ineffectiveness in reducing transmission, 4- poor learning outcomes in schools.
Relative to the volume of tweets including the hashtags relevant to face masks (~1.7m), our searches uncovered less than 3.5% unique tweets containing one of the four types of misinformation against mask usage.
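The manual keyword search and the like-based measure of reach described above could look something like the sketch below. The keyword lists are hypothetical placeholders, not the search terms we actually used, and the record layout (`text`, `likes`) is assumed.

```python
from collections import Counter

# Hypothetical keyword lists for illustration only; the study's actual
# search terms are not reproduced here.
TOPIC_KEYWORDS = {
    "oxygen deprivation": ["oxygen", "hypoxia"],
    "bacterial/fungal infection": ["bacteria", "fungal"],
    "ineffective against transmission": ["masks don't work", "masks do not work"],
    "poor learning outcomes": ["learning"],
}


def classify(tweet_text):
    """Return the misinformation topics whose keywords appear in the text."""
    text = tweet_text.lower()
    return [topic for topic, keywords in TOPIC_KEYWORDS.items()
            if any(keyword in text for keyword in keywords)]


def summarise(tweets):
    """tweets: iterable of dicts with 'text' and 'likes' keys.

    Returns (extent, reach): tweet counts and summed likes per topic.
    """
    extent, reach = Counter(), Counter()
    for tweet in tweets:
        for topic in classify(tweet["text"]):
            extent[topic] += 1
            reach[topic] += tweet["likes"]
    return extent, reach
```

Simple substring matching like this is quick but crude: it will pick up tweets that debunk a claim as well as tweets that make it, which is one reason such searches can return spurious topics.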
A summary of the nature, extent and reach of misinformation on face masks on Twitter – results from manual keywords search (semi-supervised topic modelling).
A more in-depth analysis of the results attributed to the 4 main misinformation topics by the semi-supervised method revealed a number of potentially spurious topics. Refinements of these methods including iterative fine-tuning were beyond the scope of this pilot analysis.
Our initial exploration of Twitter data for public health messaging also revealed common pitfalls of mining Twitter data, including the need for a selective search strategy when using academic Twitter accounts, hashtag ‘hijacking’ meaning most tweets were irrelevant, imperfect Twitter language filters and ads often exploiting user mentions.
We hope to secure further funding to follow up this pilot project. By expanding our collaboration network, we aim to improve the way we tackle misinformation in the public health domain, ultimately increasing the impact of this work. If you’re interested in health messaging, misinformation and social media, we would love to hear from you – @Luisa_Zu and @cheryl_mcquire.
In two-sample Mendelian randomization (MR), a type of epidemiological method, we combine the results from different genetic studies to study the causal relationship between human characteristics and disease. For example, we might take results from a genetic study of smoking and a different genetic study of cancer. We can combine their results to understand whether smoking might be a cause of cancer. If the same position in the genome is associated with smoking in the first study and with cancer in the other study, this can provide evidence that smoking is a causal factor in cancer. However, it’s also possible that this position in the genome could be related to smoking and cancer via separate pathways. This phenomenon is known as “horizontal pleiotropy” and is a common source of bias in Mendelian randomization research.
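For a single genetic variant, one common way of combining the two sets of results is the Wald ratio: the variant-outcome association divided by the variant-exposure association. The sketch below uses made-up numbers and a first-order approximation of the standard error that ignores uncertainty in the exposure estimate.

```python
def wald_ratio(beta_exposure, beta_outcome, se_outcome):
    """Causal estimate from one variant, with an approximate standard error."""
    if beta_exposure == 0:
        raise ValueError("the variant must be associated with the exposure")
    estimate = beta_outcome / beta_exposure
    standard_error = se_outcome / abs(beta_exposure)
    return estimate, standard_error


# Hypothetical effect sizes, for illustration only.
estimate, standard_error = wald_ratio(
    beta_exposure=0.1,   # variant effect on smoking (study 1)
    beta_outcome=0.05,   # variant effect on cancer (study 2)
    se_outcome=0.01,
)
```

Because the estimate is a ratio of two reported effects, its sign depends entirely on both studies describing the same allele, which is why errors in how the results are labelled matter so much.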
Another, often under-appreciated, source of bias is errors in metadata. To understand this we need to understand what genetic results look like in practice. Below is an example of a genetic results file with 5 rows and 6 columns (a typical file might actually have several million rows).
Each row refers to a single position in the human genome that varies between people. These positions are referred to as “genetic variants” (also known as polymorphisms). The particular type of variant that an individual carries is known as their allele.
Below is an example of metadata. The metadata helps us understand the contents of the results file. It tells us what the columns represent.
Some columns in the results file will describe the relationship (“effect” column) between the genetic variant and some human characteristic (e.g. smoking) and there will be additional columns that help researchers interpret this relationship. These additional columns include things like the identity of the allele that is used to model the relationship (e.g. if people have allele “A” they may be more likely to smoke compared to people without this allele) or information on how common the allele is in the population. These columns are also known as the “effect allele” and “effect allele frequency” columns. Metadata errors refer to mistakes in how these columns are reported. For example, maybe allele1 is reported as the effect allele column when in fact it should have been allele2 that is described in this way. Sometimes the information provided in metadata is ambiguous. For example, the metadata tells us that the “freq” column represents allele frequency but there are two alleles. Is this the frequency of allele1 or allele2? We can’t be sure. Another type of error refers to mistakes in the reported results, for example reporting that a genetic variant increases the probability that a person smokes when in reality it has no effect (in other words the effect is zero). This is known as a summary data error. Failure to identify these errors can lead to mistakes in Mendelian randomization analyses, such as finding that smoking protects against cancer (when we know the opposite is true).
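A short sketch shows why a mislabelled effect allele matters. Expressing a result relative to the other allele flips the sign of the effect and turns the frequency into one minus the frequency, so a wrong label silently reverses the direction of the relationship. The function name and numbers are hypothetical.

```python
def align_to_allele(effect, effect_allele, other_allele, freq, target_allele):
    """Re-express an effect and allele frequency relative to target_allele."""
    if target_allele == effect_allele:
        return effect, freq
    if target_allele == other_allele:
        # Same variant described from the other allele's point of view.
        return -effect, 1 - freq
    raise ValueError("target allele is not one of the two reported alleles")


# Correct metadata: allele A increases smoking, and 30% of alleles are A.
as_reported = align_to_allele(0.1, "A", "G", 0.3, target_allele="A")

# The same result read relative to allele G: the sign and frequency change.
relative_to_g = align_to_allele(0.1, "A", "G", 0.3, target_allele="G")
```

If the metadata names the wrong column as the effect allele, an analysis ends up using the second version while believing it has the first.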
As research complexity increases, so does the potential for errors
These types of errors were fairly easy to avoid during the early years of Mendelian randomization research, when studies tended to be hypothesis-driven and focused on small numbers of relationships (although errors still occurred). Mendelian randomization study designs are, however, increasingly complex and hypothesis-free, sometimes assessing relationships amongst 100s or even 1000s of characteristics and diseases. New online platforms and databases that collate genetic results from many different sources, and provide tools that can automate analyses, make these studies easier to undertake than ever before. The downside is that they probably make meta and summary data errors more likely.
Maximising metadata quality to reduce errors
We address this issue in a new pre-print: “Design and quality control of large scale Mendelian randomization studies”. We present an R package and set of quality control tools that identify meta and summary data errors, which we developed for the Fatty Acids in Cancer Mendelian Randomization Collaboration (FAMRC). The FAMRC is a pan-cancer MR study that seeks to evaluate the causal relevance of fatty acids for risk of major cancers. We wanted to maximise the quality of the genetic study results we collected from the cancer studies, to ensure the integrity of our Mendelian randomization analyses. After implementing our tools, we found major meta and summary data errors in 7 (13%) of 55 genetic studies in the FAMRC.
What types of metadata errors did we find?
The basic principle of our quality control approach is to identify errors through:
comparison of the results of individual studies in the FAMRC to external studies;
comparison of reported to expected results.
For example, we identified genetic variants that are known to cause cancer and checked that the same variants had the expected relationship in the FAMRC. In the figure below, every data point represents a single genetic variant that is known to increase cancer risk. The horizontal or X axis shows the known relationship in the GWAS catalog (this is a database of known genetic associations with 1000s of human characteristics in 1000s of genetic studies) and the vertical or Y axis shows the relationship in one of the studies in the FAMRC. Each axis shows the Z score, which is basically a standardised measure of how each genetic variant affects cancer risk (positive values mean that the variant increases risk of cancer and negative values indicate they decrease risk). As you can see, in the FAMRC study on the vertical Y axis, almost all the variants have negative Z values (indicating they reduce cancer risk), when in fact they are known to increase risk (the true relationship is represented by Z scores in the GWAS catalog). This discrepancy was caused by a metadata error, where the effect allele column was incorrectly labelled. We also found that the “frequency of the effect allele” was wrong. How common the allele is in the population was opposite to what we’d expect, based on comparison with other studies, confirming the presence of metadata errors.
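The idea behind this check can be sketched as a sign-concordance calculation: for variants present in both sources, what share have Z scores pointing in the same direction? This is an illustration of the principle rather than the CheckSumStats implementation, and the variant identifiers and Z scores are invented.

```python
def sign_concordance(reference_z, study_z):
    """Share of shared variants whose Z scores have the same sign.

    Both arguments map variant identifiers to Z scores.
    """
    shared = reference_z.keys() & study_z.keys()
    if not shared:
        raise ValueError("no variants in common")
    agree = sum(1 for variant in shared
                if reference_z[variant] * study_z[variant] > 0)
    return agree / len(shared)


# Invented Z scores for known risk-increasing variants.
reference = {"rs1": 6.2, "rs2": 5.8, "rs3": 7.1, "rs4": 5.5}
study = {"rs1": -4.0, "rs2": -3.1, "rs3": -5.2, "rs4": 0.4}

concordance = sign_concordance(reference, study)
print(concordance)  # 0.25
```

A concordance well below one half for established risk variants, as in this example, would be a strong hint that the effect allele has been mislabelled.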
Various other types of errors were identified, including one study reporting that 100s of genetic variants had very strong effects on fatty acid levels when in fact they had no effect at all. For example, in the figure below, the many red data points refer to genetic variants in the FAMRC that had a very large effect on fatty acids but were not reported in the GWAS catalog, suggesting a potential problem with the genetic results.
We also compared the reported results (how the genetic variants affected fatty acids in the FAMRC) to predicted results (how we would expect the genetic variants to affect fatty acids). In the figure below we see a “fanning-out” pattern, when what we should see is a strong linear relationship (i.e. the data points lying on a single straight line). This relationship can be summarised with the “slope” metric. We should see a slope of 1 (this means if the reported result increases by 1 the predicted result will also increase by 1), which is not the case. We confirmed with the data provider that low quality genetic variants had not been excluded from their study. Once the low quality variants had been excluded, the discrepancies disappeared.
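The slope metric described above is just the least-squares slope from regressing predicted results on reported results. The sketch below takes both as plain lists; how the predicted effects are derived is not shown, and the numbers are made up.

```python
def regression_slope(reported, predicted):
    """Least-squares slope of predicted effects on reported effects."""
    if len(reported) != len(predicted) or len(reported) < 2:
        raise ValueError("need two equally long lists with at least 2 values")
    mean_r = sum(reported) / len(reported)
    mean_p = sum(predicted) / len(predicted)
    cov = sum((r - mean_r) * (p - mean_p)
              for r, p in zip(reported, predicted))
    var = sum((r - mean_r) ** 2 for r in reported)
    return cov / var


# Made-up effects: a well-behaved study and one whose effects are inflated.
reported_ok = [-0.2, -0.1, 0.0, 0.1, 0.2]
predicted_ok = [-0.2, -0.1, 0.0, 0.1, 0.2]
reported_inflated = [-0.4, -0.2, 0.0, 0.2, 0.4]

slope_ok = regression_slope(reported_ok, predicted_ok)
slope_inflated = regression_slope(reported_inflated, predicted_ok)
```

A slope close to 1 is reassuring; a slope far from 1, or a scatter that fans out rather than following a line, suggests that something in the reported results needs checking with the data provider.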
Avoiding metadata errors: recommendations for researchers
When conducting Mendelian randomization analyses using results from genetic studies, researchers can avoid metadata and other errors by:
Requesting results for genetic variants that are known to affect their disease of interest. Researchers should check that these variants have the expected effect in their dataset.
Comparing the frequency of genetic variants to expected frequencies in a reference dataset. We created a special reference dataset that can be used for this purpose (accessible via the CheckSumStats R package).
Not assuming that results have had low quality variants excluded, but instead seeking confirmation of this with data providers. Our quality control tools also provide a way to check this.
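The second recommendation, comparing allele frequencies with a reference dataset, might be sketched like this. It is not the CheckSumStats interface; the function name, the tolerance of 0.1 and the example frequencies are all assumptions.

```python
def frequency_check(study_freq, reference_freq, tolerance=0.1):
    """Classify one variant as 'ok', 'possibly flipped' or 'conflict'."""
    if abs(study_freq - reference_freq) <= tolerance:
        return "ok"
    # A frequency that matches only after taking 1 - freq suggests the
    # frequency column describes the other allele.
    if abs((1 - study_freq) - reference_freq) <= tolerance:
        return "possibly flipped"
    return "conflict"


# Made-up effect allele frequencies: (study, reference).
checks = [
    frequency_check(0.20, 0.25),
    frequency_check(0.80, 0.25),
    frequency_check(0.50, 0.90),
]
print(checks)  # ['ok', 'possibly flipped', 'conflict']
```

One caveat is that variants with frequencies near 0.5 cannot reveal a flipped allele this way, so the check is most informative for variants whose effect allele is clearly rare or clearly common.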
Further attention is needed to address the growing diversity of GWAS
One issue we only partly addressed was the “two-sample assumption”: that the studies being compared come from the same population. In our own analyses, we found that the frequency of genetic variants was very similar across European-origin studies, indicating satisfaction of the assumption. On the other hand, our tools were not really optimised for this purpose. The need to assess the “same population” assumption is becoming more urgent with the growing diversity of genetic studies.
In conclusion, metadata and summary data errors are an under-appreciated source of bias in MR research, especially in complex study designs. We developed an R package and set of tools that can be used to flag metadata and summary data errors in the results of genetic studies, which in turn can be used to enhance the integrity of Mendelian randomization analyses. Our tools and methods are available to other researchers via the CheckSumStats R package.
Design and quality control of large-scale two-sample Mendelian randomisation studies
There’s a widespread belief that your testosterone can affect where you end up in life. At least for men, there is some evidence for this claim: several studies have linked higher testosterone to socioeconomic success. But a link is not the same as a cause, and our new research, which uses DNA, suggests testosterone may be much less important for life chances than previously claimed.
In previous studies, male executives with higher testosterone have been found to have more subordinates, and financial traders with higher testosterone found to generate greater daily profits. Testosterone has been found to be higher among more highly educated men, and among self-employed men, suggesting a link with entrepreneurship. Much less is known about these relationships in women, but one study suggested that for women, disadvantaged socioeconomic position in childhood was linked to higher testosterone later in life.
The beneficial influence of testosterone is thought to work by affecting behaviour: experiments suggest that testosterone can make a person more aggressive and more risk tolerant, and these traits can be rewarded in the labour market, for instance in wage negotiations. But none of these studies show definitively that testosterone influences these outcomes because there are other plausible explanations.
Rather than testosterone influencing a person’s socioeconomic position, it could be that having a more advantaged socioeconomic position raises your testosterone. In both cases, we would see a link between testosterone and social factors such as income, education and social class.
There are plausible mechanisms for this too. First, we know that socioeconomic disadvantage is stressful, and chronic stress can lower testosterone. Second, how a person perceives their status relative to others in society might influence their testosterone: studies of sports matches, usually between men, have often found that testosterone rises in the winner compared to the loser.
It’s also possible that some third factor is responsible for the associations seen in previous studies. For instance, higher testosterone in men is linked to good health – and good health may also help people succeed in their careers. A link in men between testosterone and socioeconomic position could therefore simply reflect an impact of health on both. (For women, higher testosterone is linked to worse health, so we would expect an association of higher testosterone and lower socioeconomic position.)
Look at it this way
It is very difficult to pick apart these processes and study just the effects of testosterone on other things. With this goal in mind, we applied a causal inference approach called “Mendelian randomisation”. This uses genetic information relevant to a single factor (here, testosterone) to isolate just the effect of that factor on one or more outcomes of interest (here, socioeconomic outcomes such as income and educational qualifications).
A person’s circulating testosterone can be affected by environmental factors. Some, like the time of day, are straightforward to correct for. Others, like somebody’s health, are not. Crucially, socioeconomic circumstances could influence circulating testosterone. For this reason, even if we see an association between circulating testosterone and socioeconomic position, we cannot determine what is causing what.
This is why genetic information is powerful: your DNA is determined before birth and generally does not change during your lifetime (there are rare exceptions, such as changes which occur with cancer). Therefore, if we observe an association of socioeconomic position with genetic variants linked to testosterone, it strongly suggests that testosterone is causing the differences in socioeconomic outcomes. This is because other factors are much less likely to influence the variants.
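For readers who like to see the arithmetic, here is a minimal sketch in Python of two textbook Mendelian randomisation estimators: the Wald ratio for a single variant, and a fixed-effect inverse-variance weighted (IVW) average across several variants. The numbers are made up and the function names are hypothetical; this is a generic illustration, not the analysis code from our study.

```python
import math


def wald_ratio(beta_exposure, beta_outcome):
    """Causal estimate from one variant: the variant's effect on the
    outcome divided by its effect on the exposure."""
    return beta_outcome / beta_exposure


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW estimate: a weighted mean of the per-variant
    Wald ratios, using first-order weights beta_exposure^2 / se_outcome^2."""
    weights = [bx ** 2 / se ** 2 for bx, se in zip(beta_exposure, se_outcome)]
    ratios = [wald_ratio(bx, by) for bx, by in zip(beta_exposure, beta_outcome)]
    total_weight = sum(weights)
    estimate = sum(w * r for w, r in zip(weights, ratios)) / total_weight
    standard_error = math.sqrt(1 / total_weight)
    return estimate, standard_error


# Hypothetical per-variant associations (not real data).
beta_testosterone = [0.10, 0.20, 0.15]  # variant effect on testosterone
beta_income = [0.02, 0.04, 0.03]        # variant effect on the outcome
se_income = [0.01, 0.01, 0.01]          # standard error of the outcome effect

estimate, se = ivw_estimate(beta_testosterone, beta_income, se_income)
print(f"IVW estimate: {estimate:.3f} (standard error {se:.3f})")
```

In this made-up example every variant gives the same ratio, so the combined estimate equals that ratio. With real data the ratios differ, and the weighting gives more influence to variants that are more strongly related to the exposure and more precisely measured.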
In more than 300,000 adult participants of the UK Biobank, we identified genetic variants linked to higher testosterone levels, separately for men and women. We then explored how these variants were related to socioeconomic outcomes, including income, educational qualifications, employment status, and area-level deprivation, as well as self-reported risk-taking and overall health.
Similar to previous studies, we found that men with higher testosterone had higher household income, lived in less deprived areas, and were more likely to have a university degree and a skilled job. In women, higher testosterone was linked to lower socioeconomic position, including lower household income, living in a more deprived area, and lower chance of having a university degree. Consistent with previous evidence, higher testosterone was associated with better health for men and poorer health for women, and more risk-taking for men.
However, there was little evidence that genetic variation related to testosterone affected socioeconomic position at all. In both men and women we detected no effects of genetic variants related to testosterone on any aspect of socioeconomic position, or health, or risk-taking.
Because we identified fewer testosterone-linked genetic variants in women, our estimates for women were less precise than for men. Consequently, we could not rule out relatively small effects of testosterone on socioeconomic position for women. Future studies could examine associations in women using larger, female-specific samples.
But for men, our genetic results clearly suggest that previous studies may have been biased by the influence of additional factors, potentially including the impact of socioeconomic position on testosterone. And our results indicate that – despite the social mythology surrounding testosterone – it may be much less important for success and life chances than earlier studies have suggested.
‘Enhancing the utilization of COVID-19 testing in schools’ is a study that will look at the characteristics of long COVID and COVID-19 infection in children. ‘Long COVID’ is commonly used to describe signs and symptoms that continue or develop after acute COVID‑19. The study is being funded as a result of a rapid funding call by Health Data Research UK (HDR UK), the Office for National Statistics (ONS) and UK Research and Innovation (UKRI). The study forms part of the larger Data and Connectivity National Core Study, which is led by HDR UK in partnership with ONS.
The COVID-19 testing in schools study is related to the CoMMinS (COVID-19 Mapping and Mitigation in Schools) study being undertaken by the University of Bristol in partnership with Bristol City Council, Public Health England [PHE] and Bristol schools. CoMMinS aims to give us an understanding of COVID-19 infection dynamics centred around school pupils and staff and onward transmission to family contacts, using regular testing. Our study will jointly analyse data from CoMMinS, along with information from Electronic Patient Records, and data from the COVID-19 Schools Infection Survey (SIS; jointly led by the London School of Hygiene & Tropical Medicine [LSHTM], PHE, and ONS). The SIS is a study similar to CoMMinS but carried out nationally.
To help inform research questions and methods for the study, members from the University of Bristol study team gathered views about long COVID in children between 9 March and 30 April 2021 from:
seven young people from the NIHR Bristol Biomedical Research Centre Young People’s Advisory Group (YPAG)
five families whose children have long COVID or suspected long COVID, recruited through two online UK campaign groups for long COVID, and
a survey completed by four GPs and one paediatrician, and an online meeting with two paediatricians.
It is important to note that the opinions gathered were based on small samples which may not be representative.
Through the meeting and survey with the doctors, the study team found that clinical understanding of long COVID in children is currently very limited.
The doctors said that it may be hard to distinguish between long COVID and other conditions with similar symptoms. Many of the symptoms of long COVID, like fatigue and feeling sick, aren’t very specific, and are common to many different conditions. Long COVID in children currently lacks a clinical definition, making diagnosis difficult. It isn’t yet properly understood whether long COVID is a new condition in itself, or a group of conditions like post viral fatigue, which is already recognised.
Young people, and families of children with long COVID or suspected long COVID, who were also asked for their opinion, said that feeling sick or stomach pain, extreme tiredness, and headaches were the symptoms they would rank as most ‘harmful’. For young people, this was based on them imagining having the symptoms. For the families, this was based on their first-hand experience.
The families also said that the symptoms their children were experiencing were numerous, often very severe, and more wide-ranging than those currently listed on the NHS website for long COVID. It is not yet clear what is causing the unusual symptoms.
The families said that they had struggled to get a diagnosis and treatment for their children. They also said that long COVID symptoms were having a significant impact on their children’s day-to-day lives both physically and psychologically, and that some of the children had missed school because of the symptoms. Some of the families also found fevers difficult to manage because their children had to miss school to self-isolate every time they had a fever. They wanted to know why the set of symptoms were being experienced, and why their children in particular had developed them.
It is not known how many children have or will develop long COVID. So far, studies which have tried to measure the rate of long COVID in children suggest it is rare. However, quantifying the number of cases is made difficult by a lack of clinical understanding of long COVID including the lack of an agreed clinical definition. The opinions collected suggest that relying on clinical diagnoses alone will under-estimate cases. On the other hand, there needs to be a cautious approach to estimating the number of cases based on non-specific symptoms, as other conditions which cause similar symptoms may be counted as well.
Caroline Relton, Professor of Epigenetic Epidemiology and Director of the Bristol Population Health Science Institute at the University of Bristol, joint lead for CoMMinS and one of the lead authors of the report, said: “The opinions we gathered further highlight that it is difficult to count the number of children with long COVID on the basis of diagnoses alone while long COVID in children remains poorly defined.
“There are added complications of studying long COVID in children, when it is sometimes difficult to disentangle what might be the result of experiencing infection from what might result from the wider impact of experiencing the pandemic. Isolation, school closures, disrupted education and other influences on family life could all have health consequences. Defining the extent of the problem in children and the root causes will be essential to helping provide the right treatment and to aid the recovery of young people who are suffering.”
The findings highlight that examining GP and hospital visits, and school attendance, might currently be a more useful and feasible way of assessing how COVID-19 has affected children, rather than relying only on diagnoses of long COVID. However, the study researchers also need to be aware of how often healthcare is accessed according to need, and of absence from school due to self-isolation, both of which will affect what is being measured.
Feeling sick or stomach pain, extreme tiredness, and headaches will be important symptoms to consider in the study.
1 Population Health Sciences, Bristol Medical School, University of Bristol, Bristol, BS8 2BN
2 Medical Research Council (MRC-IEU), University of Bristol, Bristol, BS8 2BN
3 Cancer Research UK (CRUK) Integrative Cancer Epidemiology Programme (ICEP), University of Bristol, Bristol, BS8 2BN
The causes of cancer are often preventable
Cancer, a disease that has a profound impact on the lives of individuals all over the world, also has an ever-increasing burden. And yet, evidence indicates that over 40% of all cancers are likely explained by preventable causes. One of the main challenges is identifying so-called ‘modifiable risk factors’ for cancer – aspects of our environment that we can change to reduce the incidence of disease.
The gut microbiome could influence cancer risk
The gut microbiome is a system of microorganisms that helps us digest food, produce essential molecules and protects us against harmful infections. There is growing evidence supporting the relationship between the human gut microbiome and risk of cancer, including lung, breast, bowel and prostate cancers. For example, experiments have shown that changing the gut microbiome (e.g., by using pre- or pro-biotics) may reduce the risk of developing colorectal cancer. Research also suggests that people with colorectal cancer have lower microbiota diversity and different types of bacteria within their gut compared to those without a diagnosis.
As the gut microbiome can have a substantial impact on their host’s metabolism and immune response, there are many biological mechanisms by which the gut microbiome could influence cancer development and progression. However, we don’t yet know how the gut microbiome can do this.
Human studies in this context have used small samples of individuals and measured both the microbiome and disease at the same time. These factors can make it difficult to tease apart correlation from causation – i.e., does variation in the gut microbiome change someone’s risk of cancer or is it the existence of cancer that leads to variation in the gut microbiome? This is an important question because the main aim of such research is to understand the causes of cancer and how we can prevent the disease. We want to fully understand whether altering the gut microbiome can reduce the burden of cancer at a population level or whether it is simply a marker of cancer itself.
I’m inviting feedback on your knowledge and understanding of the gut microbiome and cancer – please take this 5-minute survey to contribute your thoughts.
People are interested in their gut microbiome
Even though we don’t yet know much about the causal relevance of the gut microbiome, there is still a growing market for commercial initiatives targeting the microbiome as a consumer-driven intervention. This usually involves companies obtaining a small number of faecal samples from consumers and prescribing “personalised” nutritional information for a “healthier microbiome”. However, these initiatives are very controversial given uncertainty in the likely relationships between the gut microbiome, nutrition and various diseases. What these activities do highlight is the demand for such information at a population level. This shows there is an opportunity to improve understanding of the causal role played by the gut microbiome in human health and disease.
Microbiome and variation in our genes
Using information about our genetics can help us find out whether the gut microbiome changes the risk of cancer, or whether cancer changes the gut microbiome. Genetic variation cannot be influenced by either the gut microbiome or disease. Therefore, if people who are genetically predisposed to having a higher abundance of certain bacteria within their gut also have a lower risk of, say, prostate cancer, this would strongly suggest a causal role of those bacteria in prostate cancer development. This approach of using human genetic information to discern correlation from causation is called Mendelian randomization.
Studies relating human genetic variation with the gut microbiome have proliferated in recent years. They have provided evidence for genetic contributions to features of the gut microbiome including the abundance or likelihood of presence (vs. absence) of specific bacteria. This knowledge has given the opportunity to apply Mendelian randomization to better understand the causal impact of gut microbiome variation in health outcomes, including cancer. There are, however, many important caveats and complications to this work. Specifically, there is a (currently unmet) requirement for careful examination of how human genetic variation influences the gut microbiome and interpretation of the causal estimates derived from using Mendelian randomization within this field.
This research has already shown promise in the application of Mendelian randomization to improve our ability to discern correlation from causation between the gut microbiome and cancer. It has importantly highlighted the need for inter-disciplinary collaboration between population health, genetic and basic sciences. Thus, with the support from my team of experts in microbiology, basic sciences and population health sciences, this Fellowship will set the scene for the integration of human genetics and causal inference methods in population health sciences with microbiome research. This will help us understand the causal role played by the gut microbiome in cancer. Such work acts as a new and important step towards evaluating and prioritising potential treatments or protective factors for cancer prevention.
The research conducted as part of this Cancer Research UK Population Research Postdoctoral Fellowship will be supported by the following collaborators: Nicholas Timpson, Caroline Relton, Jeroen Raes, Trevor Lawley, Lindsay Hall and Marc Gunter, and my growing team of interdisciplinary PhD students and postdoctoral researchers. I’d also like to thank the following individuals for comments on this feature: Tom Battram, Laura Corbin, David Hughes, Nicholas Timpson, Lindsey Pike and Philippa Gardom. Additional thanks go to Chloe Russell, a brilliant photographer with whom I collaborated to create “Up Your A-Z” as part of Creative Reactions 2019, who provided the photos for this webpage.
About the author
Dr. Wade’s academic career has focused on the integration of human genetics with population health sciences to improve causal inference within epidemiological studies. Focusing on relationships across the life-course, her work uses comprehensive longitudinal cohorts, randomized controlled trials and causal inference methods (particularly, Mendelian randomization and recall-by-genotype designs). Kaitlin’s research has focused on understanding the relationships between adiposity and dietary behaviours as risk factors for cardiometabolic diseases and mortality. Having been awarded funding from the Elizabeth Blackwell Institute and Cancer Research UK, Kaitlin’s work uses these methods to understand the causal role played by the human gut microbiome on various health outcomes, such as obesity and cancer. Since pursuing a career in this field, Kaitlin has already led and been key in several fundamental studies that will pave the way to resolve – or at least quantify – complex relationships between genetic variation, the gut microbiome and human health. In addition to her research, Kaitlin is actively involved in organising and administering teaching and public engagement activities as well as having many mentorship and supervisory roles within and external to the University of Bristol.
Hughes, D.A., Bacigalupe, R., Wang, J. et al. Genome-wide associations of human gut microbiome variation and implications for causal inference analyses. Nat Microbiol 5, 1079–1087 (2020). https://doi.org/10.1038/s41564-020-0743-8.
Kurilshikov, A., Medina-Gomez, C., Bacigalupe, R. et al. Large-scale association analyses identify host factors influencing human gut microbiome composition. Nat Genet 53, 156–165 (2021). https://doi.org/10.1038/s41588-020-00763-1.
Breastfeeding saves lives and prevents illness. It is environmentally friendly and profoundly important to children’s long-term development. After all, breast milk is the only food that has evolved specifically to feed humans.
Every parent knows that infant feeding is a complex issue, often evoking strong emotions based on personal experience. Difficult or negative breastfeeding experiences can fuel a defensive “breastfeeding denialism” attitude.
Conversely, some breastfeeding advocates refuse to acknowledge that for some families, formula is necessary for medical, personal, societal or socioeconomic reasons. These extreme attitudes cause a tense and unproductive environment for researchers working to generate inclusive evidence-based guidance for infant feeding.
Unfortunately, these tensions often detract from the energy and resources that breastfeeding advocates, researchers, health professionals and policy-makers could be using to advance their shared goal of supporting maternal and child health.
What can be done
Of course, members of the diverse breastfeeding advocacy and research communities will not always agree — but we should aim to find common ground and work together. There are many stakeholders involved, each with a role to play:
Governments and non-profit funding organizations should acknowledge the importance of breastfeeding and breast milk and invest more resources into this field.
Researchers should build interdisciplinary teams to study breast milk as a biological system and think broadly about “breastfeeding challenges” in the context of complex social systems – including social inequities, parental leave policies, lactation difficulties and donor breast milk.
Companies, researchers and advocacy groups should co-develop a conflict of interest framework for research on breastfeeding and breast milk and reporting of results.
Messaging is key to achieving these goals. All groups need to communicate effectively with each other, and with the health-care, research and public sectors. This means providing or sharing clear resources to convey scientific evidence free of conflict of interest, targeted to each audience, such as fact sheets for policy-makers, engaging videos for the public and infographics for health-care providers.
Progress in breastfeeding, breast milk and lactation research is being hampered by tensions among researchers, advocates and industry.
As breast milk scientists, breastfeeding researchers and lactation specialists, we are concerned about these tensions and their potential to impede or delay discoveries in our field. Last year, we held a workshop to discuss these concerns and develop solutions.
Our workshop paper was written before the pandemic, but its recent publication is timely. The pandemic has brought researchers together in ways that seemed impossible before.
In a paper recently published in the journal Addiction, Hannah Charles and colleagues suggest that the prevalence of illicit drug use among 23-25 year olds in a Bristol-based birth cohort (ALSPAC) is over twice that reported in the Crime Survey for England and Wales (CSEW). The team propose that these figures reflect under-reporting in the CSEW, although they note that they may reflect higher levels of illicit drug use in Bristol. Here I present some preliminary data supporting their view that the CSEW underestimates illicit drug use.
In March 2020, I recruited 683 UK university students to participate in a short survey on drug use via the online survey platform Prolific, which has been shown to produce reliable data. I recruited only students aged 18 to 24 years who reported using alcohol in the past 30 days, and participants reported whether they had used any of MDMA/ecstasy, cocaine or cannabis in the past two years.
Table 1. Prevalence of self-reported illicit drug use across three surveys of young people in the UK
via Prolific, aged 18-24
Any illicit drug use (a)
Notes: Values represent percentage of participants (number of participants). Percentages for CSEW and ALSPAC are taken from Charles et al (1) and are weighted percentages. (a) ‘Any illicit drug use’ refers only to the illicit drugs assessed in the respective surveys (only cannabis, MDMA and cocaine in our survey; more drugs in ALSPAC and CSEW, see Charles et al (1)). (b) Our Prolific survey asked about ‘MDMA / ecstasy’ use, ALSPAC categorised ecstasy/MDMA use along with other ‘amphetamine’ use and CSEW asked about ‘ecstasy’ use.
Over half of my sample reported using at least one of cannabis, cocaine or MDMA in the past two years (Table 1). This is markedly higher than the CSEW’s estimates of either past year or lifetime use, and more in line with those reported in ALSPAC. Comparing across drugs, past two-year use of the three drugs is higher in my survey than either past year or lifetime use in the CSEW, and higher than past year, but lower than lifetime use in ALSPAC. Perhaps of more interest than ever use of the drugs over the past two years, I also examined the combinations of drugs students in my survey were using (Table 2). I find that the largest group of students who report using illicit drugs have only used cannabis in the past two years (25% of all students), while the second largest group (15%) have used all three of cannabis, MDMA and cocaine.
Table 2. Prevalence of self-reported illicit drug use among UK university students
Qualtrics survey of university students (past two years)
Illicit drug use
MDMA / ecstasy
Illicit drug use profiles
Alcohol only (no illicit drug use)
Any illicit drug use (a)
Cannabis + Cocaine + MDMA
Cannabis + MDMA
Cannabis + Cocaine
Cocaine + MDMA
Notes: Values represent percentage of participants (number of participants). ‘Illicit drug use’ refers to participants reporting any use of the three drugs in the past two years. ‘Illicit drug use profiles’ refers to the combinations of drugs participants report using in the past two years. (a) ‘Any illicit drug use’ refers only to use of cannabis, MDMA and cocaine.
There are some important differences between my sample and both the CSEW and ALSPAC samples. Some differences may mean that my figures are overestimates, including sampling university students who are more affluent than the general population (although drug use is not necessarily higher among students than non-students) and only including those who reported drinking alcohol (although according to the study authors, over 95% of the ALSPAC participants report past year drinking). Other differences may mean my figures are underestimates, including only asking about use of three drugs (thereby underestimating ‘any illicit drug use’), and the younger average age of my sample. I also report on past two-year use, rather than either lifetime or past year use as per CSEW and ALSPAC. Given these differences, I would like to run a larger, more representative sample on the Prolific platform (Prolific allows researchers to recruit a sample which is representative of the general population), to get an estimate of illicit drug use which is more comparable to ALSPAC and CSEW.
Despite these differences, my data support those reported by Charles and colleagues. Indeed, I find it unsurprising that illicit drug use is under-reported in the Home Office’s CSEW. The validity of self-reports for sensitive issues has long been a concern. Over-reporting of illicit drug use is not considered to be a concern and numerous methods have been developed for preventing under-reporting (see a 1997 NIDA report on this issue, as well as more recent techniques for estimating prevalence of use such as the crosswise method). It is important to consider the context in which surveys are administered, including participants’ perception of who is asking the questions and for what reason. It seems that if drug use is asked about in a research context (e.g., with a clear research objective, informed consent and no possibility of repercussions), the validity of responses may be higher than when questions are asked by organisations that are perceived to be involved in the punishment of people who use drugs (e.g., governments, universities).
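As an aside, the crosswise method mentioned above lends itself to a short worked example. In the standard version, each participant is shown the sensitive question alongside an unrelated question whose ‘yes’ rate is known, and reports only whether their two answers are the same or different. The sketch below, in Python, recovers the prevalence from the share of ‘same’ answers. The counts are invented for illustration and have nothing to do with my survey, and the function name is hypothetical.

```python
import math


def crosswise_prevalence(n_same, n_total, p_known):
    """Estimate the prevalence of a sensitive behaviour from crosswise data.

    n_same  : participants reporting that their two answers are the same
    n_total : all participants
    p_known : known probability of 'yes' to the unrelated question
    """
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    if p_known == 0.5:
        raise ValueError("p_known must differ from 0.5 for the estimate to exist")
    share_same = n_same / n_total
    # P(same) = pi * p + (1 - pi) * (1 - p), solved for pi.
    prevalence = (share_same + p_known - 1) / (2 * p_known - 1)
    standard_error = math.sqrt(
        share_same * (1 - share_same) / (n_total * (2 * p_known - 1) ** 2)
    )
    return prevalence, standard_error


# Invented example: 600 of 1,000 participants say "same", and the unrelated
# question has a known 'yes' rate of 25%.
prevalence, se = crosswise_prevalence(n_same=600, n_total=1000, p_known=0.25)
print(f"estimated prevalence: {prevalence:.2f} (standard error {se:.3f})")
```

Because no individual answer reveals whether that person has used drugs, participants may feel freer to answer honestly. The price is a noisier estimate, which in small samples can even fall below 0 or above 1.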
While the CSEW recognises that it does not reliably measure problematic drug use, my data and those of Charles and colleagues provide evidence that CSEW’s claim that it is a ‘good measure of recreational drug use’ may be wrong. Although it may be convenient to believe that only a small subset of the population uses illicit drugs, accurate information may galvanise policy makers (both nationally and locally, including at universities) into developing drugs policies that reflect reality and which support, rather than criminalise, the large proportion of the population who choose to use drugs. Indeed, this is what we’re doing at the University of Bristol, where it has been accepted that drug use is relatively common among our students and we’re providing support and education to those students who need it.