
The association between childhood language development and adolescent psychotic experiences in a general population sample.

The Sylvia and Christine Wastall PhD Scholarship 2019: Population Health Sciences, University of Bristol 

Supervisor: Dr S. A. Sullivan



Language is unique to humans. It uses special brain systems for grammar and vocabulary, and other abilities such as memory and attention. Language is also social. Its main purpose is to exchange thoughts and feelings with others. Language disruptions are found in disorders such as dementia and stroke and in serious mental health problems such as schizophrenia. In some mental illnesses, language problems may occur in childhood, long before people get ill. Detection of subtle language differences, which may not be detectable by ear, may identify people who are at risk of future mental health problems, meaning that they can get help earlier. Previous research has investigated this question using parents' reports of their child's language, which may not be accurate. We propose ground-breaking work using recordings of 8-year-old children speaking and computer software that automatically analyses certain language features, such as the way children group words, which indicates their language development. Some of these children later reported psychotic experiences (like those experienced by people with schizophrenia) at ages 12, 18 and 24 years. We will investigate whether those who reported these experiences had differences in their language at age 8 years, compared to those who did not. If differences are found, this would be a cheap and simple way to detect children at risk and help the right individuals, reducing the risk of later mental health problems.

Student: Sarah Hemingway


Hello, I’m Sarah and I’m due to start my PhD at the University of Bristol in October 2019.        


A decade after finishing my Psychology degree at the University of Liverpool, I decided to do an MSc in Clinical and Health Psychology at Bangor University. My decision was influenced by my time volunteering at the NHS Perinatal Mental Health Service in Sheffield. I developed a keen interest in postpartum psychosis, and upon discovering how little it had been researched, I became eager to learn more about psychosis, particularly its cognitive aspects. Prior to this, I had worked as a mental healthcare assistant at the Clatterbridge Psychiatric Unit in Wirral. Observing the complexities associated with psychotic episodes on the wards and then, years later, listening to the negative experiences of mothers who had endured them prompted me to return to my studies, because I wanted to pursue a career in research in this area. My ambition is to further our understanding of the origins and nature of psychosis, in the hope of finding interventions which will improve outcomes.



The PhD will allow me to dedicate my time to researching the topics I care about. I will have the chance to build upon my knowledge, as well as advance and broaden my research skills. It will involve investigating an association between childhood language development and psychotic experiences in adolescence, under the guidance and supervision of Dr. Sarah Sullivan, Dr. Yvonne Wren and Professor Rosemary Varley. I am grateful to them and to Mental Health Research UK for this amazing opportunity, which I am eager to begin.

Scientific goal: To investigate whether specified parameters of childhood language predict later psychotic experiences.

Progress Report Year 3, 2022

Preparation of transcripts for analysis


Over 3000 transcripts have been prepared for the analysis of formulaic language in children at age 8 who later participated in interviews measuring psychotic experiences. This part of the process was time-consuming because of several challenges and delays. The main issue was applying a protocol designed for adults with clinical language abnormalities (i.e., aphasias) to developmental data without these types of problems. I therefore adapted the protocol, applying research-led instructions to the data I was working with. During this time, I have also had to think about how confounding variables may influence my results, and I have put in a request to ALSPAC to obtain them so that I can include them in my statistical analysis. I have also searched the literature again to keep up to date with new research, drafted a chapter dedicated to formulaic language and made a rough plan of the method section.



The cancellation of short courses due to Covid meant I was unable to gain the knowledge and skills needed. When these courses started again, I attended and obtained study materials for the following courses: Introduction to Linear and Logistic Regression Models, Multiple Imputation for Missing Data and Causal Inference in Epidemiology, all of which are beneficial to the stage I am currently at.


What’s next


The remainder of the year will involve running the transcripts through the Frequency in Language Analysis Tool (FLAT), a programme which creates formulaic variables. Afterwards, the statistical analysis for this study will be performed. Following this, a second study using latent semantic analysis, which assesses content within the data, will be performed. The writing up of my thesis will begin at the start of the following year.


Delays to my project


Personal and Covid-related problems have set me back during my research, so I have requested an extension. This will allow me to get back on track and complete the goals I have set.


Progress Report Year 2, 2021

My 2nd Year


A nested case-control study design will be used to ascertain whether an association exists between formulaic language use in children and later psychotic experiences. Formulaic language consists of fixed verbal expressions that make up about a third to a half of everyday language and are thought to be processed differently from novel language. This study aims to determine the degree of formulaicity in the language of children who later developed psychotic experiences and compare it to that of children who did not. In the second year of my PhD, I had planned to prepare transcripts of spoken language and then perform the language analysis.

My research will use the Frequency in Language Analysis Tool (FLAT), a computerised language analysis tool devised and created by a team at UCL¹. FLAT automatically produces frequency measurements of words and word combinations using the British National Corpus (BNC), which are then used to determine the degree of formulaicity in individual samples. Before the analysis can begin, the transcripts need to be arranged in a format that can be matched against the BNC (a collection of 100 million written and spoken words). I have started tagging the transcripts by following a protocol which was originally created for adults with specific speech problems. Applying these instructions to developmental data has therefore raised some issues, leading to many consultations with my supervisor at UCL (one of the authors of the protocol) for extra guidance. With over 3000 transcripts to complete, I am still in the process of doing this, and so the analysis is yet to be done.
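To give a rough sense of what frequency-based measurement of word combinations can look like, here is a minimal Python sketch. It is not FLAT and does not reproduce its method: the tiny reference text stands in for the BNC, and the function names and the `min_count` threshold are hypothetical choices made only for illustration.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return every run of n consecutive words in a list of tokens."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


# Hypothetical stand-in for a reference corpus (the real BNC is far larger).
REFERENCE_TEXT = "i do not know . you know what i mean . i do not know what you mean"
reference_bigrams = Counter(ngrams(REFERENCE_TEXT.split(), 2))


def formulaicity(sample, min_count=2):
    """Share of the sample's word pairs seen at least min_count times in the reference.

    min_count is an arbitrary illustrative threshold, not a FLAT parameter.
    """
    bigrams = ngrams(sample.lower().split(), 2)
    if not bigrams:
        return 0.0
    frequent = sum(1 for pair in bigrams if reference_bigrams[pair] >= min_count)
    return frequent / len(bigrams)
```

In this toy setup, a sample made up entirely of word pairs that recur in the reference text scores 1.0, and a sample with no such pairs scores 0.0. A real analysis would rely on corpus-scale counts and on measures chosen by the tool's authors.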


Along with preparing transcripts, I have drafted my Introduction and made a start on writing the Methods section. I have attended online courses on wellbeing, which have helped me gain the knowledge and skills needed to cope with problems that can occur during a PhD. Because of Covid-related problems, I have received only the study materials for the courses Introduction to Regression and Multiple Imputation for Missing Data. I intend to apply for these again next year, in the hope of also learning from the experts teaching them, as they will be beneficial when I perform the statistical analyses.

Disruptions caused by Covid


The last lockdown, implemented in early January, disrupted my return to Bristol. This meant my Christmas stay in Sheffield had to be extended. Spending longer with my family, after over a year of not being able to see them, was lovely. However, I had a poor internet connection (a consistent problem since I began working from home) and difficulties working in a full house, along with other PhD-related issues. This made working with the data extremely difficult, especially trying to listen to audio recordings. Since coming back to Bristol many months later, I have had to rearrange and adapt my plans, and unfortunately this has meant that a preliminary study investigating the typical trajectory of formulaic language use in children at ages 5 and 8 could not be performed this year. For me, Covid has led to less-than-ideal working conditions and has created a lot of stressful challenges, but I have met them with resilience and a determination to persist and overcome them.


¹ FLAT was devised by Vitor Zimmerer and programmed by Michael Coleman and Mark Wibrow.

Progress Report Year 1, 2020

I started my PhD at the beginning of October 2019, as a mature student settling into a new life at the University of Bristol. I have been introduced to people from different professional and academic backgrounds and have attended seminars given by guest speakers and other PhD students, both to increase my awareness of current research and to discover the interests of those within the department. I have been on courses and workshops relevant to developing the skills and knowledge I need for my PhD, such as an Introduction to Epidemiology and how to use Stata.

My PhD is investigating whether an association exists between childhood language development and later psychotic experiences. One of the aims is to perform language analyses on the transcripts of children aged 8, obtained from audio files within the Avon Longitudinal Study of Parents and Children (ALSPAC) data. With transcribing still to be done, one of my first tasks upon arrival was to receive training for this. Soon after, I got to manage a team of transcribers, giving me the opportunity to apply and build upon my project management and problem-solving skills.

It was important to familiarise myself with the processes involved in handling the ALSPAC data, particularly in relation to their guidelines on ethics and conduct. This knowledge served me well when I needed to send confidential data within and outside of the university. It also helped when I needed to get transcribers access to the data.

Handling big data is a new experience for me, so a good portion of my time has been spent trying to understand and organise it. This involves recording and keeping up to date with what has been transcribed so far, finding missing data, and ascertaining how much of the data is usable, e.g. after eliminating faulty recordings from the study. Because the project uses audio files from a different research study, I also needed to learn the types of tasks the children participated in.

Impact of Covid-19

The restrictions caused by the Coronavirus have led me to re-prioritise my time and work. Some seminars I hoped to attend, and the courses I wished to do, have moved online, giving me the opportunity to resume developing my skills. One particular area I wish to improve is my confidence in public speaking. I will seek to do this as soon as the chance arises. I will continue reading about the background of my research. I have so far conducted a literature search, collated journal articles, and kept a literature review matrix to help me map and identify gaps in the literature. My aim is to plan and write my first draft, as this will allow me to keep revising it throughout my PhD. This time will also be spent learning about the language analyses, particularly how they will be applied to our study. I will also learn, practise, and carry out the procedure needed to prepare the transcripts for analysis.

Overall, despite the disruption, I am really enjoying my PhD, as well as working with all three of my supervisors, who have made me feel very supported during this time. I feel that what I have learned so far will stand me in good stead to achieve my goals, and to do so to the best of my ability.
