The structure of sentences: an interview with Dr. Fan Bai


For this interview we spoke to Dr. Fan Bai, who was a PhD student at the MPI. We talked about his research, his most interesting findings, and what he wants to do next.

What was the main question in your dissertation?
In my thesis, I studied how the brain segments speech into smaller units, like words and syllables, and how it understands the structure of sentences. I investigated the role of statistical information, in particular the probability of certain syllables or words occurring together, and I studied how the brain discriminates between two types of syntactic structures (phrases vs. sentences) that can be similar in form and meaning.

Can you explain the (theoretical) background a bit more?
Previous studies have suggested that different types of cues help people segment the speech they hear. We probably use several cues in parallel, such as prosody, grammar, and how often words co-occur. This makes it quite challenging to isolate the role of a single type of cue and to show how that specific type of information is represented in the brain.

Using artificially created speech, earlier research has shown that our brain activity can reflect the rhythm of linguistic structures. For instance, if you present a new sentence every second (a frequency of 1 Hertz), you can detect a peak in the listener’s brain activity at 1 Hz. This suggests that we use grammatical knowledge during speech segmentation.

However, grammatical information often overlaps with statistical information. Statistical information tells us how often syllables and words occur together. For instance, the probability of word A occurring after word B can be higher within a sentence than within a list of words that do not form a sentence. It is therefore an interesting question whether statistical information alone (without sentence-level information) can induce this frequency effect in the brain.


Why is it important to answer this question?
The ultimate goal of research on speech perception (how you perceive sounds) and speech comprehension (how you infer meaning from those sounds) is to build a realistic model of how the brain extracts information from auditory input to form abstract meaning. Being able to correctly chunk a stream of sounds into smaller pieces is not by itself enough to understand the meaning of the speech you hear.

An important step in building this type of model is to further investigate how the brain represents syntactic information. The studies in my thesis provided new information that can help to draw the full picture of the mechanisms behind speech perception and comprehension.

Can you tell us about one particular project? (question, method, outcome)
In one project, I examined how the brain processes artificially created syllable sequences. I conducted six experiments in which I asked Dutch participants to listen to syllable sequences that were 4 seconds long, with each syllable lasting 0.25 seconds. I varied the number of levels of linguistic structure that could be built from the sequences, such as two- or four-syllable units. At the same time, I varied whether the participants had the knowledge to build those structures: in other words, whether they spoke the language of the stimuli (Dutch or Chinese).

I replicated the frequency effect: the rhythm of the presented structures was reflected in the participants’ neural activity. For example, neural responses appeared at 1 Hz and 2 Hz when participants listened to a sequence in which one four-syllable unit (lasting 1 second) and two two-syllable units (each lasting 0.5 seconds) could be built every second. But what’s even more interesting is that we observed the frequency effect even when participants could not comprehend the meaning of the stimuli (for instance, when Dutch participants listened to Mandarin Chinese speech)! That means they can segment speech using only information about which syllables often occur together.
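The frequency-tagging logic behind this paradigm can be illustrated with a small simulation (a sketch only, using made-up numbers rather than real EEG/MEG data): if a "neural" signal contains components at the syllable rate (4 Hz) and at the rates of two- and four-syllable units (2 Hz and 1 Hz), those rates stand out as peaks in its power spectrum.

```python
# Sketch of frequency tagging on a simulated signal (not real brain data).
# Assumed rates follow the paradigm above: syllables at 4 Hz, so two-syllable
# units repeat at 2 Hz and four-syllable units at 1 Hz.
import numpy as np

fs = 250                        # sampling rate in Hz (arbitrary choice)
t = np.arange(0, 40, 1 / fs)    # 40 s of signal -> 0.025 Hz resolution

rng = np.random.default_rng(0)
signal = (1.0 * np.sin(2 * np.pi * 4 * t)    # syllable-rate response
          + 0.6 * np.sin(2 * np.pi * 2 * t)  # two-syllable-unit rate
          + 0.4 * np.sin(2 * np.pi * 1 * t)  # four-syllable-unit rate
          + rng.normal(0, 1, t.size))        # background noise

spectrum = np.abs(np.fft.rfft(signal)) ** 2
freqs = np.fft.rfftfreq(t.size, 1 / fs)

# Compare power at each tagged frequency with nearby noise-only bins.
for f in (1.0, 2.0, 4.0):
    idx = np.argmin(np.abs(freqs - f))
    noise = spectrum[idx - 6:idx - 1].mean()
    print(f"peak at {f} Hz: {spectrum[idx] / noise:.0f}x the neighbouring bins")
```

Even with the 1 Hz component deliberately made the weakest, it rises clearly above the surrounding noise bins, which is the same logic used to detect structure-tracking responses in the listeners' brain activity.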

What are the implications of this finding? How does this push science or society forward?
Our findings provide new evidence on the role of statistical information in speech segmentation. We showed that statistical information plays an important role in early language acquisition. More importantly, we revealed that neural activity can reflect the extraction of statistical structures, a process in which our knowledge of grammar is not necessarily involved. This new finding can guide future research on speech segmentation.

What do you want to do next?
In research, we often study a phenomenon by observing it, repeating it, or changing things to understand it better. In my studies on speech segmentation and syntactic structure discrimination, I only reached the stage of changing things. A straightforward next step is to find out which factors can change the observed effects, or to identify the key factor(s) on which the neural activity mainly depends. In addition, we may need to use new analytical techniques to uncover deeper information about how the brain processes spoken language.