Inter-Rater Reliability for the Judged Accentedness of English Bilinguals

Open Access
Sigmund, Rachel Marie
Area of Honors:
Bachelor of Science
Document Type:
Thesis Supervisors:
  • David A. Rosenbaum, Thesis Supervisor
  • Judith Fran Kroll, Faculty Reader
  • David A. Rosenbaum, Honors Advisor
Keywords:
  • bilinguals
  • speech perception
  • learning behaviors
  • skill-assessment
  • inter-rater reliability
This study examined the extent to which native English speakers and native Chinese-English bilinguals differ in their ability to perceive accents in the spoken English of other native and non-native English speakers. The main empirical question was whether proficient speakers of a language agree more when judging the accentedness of another person speaking that language than less proficient speakers do. This question arose from the hypothesis that speech perception and language proficiency share common representations and that acquiring those representations is a core component of language learning.
The participants were two groups of 10 native-English speakers and two groups of 10 native-Chinese speakers of English. All participants in the first group of native-English speakers and in the first group of native-Chinese speakers performed in four experiments. In Experiment 1, the participants indicated whether letter series were English words or nonwords. I used this lexical-decision task to assess the language proficiency of the two groups. In Experiment 2, participants responded to mathematical problems while trying to remember subsequent first-language (L1) words. I used this operation-span (O-span) task to assess working memory capacity. In Experiment 3, participants responded to the colors of stimuli whose locations were either congruent or incongruent with the responses (on the same side as the response key or on the opposite side, respectively). I used this Simon task to assess executive functioning. Finally, in Experiment 4, one group of 10 native-English speakers and one group of 10 native-Chinese speakers recorded themselves speaking four sentences in English. Later, the other group of 10 native-English speakers and the other group of 10 native-Chinese speakers listened to these recordings and rated the accentedness of each speaker's rendition of each sentence.
Experiment 1 revealed higher proficiency levels for native-English participants than for native-Chinese participants. Experiment 2 showed that native-English participants recalled slightly more words than native-Chinese participants. Experiment 3 revealed a large range of scores for native-Chinese participants and a smaller range for native-English participants. In Experiment 4, native-Chinese and native-English raters gave equally high ratings to native-English speakers, with similarly low mean agreement, which may reflect a ceiling effect. Native-Chinese raters gave higher ratings to the spoken English of the native-Chinese speakers than did native-English raters. The most important result was that native-Chinese raters showed lower mean agreement in their ratings of native-Chinese speakers than native-English raters showed for the same speakers. This study suggests that inter-rater reliability is a useful metric for exposing differences in expertise. Inter-rater reliability may provide a new method for probing the abilities of speakers and listeners of different languages and also for teaching new languages.
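To make the notion of mean agreement among raters concrete, the following is a minimal sketch of one way such a value could be computed. The abstract does not state which agreement statistic the thesis used, so the choice here (the mean Pearson correlation over all pairs of raters) is an assumption, as are the function names and the example ratings, which are invented for illustration and are not data from the study.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists of ratings."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # A rater who gives every item the same score has no variance,
        # so the correlation is undefined for that rater.
        raise ValueError("correlation undefined for constant ratings")
    return cov / sqrt(var_x * var_y)


def mean_pairwise_agreement(ratings):
    """Average Pearson correlation over all pairs of raters.

    ratings: list of per-rater lists, each holding one rating per item
    (assumed layout; every rater rates the same items in the same order).
    """
    pairs = list(combinations(ratings, 2))
    return sum(pearson(a, b) for a, b in pairs) / len(pairs)


# Hypothetical ratings: three raters, four recordings each.
consistent_raters = [[1, 2, 3, 4], [2, 3, 4, 5], [1, 3, 5, 7]]
inconsistent_raters = [[1, 2, 3, 4], [4, 3, 2, 1], [2, 4, 1, 3]]

print(mean_pairwise_agreement(consistent_raters))
print(mean_pairwise_agreement(inconsistent_raters))
```

In this sketch, raters who order the recordings the same way yield a mean agreement near 1, while raters who order them differently yield a lower value. Note that a correlation-based measure is undefined when a rater gives identical scores to every item, which is one reason ratings clustered at the top of a scale can make agreement hard to estimate.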