Supplementary Material for:

McMurray, B., and Aslin, R.N. (in press) Infants are sensitive to within-category variation in speech perception. Cognition.

Abstract

Previous research on speech perception in both adults and infants has supported the view that consonants are perceived categorically; that is, listeners are relatively insensitive to variation below the level of the phoneme. More recent work, on the other hand, has shown adults to be systematically sensitive to within-category variation (McMurray, Tanenhaus & Aslin, 2002). Additionally, recent evidence suggests that infants are capable of using within-category variation to segment speech and to learn phonetic categories. Here we report two studies of 8-month-old infants, using the head-turn preference procedure, that examine more directly infants’ sensitivity to within-category variation. Infants were exposed to 80 repetitions of words beginning with either /b/ or /p/. After exposure, listening times to tokens of the same category with small variations in VOT were significantly different from those to both the originally exposed tokens and the cross–category-boundary competitors. Thus infants, like adults, show systematic sensitivity to fine-grained, within-category detail in speech perception.

A preprint of the article in MS Word format can be obtained here.

Methods Supplement

Stimuli were constructed via progressive cross-splicing. Six 9-step voice onset time (VOT) continua were constructed: beach/peach, bale/pale, bear/pear, butter/putter, bomb/palm, and bump/pump. However, only four continua were used in the current experiment (butter/putter and bump/pump were excluded). Each continuum was built by the following procedure.

  1. Several recordings were made of each endpoint of each continuum. Each word was spoken by an adult male in a quiet room and recorded with a Kay Elemetrics 4300b speech lab and a head-mounted microphone.
  2. Pairs of endpoints were selected that best matched in formant frequencies and pitch.
  3. The entire set of endpoints (8 words) was normalized to the same RMS amplitude.
  4. Nine splice points were identified in each of the voiced tokens. The first splice point was 1-3 ms into the sound (usually the first zero crossing). Subsequent splice points were at approximately 5 ms increments, and always at zero crossings.
  5. Nine corresponding splice points were identified in the voiceless tokens. Again, splice points were at approximately 5 ms increments (except the first one) and always at zero crossings.
  6. To create each continuum step, material was removed from the voiced endpoint, starting at the beginning of the sound and ending at the current splice point.
  7. Corresponding material from the voiceless sound (from the beginning of the sound to the splice point) was cut and inserted into the voiced file.
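
The procedure above can be illustrated with a short Python sketch. This is not the tooling used to build the stimuli; it is a minimal illustration that assumes each endpoint is a mono recording held as a list of sample values. All function names, the `first_ms` and `step_ms` defaults, and the rule of snapping each target time to the nearest zero crossing are assumptions made for the example.

```python
import math


def rms(samples):
    """Root-mean-square amplitude of a list of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def normalize_rms(samples, target_rms):
    """Scale samples so that their RMS amplitude equals target_rms (step 3)."""
    current = rms(samples)
    if current == 0:
        return list(samples)  # silent input: nothing to scale
    gain = target_rms / current
    return [s * gain for s in samples]


def zero_crossings(samples):
    """Indices i at which the sign changes between sample i-1 and sample i."""
    return [
        i for i in range(1, len(samples))
        if (samples[i - 1] < 0) != (samples[i] < 0)
    ]


def nearest_zero_crossing(samples, target_index):
    """Zero crossing closest to target_index (earliest one wins a tie)."""
    crossings = zero_crossings(samples)
    if not crossings:
        raise ValueError("signal has no zero crossings")
    return min(crossings, key=lambda i: abs(i - target_index))


def splice_points(samples, sample_rate, n_steps=9, first_ms=2.0, step_ms=5.0):
    """Splice points at roughly step_ms spacing, snapped to zero crossings
    (steps 4 and 5). first_ms and step_ms are illustrative assumptions."""
    points = []
    for k in range(n_steps):
        target = round((first_ms + k * step_ms) * sample_rate / 1000)
        points.append(nearest_zero_crossing(samples, target))
    return points


def cross_splice(voiced, voiceless, voiced_cut, voiceless_cut):
    """Drop the onset of the voiced token up to voiced_cut and put the onset
    of the voiceless token up to voiceless_cut in its place (steps 6 and 7)."""
    return voiceless[:voiceless_cut] + voiced[voiced_cut:]


def build_continuum(voiced, voiceless, sample_rate, n_steps=9):
    """One spliced token per step, pairing the k-th splice point in each file."""
    voiced_points = splice_points(voiced, sample_rate, n_steps)
    voiceless_points = splice_points(voiceless, sample_rate, n_steps)
    return [
        cross_splice(voiced, voiceless, v_cut, vl_cut)
        for v_cut, vl_cut in zip(voiced_points, voiceless_points)
    ]
```

Cutting only at zero crossings, as in steps 4 and 5, keeps the waveform close to continuous at the join, which reduces audible clicks. Each successive step carries a longer stretch of the voiceless onset, so VOT grows in increments of roughly 5 ms across the continuum.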
A zip file containing the complete stimulus set is available here. Please contact the author before using it in any experiment.