Perception of allophonic variations in American English by Japanese learners: Flaps are less favored than stops

English Studies

Studies on second-language (L2) speech learning have often focused on L2 learners’ production and perception of sounds that signal lexical distinctions in L2, e.g. [1]. Much less attention has been given to sounds that do not signal lexical contrasts. However, learning to produce and perceive non-contrastive sounds could be important for L2 learners, particularly if they want to achieve native-level performance. In American English (AE), intervocalic alveolar stops are realized as alveolar flaps, e.g. better, rider, get up, need it. Alveolar flaps are allophonic variants of /t/ and /d/ and do not signal lexical contrasts with other sounds in English.

The present line of research focuses on the production and perception of AE alveolar flaps by Japanese learners of English (JE learners). The Japanese language has a single liquid consonant, which is often described as an alveolar flap, e.g. [2][3][4][5][6]. This provides an interesting scenario for studying the effect of learners’ first language (L1) on L2 learning. That is, Japanese speakers produce and perceive alveolar flaps in their L1 as a phonetic realization of the Japanese liquid consonant. If this ability transfers positively to L2, then JE learners might be expected to produce and perceive alveolar flaps appropriately in English. However, since alveolar flaps are allophonic variants of /t/ and /d/ in AE, not a phonetic realization of /r/ as in Japanese, JE learners may have difficulty associating alveolar flaps with the correct sounds in English.

A lexical decision experiment was conducted with JE learners to investigate whether L2 learners are sensitive to such allophonic variations when recognizing words in L2. The stimuli consisted of 36 isolated disyllabic English words containing word-medial /t/: half were flap-favored words, e.g. city, and the other half were [t]-favored words, e.g. faster. Each word was recorded in two surface forms, with /t/ realized either as a flap, e.g. city with a flap, or as [t], e.g. city with [t]. The stimuli were counterbalanced so that each participant heard only one of the two surface forms of each word.

The accuracy data indicated that flap-favored words pronounced with a flap, e.g. city with a flap, were recognized significantly less accurately than flap-favored words with [t], e.g. city with [t], and [t]-favored words with [t], e.g. faster with [t]. These results suggest that JE learners prefer canonical forms over frequent forms produced with context-dependent allophonic variations. They are inconsistent with previous studies that found that native speakers prefer frequent forms, and they highlight differences in how allophonic variations affect the perception of native-language and L2 speech.
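As an illustration only, the counterbalancing of surface forms described above could be sketched as follows. The sketch assumes two presentation lists and an equal split of surface forms within each word type; neither detail is stated in the text, and the item names are placeholders, not the actual stimuli.

```python
from collections import Counter

FORMS = ("flap", "t")  # the two recorded surface forms of word-medial /t/


def build_lists(flap_favored, t_favored):
    """Assign each word to two presentation lists with its surface form swapped.

    Assumption (not from the source): two lists, with forms alternating by
    item position so that each list has equal numbers of each form per type.
    """
    lists = {"A": [], "B": []}
    for word_type, words in (("flap-favored", flap_favored),
                             ("t-favored", t_favored)):
        for i, word in enumerate(words):
            lists["A"].append((word, word_type, FORMS[i % 2]))
            lists["B"].append((word, word_type, FORMS[(i + 1) % 2]))
    return lists


# Placeholder item names: 18 flap-favored and 18 [t]-favored words.
flap_favored = [f"flap_word_{i:02d}" for i in range(18)]
t_favored = [f"t_word_{i:02d}" for i in range(18)]

lists = build_lists(flap_favored, t_favored)

# Count items per (word type, surface form) cell in list A.
cells_a = Counter((word_type, form) for _, word_type, form in lists["A"])
print(cells_a)
```

Under these assumptions, a participant assigned to one list hears all 36 words once each, and every word is heard in both surface forms across the two groups of participants.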

The present paper is based on collaborative work with Prof. Keiichi Tajima at Hosei University and Prof. Kiyoko Yoneyama at Daito Bunka University. We thank all the participants in our experiments. Parts of this paper were presented at the 5th joint meeting of ASA and ASJ in Hawaii, 2016, and at InterSpeech 2017 in Sweden. This work was supported by JSPS-Kakenhi grants #16K02646, #15K02492, and #26370508.

