🤖 AI Summary
This study investigates whether purely text-based large language models (LLMs) can perform phonological reasoning tasks, including rhyme word generation, grapheme-to-phoneme (G2P) conversion, and syllable counting, despite lacking explicit phonetic representations. To address this limitation, we propose Pedagogically-motivated Participatory Chain-of-Thought (P-CoT) prompting, a method grounded in educational theories such as scaffolding and discovery learning. P-CoT guides models through structured reasoning steps to elicit latent phonological abilities. We evaluate 12 LLMs on the PhonologyBench benchmark. Whereas few-shot learning yields inconsistent gains, P-CoT consistently improves performance, by up to 52%, and surpasses human baselines on certain tasks. The results suggest that text-only LLMs possess latent phonological reasoning abilities that structured prompting can elicit, with potential applications in educational AI.
📝 Abstract
This study explores the potential of phonological reasoning within text-based large language models (LLMs). Using the PhonologyBench benchmark, we assess tasks such as rhyme word generation, grapheme-to-phoneme (G2P) conversion, and syllable counting. Our evaluation of 12 LLMs reveals that, while few-shot learning offers inconsistent gains, a novel Pedagogically-motivated Participatory Chain-of-Thought (P-CoT) prompt, anchored in educational theories such as scaffolding and discovery learning, consistently enhances performance. The method uses structured guidance to activate latent phonological abilities, achieving up to 52% improvement and even surpassing human baselines on certain tasks. Future work could optimize P-CoT prompts for specific models or explore their application across different linguistic domains.