The Problem of Surface-Level Fluency in AI-Generated Texts: A Discourse-Based Method to Evaluate Second Language Writing
DOI: https://doi.org/10.65417/ljcas.v4i1.314

Keywords: AI-generated text, second language writing, discourse analysis, writing evaluation, coherence, cohesion, automated writing evaluation

Abstract
AI tools can produce writing that looks smooth at first glance. The sentences are neat. The grammar is often clean. The wording may sound mature. Yet strong surface fluency does not always mean strong writing. This problem matters in second language writing because teachers, raters, and automated systems may reward polish more than meaning. The present paper addresses that risk through a discourse-based method for evaluating second language writing. The paper draws on public theory, public corpora, and published experiments. Key sources include Halliday and Hasan’s work on cohesion, Coh-Metrix and TAACO, TOEFL11, the Write & Improve Corpus 2024, ASAP 2.0, GPT-based scoring studies, and the DECOR benchmark. The paper argues that AI-generated texts often suppress local errors while leaving wider discourse limits in place. These limits include weak task focus, thin support, unstable topic movement, and shallow conclusions. In response, the paper proposes a Surface-to-Discourse Method. The method separates surface fluency from discourse depth, scores both layers, and then computes a Fluency-Discourse Gap. A small gap signals balance. A large gap signals polished but thin writing. The paper also shows how the method can guide feedback, revision, and fairer classroom judgment. The main claim is simple. Smooth language should not be treated as a full sign of writing quality. In second language settings, whole-text meaning still needs close reading.
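The Fluency-Discourse Gap described above can be illustrated with a minimal sketch. The function names, the 0-1 score scale, and the threshold value below are illustrative assumptions, not the paper's actual scoring rubric or cutoffs.

```python
# Hypothetical sketch of the Fluency-Discourse Gap from the abstract.
# Scores are assumed to lie on a 0-1 scale; the 0.25 threshold is an
# illustrative assumption, not a value taken from the paper.

def fluency_discourse_gap(fluency: float, discourse: float) -> float:
    """Difference between surface fluency and discourse depth scores."""
    return fluency - discourse

def interpret_gap(gap: float, threshold: float = 0.25) -> str:
    """A large positive gap flags polished but thin writing;
    a small gap signals balance between the two layers."""
    return "polished but thin" if gap > threshold else "balanced"

# Example: a text with high surface polish but weak discourse depth.
gap = fluency_discourse_gap(fluency=0.9, discourse=0.5)
print(interpret_gap(gap))
```

In this sketch the two layer scores are taken as given; in practice they would come from separate rubric-based or automated ratings of surface fluency and discourse depth.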
