Introducing TURLEC: A Learner Corpus for L2 Turkish


Savuran Y., Wullf S.

INTERNATIONAL JOURNAL OF CORPUS LINGUISTICS, vol.0, no.0, 2025 (AHCI)

  • Publication Type: Article / Short Article
  • Volume: 0 Issue: 0
  • Publication Date: 2025
  • Journal Name: INTERNATIONAL JOURNAL OF CORPUS LINGUISTICS
  • Journal Indexes: Arts and Humanities Citation Index (AHCI), Social Sciences Citation Index (SSCI), Scopus, Academic Search Premier, IBZ Online, Communication & Mass Media Index, Linguistic Bibliography, Linguistics & Language Behavior Abstracts, MLA - Modern Language Association Database, DIALNET
  • Anadolu University Affiliated: Yes

Abstract

This paper provides a detailed account of the Turkish Learner Corpus (TURLEC). Building on the first author’s doctoral dissertation project, which aimed to identify proficiency descriptors for four skills (listening, reading, writing, and speaking) for learners of Turkish as a second language (L2) at various CEFR levels, the main motivation to build a learner corpus is to outline the language learners actually use at different proficiency levels. With the written and spoken texts of learners of Turkish L2 at the university level coming from various countries with numerous L1 backgrounds, TURLEC entails 735 texts and ~104,000 tokens. After rigorous anonymization, annotation, and error-tagging efforts, TURLEC reveals ~18,000 word forms with 3,584 lemmas, which will further be profiled based on the CEFR levels. As accessible literature indicates, TURLEC is the first learner corpus built to offer a vocabulary profile for L2 Turkish, which is an ever-growing field of study with an increasing number of students.