On constituent chunking for Turkish

ASLAN, ÖZKAN; GÜNAL, SERKAN; DİNÇER, BEKİR

doi:10.1016/j.ipm.2018.05.004

ASLAN Ö., GÜNAL S., DİNÇER B. T.

INFORMATION PROCESSING & MANAGEMENT, vol.54, no.6, pp.1262-1276, 2018 (SCI-Expanded, SSCI, Scopus)

Publication Type: Article / Full Article
Volume: 54 Issue: 6
Publication Date: 2018
DOI: 10.1016/j.ipm.2018.05.004
Journal Name: INFORMATION PROCESSING & MANAGEMENT
Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Social Sciences Citation Index (SSCI), Scopus
Pages: pp.1262-1276
Keywords: Chunking, Shallow parsing, Turkish, Constituent, Conditional random fields, Natural language processing, PARSER
Affiliated with Anadolu University: No

Abstract

Chunking is the task of dividing a sentence into non-recursive structures. The primary aim is to identify chunk boundaries and assign chunk classes. Although chunking generally refers to simple chunks, the concept can be customized. A simple chunk is a small structure, such as a noun phrase, while a constituent chunk is a structure that functions as a single unit in a sentence, such as a subject. For an agglutinative language with rich morphology, constituent chunking is a more significant problem than simple chunking. Most Turkish studies on this issue use the IOB tagging schema to mark chunk boundaries.
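
As an illustration of how IOB tags mark chunk boundaries, the following minimal Python sketch groups tokens into labeled chunks. It is not taken from the paper: the function name `iob_to_chunks`, the label names `SUBJ`, `OBJ`, and `PRED`, and the example sentence are all assumptions chosen for illustration. A `B-` tag opens a chunk, an `I-` tag continues it, and `O` marks a token outside any chunk.

```python
from typing import List, Optional, Tuple


def iob_to_chunks(tokens: List[str], tags: List[str]) -> List[Tuple[str, List[str]]]:
    """Group tokens into (label, tokens) chunks according to their IOB tags.

    Hypothetical helper for illustration; label names are not from the paper.
    """
    if len(tokens) != len(tags):
        raise ValueError("tokens and tags must have the same length")

    chunks: List[Tuple[str, List[str]]] = []
    current_label: Optional[str] = None
    current_tokens: List[str] = []

    for token, tag in zip(tokens, tags):
        # "B-SUBJ" -> ("B", "SUBJ"); a bare "O" -> ("O", "")
        prefix, _, label = tag.partition("-")

        if prefix == "B" or (prefix == "I" and label != current_label):
            # A "B-" tag (or an "I-" tag that does not continue the open
            # chunk) closes any open chunk and starts a new one.
            if current_tokens:
                chunks.append((current_label, current_tokens))
            current_label, current_tokens = label, [token]
        elif prefix == "I":
            # Continue the open chunk.
            current_tokens.append(token)
        else:
            # "O": the token is outside any chunk; close the open chunk.
            if current_tokens:
                chunks.append((current_label, current_tokens))
            current_label, current_tokens = None, []

    if current_tokens:
        chunks.append((current_label, current_tokens))
    return chunks


# "Küçük çocuk kitabı okudu" ("The little child read the book"),
# tagged with assumed constituent labels.
tokens = ["Küçük", "çocuk", "kitabı", "okudu"]
tags = ["B-SUBJ", "I-SUBJ", "B-OBJ", "B-PRED"]
print(iob_to_chunks(tokens, tags))
# [('SUBJ', ['Küçük', 'çocuk']), ('OBJ', ['kitabı']), ('PRED', ['okudu'])]
```

In this sketch the two-word subject forms a single constituent chunk, while the object and predicate are one-token chunks. A stray `I-` tag with no matching open chunk is treated leniently as the start of a new chunk; a stricter decoder could raise an error instead.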