Document Embedding Approach for Efficient Authorship Attribution


Agun H. V., YILMAZEL Ö.

2nd International Conference on Knowledge Engineering and Applications (ICKEA), London, Canada, 21 - 23 October 2017, pp.194-198

  • Publication Type: Conference Paper / Full-Text Paper
  • DOI Number: 10.1109/ickea.2017.8169928
  • City of Publication: London
  • Country of Publication: Canada
  • Page Numbers: pp.194-198
  • Keywords: authorship attribution, document embeddings, bag of words model, text classification
  • Anadolu University Affiliated: Yes

Abstract

Authorship attribution has been well studied as a text classification problem with many diverse feature sets. However, finding topic-independent features is hard, and models trained with hand-crafted features in one domain may not work in another domain. In this study, we used a semi-supervised neural language model, known as document embeddings, for the authorship attribution problem. This method showed significant improvements over bag-of-words representations on a well-known dataset.