Document Embedding Approach for Efficient Authorship Attribution


2nd International Conference on Knowledge Engineering and Applications (ICKEA), London, Canada, 21 - 23 October 2017, pp.194-198

  • Publication Type: Conference Paper / Full Text
  • DOI Number: 10.1109/ickea.2017.8169928
  • City: London
  • Country: Canada
  • Page Numbers: pp.194-198
  • Keywords: authorship attribution, document embeddings, bag of words model, text classification
  • Anadolu University Affiliated: Yes


Authorship attribution has been well studied as a text classification problem with many diverse feature sets. However, finding topic-independent features is hard, and models trained with hand-crafted features in one domain may not work in another. In this study, we applied a semi-supervised neural language model, known as document embeddings, to the authorship attribution problem. This method showed significant improvements over bag-of-words representations on a well-known dataset.