A scalable privacy-preserving recommendation scheme via bisecting k-means clustering

BİLGE, ALPER; Polat, Huseyin

doi:10.1016/j.ipm.2013.02.004

BİLGE A., Polat H.

INFORMATION PROCESSING & MANAGEMENT, vol.49, no.4, pp.912-927, 2013 (SCI-Expanded)

Publication Type: Article / Full Article
Volume: 49 Issue: 4
Publication Date: 2013
DOI: 10.1016/j.ipm.2013.02.004
Journal Name: INFORMATION PROCESSING & MANAGEMENT
Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Social Sciences Citation Index (SSCI), Scopus
Pages: pp.912-927
Keywords: Accuracy, Binary decision diagrams, Clustering methods, Data preprocessing, Data privacy, Recommender systems, SYSTEMS
Affiliated with Anadolu University: Yes

Abstract

Privacy-preserving collaborative filtering is an emerging web-adaptation tool for coping with the information overload problem without jeopardizing individuals' privacy. However, collaborative filtering schemes with privacy commonly suffer from scalability and sparseness problems as the content in the domain proliferates. Moreover, applying privacy measures distorts the collected data, which in turn degrades the accuracy of such systems. In this work, we propose a novel privacy-preserving collaborative filtering scheme based on bisecting k-means clustering, in which we apply two preprocessing methods. The first preprocessing scheme deals with the scalability problem by constructing a binary decision tree through a bisecting k-means clustering approach, while the second produces clones of users by inserting pseudo-self-predictions into original user profiles to boost the accuracy of the scalability-enhanced structure. The sparse nature of collections is handled by transforming ratings into item feature-based profiles. After analyzing our scheme with respect to privacy and supplementary costs, we perform experiments on benchmark data sets to evaluate it in terms of accuracy and online performance. Our empirical outcomes verify that the combined effects of the proposed preprocessing schemes relieve scalability problems and improve accuracy significantly. (C) 2013 Elsevier Ltd. All rights reserved.
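The abstract gives no algorithmic detail, so the following is only a minimal sketch of the general idea of building a binary tree by repeated 2-means bisection and routing a user to a leaf. It is not the authors' implementation. The function names (`two_means`, `build_tree`, `find_leaf`), the leaf-size stopping rule, the use of squared Euclidean distance, and the toy profiles are all assumptions made for illustration; the privacy perturbation, the pseudo-self-predictions, and the item feature-based profiles are left out.

```python
import random


def mean(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def sq_dist(a, b):
    """Squared Euclidean distance (an assumed similarity measure)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(vectors, centers):
    """Split vectors by nearest of the two centers (ties go left)."""
    left, right = [], []
    for v in vectors:
        closer_left = sq_dist(v, centers[0]) <= sq_dist(v, centers[1])
        (left if closer_left else right).append(v)
    return left, right


def two_means(vectors, rng, iters=20):
    """One bisection step: Lloyd's algorithm with k = 2."""
    centers = rng.sample(vectors, 2)
    for _ in range(iters):
        left, right = assign(vectors, centers)
        if not left or not right:
            break
        updated = [mean(left), mean(right)]
        if updated == centers:
            break
        centers = updated
    left, right = assign(vectors, centers)
    return centers, left, right


def build_tree(vectors, max_leaf_size, rng):
    """Bisect recursively until a node holds at most max_leaf_size users."""
    if len(vectors) <= max_leaf_size:
        return {"members": vectors}
    centers, left, right = two_means(vectors, rng)
    if not left or not right:  # degenerate split, e.g. identical profiles
        return {"members": vectors}
    return {
        "centers": centers,
        "left": build_tree(left, max_leaf_size, rng),
        "right": build_tree(right, max_leaf_size, rng),
    }


def find_leaf(tree, vector):
    """Walk from the root to a leaf by comparing against two centers per node."""
    while "centers" in tree:
        c0, c1 = tree["centers"]
        go_left = sq_dist(vector, c0) <= sq_dist(vector, c1)
        tree = tree["left"] if go_left else tree["right"]
    return tree["members"]


# Toy profiles (hypothetical data): two clearly separated taste groups.
users = [[5, 5, 1], [5, 4, 1], [4, 5, 2], [1, 1, 5], [1, 2, 5], [2, 1, 4]]
tree = build_tree(users, max_leaf_size=3, rng=random.Random(0))
neighbors = find_leaf(tree, [5, 5, 2])
```

In a sketch like this, the online step compares the active user with only two centers per tree level and then with the members of one leaf, rather than with every user, which is the kind of scalability gain the abstract attributes to the tree structure.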