PRIVACY-PRESERVING RANDOM PROJECTION-BASED RECOMMENDATIONS BASED ON DISTRIBUTED DATA

KALELİ, CİHAN; Polat, Huseyin

doi:10.1142/s0219622013500090

KALELİ C., Polat H.

INTERNATIONAL JOURNAL OF INFORMATION TECHNOLOGY & DECISION MAKING, vol.12, no.2, pp.201-232, 2013 (SCI-Expanded)

Publication Type: Article / Full Article
Volume: 12 Issue: 2
Publication Date: 2013
DOI: 10.1142/s0219622013500090
Journal Name: INTERNATIONAL JOURNAL OF INFORMATION TECHNOLOGY & DECISION MAKING
Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus
Pages: pp.201-232
Keywords: Privacy, random projection, distributed data, recommendation, performance
Affiliated with Anadolu University: Yes

Abstract

Providing recommendations based on distributed data has received an increasing amount of attention because it offers several advantages. Online vendors who face problems caused by a limited amount of available data want to offer predictions based on distributed data collaboratively because they can surmount problems such as cold start, limited coverage, and unsatisfactory accuracy through partnerships. It is relatively easy to produce referrals based on distributed data when privacy is not a concern. However, concerns regarding the protection of private data, financial fears due to revealing valuable assets, and legal regulations imposed by various organizations prevent companies from forming collaborations. In this study, we propose to use random projection to protect online vendors' privacy while still providing accurate predictions from distributed data without sacrificing online performance. We utilize random projection to eliminate the aforementioned issues so vendors can work in partnerships. We suggest privacy-preserving schemes to offer recommendations based on vertically or horizontally partitioned data among multiple companies. The recommended methods are analyzed in terms of confidentiality. We also analyze the superfluous loads caused by privacy concerns. Finally, we perform real data-based trials to evaluate the accuracy of the proposed schemes. The results of our analyses show that our methods preserve privacy, cause insignificant overheads, and offer accurate predictions.
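
The abstract does not spell out the protocol, so the following is only a generic sketch of the underlying idea, not the authors' scheme. It assumes two hypothetical vendors that each hold one user's rating vector over the same items and that agree on a shared random matrix R (here derived from a shared seed). Each vendor publishes only the projection of its vector, and a similarity computed on the projections approximates the similarity of the original vectors. All names, sizes, and the toy data are assumptions made for illustration.

```python
import math
import random


def random_matrix(k, m, seed):
    """k x m matrix with i.i.d. N(0, 1/k) entries (assumed shared via a seed)."""
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(k)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(m)] for _ in range(k)]


def project(R, x):
    """Map an m-dimensional vector x to k dimensions by computing R x."""
    return [sum(r * v for r, v in zip(row, x)) for row in R]


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def cosine(a, b):
    return dot(a, b) / math.sqrt(dot(a, a) * dot(b, b))


# Hypothetical toy data: normalized ratings of two users over m items.
m, k = 500, 200  # k < m, so R is not invertible
data_rng = random.Random(1)
user_a = [data_rng.gauss(0.0, 1.0) for _ in range(m)]          # held by vendor A
user_b = [v + 0.5 * data_rng.gauss(0.0, 1.0) for v in user_a]  # held by vendor B

R = random_matrix(k, m, seed=42)

# Each vendor projects locally and shares only the k-dimensional result.
proj_a = project(R, user_a)
proj_b = project(R, user_b)

true_sim = cosine(user_a, user_b)
approx_sim = cosine(proj_a, proj_b)
print(f"true cosine:      {true_sim:.3f}")
print(f"projected cosine: {approx_sim:.3f}")
```

Because k is smaller than m, the projection alone does not determine the original vector uniquely, which is the intuition behind using it as a masking step. How much an adversary can still infer, and how vertically partitioned data would be handled, are questions the paper itself analyzes and this sketch does not address.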