Speeding up the scaled conjugate gradient algorithm and its application in neuro-fuzzy classifier training

Cetisli, Bayram; Barkana, Atalay

doi:10.1007/s00500-009-0410-8

Cetisli B., BARKANA A.

SOFT COMPUTING, vol.14, no.4, pp.365-378, 2010 (SCI-Expanded, Scopus)

Publication Type: Article / Full Article
Volume: 14 Issue: 4
Publication Date: 2010
DOI: 10.1007/s00500-009-0410-8
Journal Name: SOFT COMPUTING
Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus
Page Numbers: pp.365-378
Keywords: Speeding up learning, Gradient estimation, The scaled conjugate gradient algorithm, Neuro-fuzzy classifier, Neural network, Large-scale problems, QUASI-NEWTON METHODS, NETWORKS, NOISE, ENSEMBLE, SYSTEMS
Affiliated with Anadolu University: Yes

Abstract

The aim of this study is to speed up the scaled conjugate gradient (SCG) algorithm by shortening the training time per iteration. SCG is a supervised learning algorithm for network-based methods and is generally used to solve large-scale problems. It is well known that SCG computes its second-order information from two first-order gradients of the parameters, using the entire training dataset. As a result, the computational cost of SCG per iteration is high for large-scale problems. In this study, one of the two first-order gradients is estimated from previously calculated gradients without using the training dataset. A least-squares error estimator is applied to obtain this estimate. For large-scale problems, estimating the gradient is much cheaper than computing it, because the cost of the estimation is independent of the size of the dataset. The proposed algorithm is applied to neuro-fuzzy classifier and neural network training. The theoretical basis for the algorithm is provided, and its performance is illustrated on several examples in which it is compared with several training algorithms on well-known datasets. The empirical results indicate that the proposed algorithm is faster per iteration than SCG. It decreases the training time by 20-50% compared to SCG, while its convergence rate remains similar to that of SCG.
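
The idea in the abstract can be illustrated with a minimal sketch. The first function shows the standard SCG finite-difference step, which obtains second-order information from two gradient evaluations. The second function is a hypothetical least-squares estimator: the abstract does not specify the estimator used in the paper, so the affine model fitted here, along with the names hessian_vector_fd, estimate_gradient_ls, W_prev and G_prev, are assumptions made purely for illustration. The toy quadratic error stands in for a real network, whose gradient would require a full pass over the training data.

```python
import numpy as np

def hessian_vector_fd(grad, w, p, sigma=1e-4):
    """SCG-style second-order information: s ~= H p from two first-order gradients."""
    sigma_k = sigma / np.linalg.norm(p)
    return (grad(w + sigma_k * p) - grad(w)) / sigma_k

def estimate_gradient_ls(W_prev, G_prev, w_new):
    """Hypothetical estimator (an assumption, not the paper's method):
    fit an affine model g(w) ~= [w, 1] @ B to stored (w, g) pairs by
    least squares and evaluate it at w_new. No training data is touched."""
    X = np.hstack([W_prev, np.ones((W_prev.shape[0], 1))])
    B, *_ = np.linalg.lstsq(X, G_prev, rcond=None)
    return np.append(w_new, 1.0) @ B

# Toy quadratic error E(w) = 0.5 w^T A w - b^T w, so grad E(w) = A w - b.
rng = np.random.default_rng(0)
A = np.array([[3.0, 0.5, 0.0],
              [0.5, 2.0, 0.3],
              [0.0, 0.3, 1.0]])
b = np.array([1.0, -2.0, 0.5])
grad = lambda w: A @ w - b

# Gradients stored from earlier iterations (here: random points).
W_prev = rng.normal(size=(6, 3))
G_prev = np.array([grad(w) for w in W_prev])

w, p = rng.normal(size=3), rng.normal(size=3)
sigma_k = 1e-4 / np.linalg.norm(p)

# Two exact gradient evaluations (what plain SCG does each iteration).
s_exact = hessian_vector_fd(grad, w, p)

# One exact gradient plus one estimated gradient.
g_shift_est = estimate_gradient_ls(W_prev, G_prev, w + sigma_k * p)
s_est = (g_shift_est - grad(w)) / sigma_k
```

In this sketch the cost of estimate_gradient_ls depends only on the number of stored gradients and the number of parameters, not on the number of training samples, which mirrors the abstract's argument. On a quadratic error the affine fit is exact; for a real neuro-fuzzy classifier or neural network it would only be an approximation.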