I. Single-Hidden-Layer Feedforward Networks (SLFNs) with Random Hidden Nodes
1. Setup:
- N distinct samples ( xi, ti ), where xi = [xi1, xi2, ..., xin]T ∈ R^n and ti = [ti1, ti2, ..., tim]T ∈ R^m
- Weight vector connecting the input nodes to the i-th hidden node: wi = [wi1, wi2, ..., win]T
- Bias (threshold) of the i-th hidden node: bi (a scalar)
- Weight vector connecting the i-th hidden node to the output nodes: βi = [βi1, βi2, ..., βim]T
- Activation function: g(x)
2. Rewriting the equations:
- An SLFN (single-hidden-layer feedforward network) with Ñ hidden nodes fits the N samples with zero error when ∑_{i=1}^{Ñ} βi g(wi · xj + bi) = tj , j = 1, ..., N, written compactly as H β = T
- where H is the N × Ñ hidden-layer output matrix with entries H(j, i) = g(wi · xj + bi), β = [β1, β2, ..., βÑ]T is the Ñ × m output-weight matrix, and T = [t1, t2, ..., tN]T is the N × m target matrix
3. Schematic model: (figure of the SLFN structure, not reproduced here)
4. Two relevant theorems (which can be proved independently):
- Theorem 1: Given a standard SLFN with N hidden nodes and an activation function g that is infinitely differentiable on any interval, for N arbitrary distinct samples ( xi, ti ), if the wi and bi are randomly generated from any continuous probability distribution, then with probability one the hidden-layer output matrix H is invertible and ‖Hβ − T‖ = 0.
- Theorem 2: Given any small ε > 0 and an activation function g that is infinitely differentiable on any interval, for N arbitrary distinct samples ( xi, ti ) there exists Ñ ≤ N such that, with the wi and bi randomly generated, ‖H(N×Ñ) β(Ñ×m) − T(N×m)‖ < ε with probability one.
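Theorem 1 can be checked numerically: with N distinct samples and N hidden nodes whose weights and biases are drawn at random, the square matrix H is full rank with probability one. A minimal sketch in Python/NumPy (the variable names and the toy data here are our own, not from the text):

```python
import numpy as np

rng = np.random.default_rng(0)

N, n = 8, 3                       # N distinct samples of dimension n
X = rng.standard_normal((N, n))   # training inputs x_j

# Randomly generated input weights w_i and biases b_i, as in Theorem 1
W = rng.uniform(-1, 1, (N, n))    # one hidden node per sample (N~ = N)
b = rng.uniform(0, 1, N)

# Hidden-layer output matrix H, with H[j, i] = g(w_i . x_j + b_i), g = sigmoid
H = 1.0 / (1.0 + np.exp(-(X @ W.T + b)))

# With probability one, H is invertible, so H beta = T is solvable exactly
print(np.linalg.matrix_rank(H))   # should equal N (full rank) with probability one
```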
II. Minimum-Norm Least-Squares (LS) Solution of SLFNs
- By Theorems 1 and 2: as long as the activation function is infinitely differentiable, the input weights and hidden-layer biases can be assigned at random (i.e., wi and bi can be treated as known). Training an SLFN is therefore equivalent to finding a least-squares solution β∗ of the linear system H β = T.
- From the theory of general linear systems: β∗ = H+T, where H+ is the Moore–Penrose generalized inverse of H.
- Moreover, among all least-squares solutions of H β = T, β∗ = H+T is the one of smallest norm.
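As a sketch of this step: NumPy's `pinv` computes the Moore–Penrose generalized inverse, so β∗ = H+T is one line (the toy matrices below are illustrative, not from the text):

```python
import numpy as np

# Toy overdetermined system H beta = T (more samples than hidden nodes)
H = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])           # N x N~ hidden-layer output matrix
T = np.array([[1.0], [2.0], [3.0]])  # N x m target matrix

# Minimum-norm least-squares solution: beta* = pinv(H) @ T
beta = np.linalg.pinv(H) @ T

# It agrees with lstsq, which also returns a least-squares solution
beta_lstsq, *_ = np.linalg.lstsq(H, T, rcond=None)
print(np.allclose(beta, beta_lstsq))  # True
```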
III. The ELM Algorithm
Given a training set of N distinct samples ( xi, ti ), an activation function g(x), and a number of hidden nodes Ñ:
- Randomly assign the input weights and biases wi, bi (i = 1, ..., Ñ);
- Compute the hidden-layer output matrix H;
- Compute the output weights β: β∗ = H+T,
where T = [t1, t2, ..., tN]T.
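The three steps above fit in a few lines. A minimal ELM sketch in Python/NumPy on a toy regression task (the data, seed, and hyper-parameters are illustrative assumptions; the authors' MATLAB implementation follows in section IV):

```python
import numpy as np

rng = np.random.default_rng(42)

# Toy regression data: learn y = sin(x) on [-3, 3]
X = np.linspace(-3, 3, 200).reshape(-1, 1)   # N x n inputs
T = np.sin(X)                                # N x m targets
N_hidden = 20                                # N~ hidden nodes

# Step 1: randomly assign input weights w_i and biases b_i
W = rng.uniform(-1, 1, (N_hidden, X.shape[1]))
b = rng.uniform(0, 1, N_hidden)

# Step 2: compute the hidden-layer output matrix H (sigmoid activation)
H = 1.0 / (1.0 + np.exp(-(X @ W.T + b)))     # N x N~

# Step 3: output weights beta* = pinv(H) @ T
beta = np.linalg.pinv(H) @ T

# Training RMSE of the fitted network
Y = H @ beta
rmse = np.sqrt(np.mean((T - Y) ** 2))
print(rmse)
```

Only step 3 involves any fitting; steps 1–2 are a single random projection, which is why ELM training is so fast.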
IV. Code
```matlab
function [TrainingTime, TestingTime, TrainingAccuracy, TestingAccuracy] = elm(TrainingData_File, TestingData_File, Elm_Type, NumberofHiddenNeurons, ActivationFunction)

% Usage: elm(TrainingData_File, TestingData_File, Elm_Type, NumberofHiddenNeurons, ActivationFunction)
% OR:    [TrainingTime, TestingTime, TrainingAccuracy, TestingAccuracy] = elm(TrainingData_File, TestingData_File, Elm_Type, NumberofHiddenNeurons, ActivationFunction)
%
% Input:
% TrainingData_File     - Filename of training data set
% TestingData_File      - Filename of testing data set
% Elm_Type              - 0 for regression; 1 for (both binary and multi-class) classification
% NumberofHiddenNeurons - Number of hidden neurons assigned to the ELM
% ActivationFunction    - Type of activation function:
%                         'sig'     for Sigmoidal function
%                         'sin'     for Sine function
%                         'hardlim' for Hardlim function
%                         'tribas'  for Triangular basis function
%                         'radbas'  for Radial basis function (for additive type of SLFNs instead of RBF type of SLFNs)
%
% Output:
% TrainingTime          - Time (seconds) spent on training ELM
% TestingTime           - Time (seconds) spent on predicting ALL testing data
% TrainingAccuracy      - Training accuracy: RMSE for regression, correct classification rate for classification
% TestingAccuracy       - Testing accuracy:  RMSE for regression, correct classification rate for classification
%
% MULTI-CLASS CLASSIFICATION: the number of output neurons is automatically set equal to the number of classes.
% For example, with 7 classes there are 7 output neurons; if neuron 5 has the highest output, the input belongs to the 5th class.
%
% Sample1 regression:     [TrainingTime, TestingTime, TrainingAccuracy, TestingAccuracy] = elm('sinc_train', 'sinc_test', 0, 20, 'sig')
% Sample2 classification: elm('diabetes_train', 'diabetes_test', 1, 20, 'sig')
%
%%%% Authors: MR QIN-YU ZHU AND DR GUANG-BIN HUANG
%%%% NANYANG TECHNOLOGICAL UNIVERSITY, SINGAPORE
%%%% EMAIL: [email protected]; [email protected]
%%%% WEBSITE: http://www.ntu.edu.sg/eee/icis/cv/egbhuang.htm
%%%% DATE: APRIL 2004

%%%%%%%%%%% Macro definition
REGRESSION=0;
CLASSIFIER=1;

%%%%%%%%%%% Load training dataset (first column: targets; remaining columns: inputs)
train_data=load(TrainingData_File);
T=train_data(:,1)';
P=train_data(:,2:size(train_data,2))';
clear train_data;                              % Release raw training data array

%%%%%%%%%%% Load testing dataset
test_data=load(TestingData_File);
TV.T=test_data(:,1)';
TV.P=test_data(:,2:size(test_data,2))';
clear test_data;                               % Release raw testing data array

NumberofTrainingData=size(P,2);
NumberofTestingData=size(TV.P,2);
NumberofInputNeurons=size(P,1);

if Elm_Type~=REGRESSION
    %%%%%%%%%%%% Preprocessing the data of classification
    sorted_target=sort(cat(2,T,TV.T),2);
    label=zeros(1,1);                          % Find and save in 'label' the class labels from training and testing data sets
    label(1,1)=sorted_target(1,1);
    j=1;
    for i = 2:(NumberofTrainingData+NumberofTestingData)
        if sorted_target(1,i) ~= label(1,j)
            j=j+1;
            label(1,j) = sorted_target(1,i);
        end
    end
    number_class=j;
    NumberofOutputNeurons=number_class;

    %%%%%%%%%% Processing the targets of training (one output neuron per class, targets in {-1, +1})
    temp_T=zeros(NumberofOutputNeurons, NumberofTrainingData);
    for i = 1:NumberofTrainingData
        for j = 1:number_class
            if label(1,j) == T(1,i)
                break;
            end
        end
        temp_T(j,i)=1;
    end
    T=temp_T*2-1;

    %%%%%%%%%% Processing the targets of testing
    temp_TV_T=zeros(NumberofOutputNeurons, NumberofTestingData);
    for i = 1:NumberofTestingData
        for j = 1:number_class
            if label(1,j) == TV.T(1,i)
                break;
            end
        end
        temp_TV_T(j,i)=1;
    end
    TV.T=temp_TV_T*2-1;
end                                            % end if of Elm_Type

%%%%%%%%%%% Calculate weights & biases
start_time_train=cputime;

%%%%%%%%%%% Randomly generate input weights InputWeight (w_i) and biases BiasofHiddenNeurons (b_i) of hidden neurons
InputWeight=rand(NumberofHiddenNeurons,NumberofInputNeurons)*2-1;
BiasofHiddenNeurons=rand(NumberofHiddenNeurons,1);
tempH=InputWeight*P;
clear P;                                       % Release input of training data
ind=ones(1,NumberofTrainingData);
BiasMatrix=BiasofHiddenNeurons(:,ind);         % Extend the bias vector BiasofHiddenNeurons to match the dimension of H
tempH=tempH+BiasMatrix;

%%%%%%%%%%% Calculate hidden neuron output matrix H
switch lower(ActivationFunction)
    case {'sig','sigmoid'}
        H = 1 ./ (1 + exp(-tempH));            % Sigmoid
    case {'sin','sine'}
        H = sin(tempH);                        % Sine
    case {'hardlim'}
        H = double(hardlim(tempH));            % Hard limit
    case {'tribas'}
        H = tribas(tempH);                     % Triangular basis function
    case {'radbas'}
        H = radbas(tempH);                     % Radial basis function
    %%%%%%%% More activation functions can be added here
end
clear tempH;                                   % Release the temporary array used to compute H

%%%%%%%%%%% Calculate output weights OutputWeight (beta_i)
OutputWeight=pinv(H') * T';                    % slower implementation
%OutputWeight=inv(eye(size(H,1))/C+H * H') * H * T';   % faster method 1; set the regularization factor C properly in classification applications
%OutputWeight=(eye(size(H,1))/C+H * H') \ H * T';      % faster method 2; set the regularization factor C properly in classification applications
%If you use the faster methods or the kernel method, PLEASE CITE in your paper properly:
%Guang-Bin Huang, Hongming Zhou, Xiaojian Ding, and Rui Zhang, "Extreme Learning Machine for Regression and Multi-Class Classification," submitted to IEEE Transactions on Pattern Analysis and Machine Intelligence, October 2010.

end_time_train=cputime;
TrainingTime=end_time_train-start_time_train   % CPU time (seconds) spent training the ELM

%%%%%%%%%%% Calculate the training accuracy
Y=(H' * OutputWeight)';                        % Y: the actual output of the training data
if Elm_Type == REGRESSION
    TrainingAccuracy=sqrt(mse(T - Y))          % Training accuracy (RMSE) for the regression case
end
clear H;

%%%%%%%%%%% Calculate the output of the testing input
start_time_test=cputime;
tempH_test=InputWeight*TV.P;
clear TV.P;                                    % Release input of testing data
ind=ones(1,NumberofTestingData);
BiasMatrix=BiasofHiddenNeurons(:,ind);         % Extend the bias vector to match the dimension of H
tempH_test=tempH_test + BiasMatrix;
switch lower(ActivationFunction)
    case {'sig','sigmoid'}
        H_test = 1 ./ (1 + exp(-tempH_test));
    case {'sin','sine'}
        H_test = sin(tempH_test);
    case {'hardlim'}
        H_test = hardlim(tempH_test);
    case {'tribas'}
        H_test = tribas(tempH_test);
    case {'radbas'}
        H_test = radbas(tempH_test);
    %%%%%%%% More activation functions can be added here
end
TY=(H_test' * OutputWeight)';                  % TY: the actual output of the testing data
end_time_test=cputime;
TestingTime=end_time_test-start_time_test      % CPU time (seconds) spent predicting the whole testing data

if Elm_Type == REGRESSION
    TestingAccuracy=sqrt(mse(TV.T - TY))       % Testing accuracy (RMSE) for the regression case
end

if Elm_Type == CLASSIFIER
    %%%%%%%%%% Calculate training & testing classification accuracy
    MissClassificationRate_Training=0;
    MissClassificationRate_Testing=0;
    for i = 1 : size(T, 2)
        [x, label_index_expected]=max(T(:,i));
        [x, label_index_actual]=max(Y(:,i));
        if label_index_actual~=label_index_expected
            MissClassificationRate_Training=MissClassificationRate_Training+1;
        end
    end
    TrainingAccuracy=1-MissClassificationRate_Training/size(T,2)
    for i = 1 : size(TV.T, 2)
        [x, label_index_expected]=max(TV.T(:,i));
        [x, label_index_actual]=max(TY(:,i));
        if label_index_actual~=label_index_expected
            MissClassificationRate_Testing=MissClassificationRate_Testing+1;
        end
    end
    TestingAccuracy=1-MissClassificationRate_Testing/size(TV.T,2)
end
```