Python 3 Machine Learning Basics - Gradient Descent

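Gradient descent is an iterative method that can be used to solve least-squares problems, both linear and nonlinear. When fitting the parameters of a machine learning model, which is an unconstrained optimization problem, gradient descent is one of the most commonly used methods; another common choice is the least squares method. To find the minimum of a loss function, gradient descent iterates step by step until it arrives at the minimized loss and the corresponding model parameter values.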
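The step-by-step iteration described above can be written as a single update rule. This is a sketch in assumed notation: the step size is written as \eta (it corresponds to the variable eta in the code in this post), and the iteration index k is introduced here only for illustration.

```latex
\theta_{k+1} = \theta_k - \eta \, \nabla J(\theta_k), \qquad k = 0, 1, 2, \dots
```

Each step moves theta a small distance against the gradient of the loss J, so for a suitably small \eta the loss decreases from one iteration to the next.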

Simulating gradient descent

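import numpy as np
from matplotlib import pyplot

def DJ(theta):
    # Slope (derivative) of the loss function J at theta
    return 2*(theta-2.5)

def J(theta):
    # Loss function J; gradient descent is used to find its minimum
    return (theta-2.5)**2+1

theta = 0.0                 # starting point
eta = 0.1                   # learning rate (step size)
epsilon = 1e-8              # stop once the loss changes by less than this
theta_history = [theta]

while True:
    gradient = DJ(theta)
    last_theta = theta
    theta = theta - eta*gradient
    theta_history.append(theta)
    if abs(J(theta) - J(last_theta)) < epsilon:
        break

# plot_x and plot_y are not defined in the original snippet; a range around
# the minimum at theta = 2.5 is assumed here
plot_x = np.linspace(-1, 6, 141)
plot_y = J(plot_x)
pyplot.plot(plot_x,plot_y)
pyplot.plot(np.array(theta_history),J(np.array(theta_history)),color='r',marker='+')
pyplot.show()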

Applying gradient descent to linear regression

# Method of a linear regression class (note the self parameter); numpy must be imported as np
def fit_gd(self,X_train,y_train,eta=0.01,n_iters=1e6):
    def J(theta,X_b,y):
        # Mean squared error; return inf if the computation fails
        try:
            return np.sum((y-X_b.dot(theta))**2)/len(y)
        except Exception:
            return float("inf")
    def dJ(theta,X_b,y):
        # Element-by-element version, kept for reference:
        # res = np.empty(len(theta))
        # res[0] = np.sum(X_b.dot(theta)-y)
        # for i in range(1,len(theta)):
        #     res[i] = (X_b.dot(theta)-y).dot(X_b[:,i])
        # return res * 2 / len(X_b)
        # Vectorized gradient of the mean squared error
        return X_b.T.dot(X_b.dot(theta)-y)*2./len(X_b)
    def gradient_descent(X_b,y,initial_theta,eta,n_iters=1e6,epsilon=1e-8):
        theta = initial_theta
        cur_iter = 0
        # Stop after n_iters steps or once the loss changes by less than epsilon
        while cur_iter<n_iters:
            gradient = dJ(theta,X_b,y)
            last_theta = theta
            theta = theta - eta * gradient
            if abs(J(theta,X_b,y) - J(last_theta,X_b,y)) < epsilon:
                break
            cur_iter+=1
        return theta
    # Prepend a column of ones so that theta[0] acts as the intercept
    X_b = np.hstack([np.ones((len(X_train),1)),X_train])
    initial_theta = np.zeros(X_b.shape[1])
    self._theta = gradient_descent(X_b,y_train,initial_theta,eta,n_iters)
    self.interception_ = self._theta[0]
    self.coef_ = self._theta[1:]
    return self
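To see the method above at work outside a class, here is a minimal standalone sketch. The synthetic data (y = 3x + 4 plus noise), the seed and the smaller iteration cap of 10000 are assumptions chosen for illustration; the J, dJ and gradient_descent bodies follow the ones in fit_gd.

```python
import numpy as np

def J(theta, X_b, y):
    # Mean squared error of the linear model X_b.dot(theta)
    return np.sum((y - X_b.dot(theta)) ** 2) / len(y)

def dJ(theta, X_b, y):
    # Vectorized gradient of the mean squared error
    return X_b.T.dot(X_b.dot(theta) - y) * 2.0 / len(X_b)

def gradient_descent(X_b, y, initial_theta, eta, n_iters=10000, epsilon=1e-8):
    theta = initial_theta
    for _ in range(n_iters):
        last_theta = theta
        theta = theta - eta * dJ(theta, X_b, y)
        # Stop once the loss barely changes between two steps
        if abs(J(theta, X_b, y) - J(last_theta, X_b, y)) < epsilon:
            break
    return theta

# Synthetic data, assumed for illustration: y = 3x + 4 + noise
rng = np.random.default_rng(0)
X = 2.0 * rng.random((100, 1))
y = 3.0 * X[:, 0] + 4.0 + rng.normal(size=100)

# Same preparation as in fit_gd: a column of ones for the intercept
X_b = np.hstack([np.ones((len(X), 1)), X])
theta = gradient_descent(X_b, y, np.zeros(X_b.shape[1]), eta=0.01)

intercept, coef = theta[0], theta[1:]
print(intercept, coef)
```

The fitted theta should sit very close to the closed-form least squares solution for the same data, which is a convenient sanity check.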

Stochastic gradient descent

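Stochastic gradient descent picks a single row of the matrix X_b at random for each descent step. As a result every step is highly random, and the loss can even increase on some steps, but experience shows that this approach still gets close to the optimal loss value.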

Hyperparameters of stochastic gradient descent (simulated annealing)

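Because stochastic gradient descent is so noisy, eta has to shrink as the iterations proceed. A commonly used decay scheme, and the one implemented in the sgd function below, is eta = t0/(t1+t), where t is the current iteration number.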
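To get a feel for how quickly eta shrinks, the schedule can be evaluated at a few iteration counts. This is a small sketch; the function name learning_rate and the sample values of t are assumptions for illustration, while t0 = 5 and t1 = 50 are the constants used in this post.

```python
def learning_rate(t, t0=5.0, t1=50.0):
    """Decaying step size eta = t0 / (t1 + t)."""
    return t0 / (t1 + t)

# Print eta at a few iteration counts to show the decay
for t in (0, 50, 450, 4950):
    print(t, learning_rate(t))
```

With these constants eta starts at 5/50 and falls by a factor of 100 by iteration 4950, so early steps explore widely while late steps only fine-tune theta.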

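import numpy as np

def dJ_sgd(theta,X_b_i,y_i):
    # Gradient of the squared error on a single sample (one row of X_b)
    return X_b_i.T.dot(X_b_i.dot(theta)-y_i)*2.

def sgd(X_b,y, initial_theta,n_iters):
    t0 = 5.0
    t1 = 50.0

    def learning_theta(t):
        # Learning rate eta at iteration t; decays as t grows
        return t0/(t1+t)

    theta = initial_theta
    for cur_iter in range(n_iters):
        # Pick one random sample and take a step along its gradient
        rand_i = np.random.randint(len(X_b))
        gradient = dJ_sgd(theta,X_b[rand_i],y[rand_i])
        theta = theta-learning_theta(cur_iter) * gradient
    return theta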
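The sgd function above can be tried on synthetic data as follows. This is a sketch under stated assumptions: the data (y = 4x + 3 plus noise), the seed, and the choice to run only a third as many steps as there are samples are all picked for illustration, and the schedule constants are passed as keyword arguments rather than hard-coded.

```python
import numpy as np

def dJ_sgd(theta, X_b_i, y_i):
    # Gradient of the squared error on a single sample
    return X_b_i * (X_b_i.dot(theta) - y_i) * 2.0

def sgd(X_b, y, initial_theta, n_iters, t0=5.0, t1=50.0):
    theta = initial_theta
    for cur_iter in range(n_iters):
        rand_i = np.random.randint(len(X_b))
        eta = t0 / (t1 + cur_iter)  # decaying learning rate
        theta = theta - eta * dJ_sgd(theta, X_b[rand_i], y[rand_i])
    return theta

# Synthetic data, assumed for illustration: y = 4x + 3 + noise
np.random.seed(666)
m = 10000
x = np.random.normal(size=m)
X_b = np.hstack([np.ones((m, 1)), x.reshape(-1, 1)])
y = 4.0 * x + 3.0 + np.random.normal(0.0, 1.0, size=m)

# Each step looks at one sample, so m // 3 steps touch at most a third of the data
theta = sgd(X_b, y, np.zeros(X_b.shape[1]), n_iters=m // 3)
print(theta)
```

The printed theta should land near the intercept 3 and slope 4 used to generate the data, even though no step ever computes the gradient over the full data set.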

Using stochastic gradient descent from sklearn

from sklearn.linear_model import SGDRegressor

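# n_iter was removed from SGDRegressor in newer scikit-learn releases; max_iter replaces it
sgd = SGDRegressor(max_iter=1000)
# X_train_standard and X_test_standard are the standardized features (e.g. from StandardScaler)
sgd.fit(X_train_standard,y_train)
sgd.score(X_test_standard,y_test)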
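The snippet above relies on X_train_standard and X_test_standard being prepared beforehand. One possible end-to-end sketch is shown here; the synthetic data, the seeds and the variable name sgd_reg are assumptions for illustration and are not taken from the original post.

```python
import numpy as np
from sklearn.linear_model import SGDRegressor
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic regression data, assumed for illustration
rng = np.random.default_rng(0)
X = rng.normal(loc=10.0, scale=5.0, size=(500, 3))
y = X.dot(np.array([3.0, 4.0, 5.0])) + 2.0 + rng.normal(scale=0.5, size=500)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Fit the scaler on the training set only, then apply it to both sets
scaler = StandardScaler().fit(X_train)
X_train_standard = scaler.transform(X_train)
X_test_standard = scaler.transform(X_test)

sgd_reg = SGDRegressor(max_iter=1000, random_state=0)
sgd_reg.fit(X_train_standard, y_train)
r2 = sgd_reg.score(X_test_standard, y_test)  # R^2 on the held-out set
print(r2)
```

Standardizing first matters because stochastic gradient descent is sensitive to feature scale; features with very different ranges make a single learning rate a poor fit for all coefficients.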

Debugging gradient descent

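Gradient descent normally requires deriving the derivative of the loss function by hand. But how do we know whether the derivation, and therefore the derivative, is correct? The following method can be used to check it.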

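def dJ_debug(theta,X_b,y,epsilon=0.01):
    # Numerical gradient: approximate each partial derivative by a central difference
    res = np.empty(len(theta))
    for i in range(len(theta)):
        theta_1 = theta.copy()
        theta_1[i] += epsilon
        theta_2 = theta.copy()
        theta_2[i] -= epsilon
        # Divide the whole difference, not just the second term, by 2*epsilon
        res[i] = (J(theta_1,X_b,y)-J(theta_2,X_b,y))/(2*epsilon)
    return res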

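The slope of the line through the two blue points (theta shifted by -epsilon and +epsilon) is used in place of the slope at the red point, which lets us check whether the derived slope is correct.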
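In practice the check consists of comparing the numerical gradient with the hand-derived one at some theta. A minimal sketch follows; the name dJ_math for the analytic gradient and the random test data are assumptions for illustration.

```python
import numpy as np

def J(theta, X_b, y):
    # Mean squared error
    return np.sum((y - X_b.dot(theta)) ** 2) / len(y)

def dJ_math(theta, X_b, y):
    # Hand-derived (analytic) gradient of J
    return X_b.T.dot(X_b.dot(theta) - y) * 2.0 / len(y)

def dJ_debug(theta, X_b, y, epsilon=0.01):
    # Central-difference approximation of each partial derivative
    res = np.empty(len(theta))
    for i in range(len(theta)):
        theta_1 = theta.copy()
        theta_1[i] += epsilon
        theta_2 = theta.copy()
        theta_2[i] -= epsilon
        res[i] = (J(theta_1, X_b, y) - J(theta_2, X_b, y)) / (2 * epsilon)
    return res

# Random test problem, assumed for illustration
rng = np.random.default_rng(0)
X = rng.random((200, 3))
X_b = np.hstack([np.ones((len(X), 1)), X])
y = X_b.dot(np.array([1.0, 2.0, 3.0, 4.0])) + rng.normal(size=200)
theta = rng.normal(size=X_b.shape[1])

gradients_match = np.allclose(dJ_debug(theta, X_b, y), dJ_math(theta, X_b, y))
print(gradients_match)
```

The numerical version calls J twice per parameter, so it is much slower than the analytic gradient; it is best used on a small sample to validate the derivation before switching to the fast version for training.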
