Machine Learning, Part 1 (Gradient Descent)

Course notes and video links

http://cs229.stanford.edu/notes/cs229-notes1.pdf

http://open.163.com/movie/2008/1/M/C/M6SGF6VB4_M6SGHFBMC.html

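1. Preface

I recently stumbled into Python, found Ctrl+C and Ctrl+V unbelievably smooth, and got more and more cocky. Then I picked up a PDF on Python data analysis. Less than a hundred pages in, I panicked and thought: what the heck is all this? I was thoroughly lost. Realizing my fundamentals were weak, I searched for a machine learning course and skimmed through it. This post implements a few algorithms from the course notes. I am not sure they are all correct, but they seem to work reasonably well.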

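2. Environment setup

In short: I use Python 3.x with the numpy package. Download PyCharm, then go to File -> Settings -> Project Interpreter -> the green + on the right -> search for numpy -> Install. The install may report errors in the log; if so, repeat the steps above to install whichever packages the log says are missing.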

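3. The problem

This post uses linear regression as the example: given several (x, y) pairs, find b and a in y = b + ax, which yields the linear equation. My grasp of the theory is still shallow, so I only give the formulas used for solving. All of the code below can go in a single Python file. The code uses matrix and vector products to carry out the summations; a note on the differences between numpy's *, dot, and multiply is at the end of this section.

The formulas are as follows (copied from the course notes). Assume the samples follow this linear equation: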

$$h(x) = \sum_{i=0}^{n} \theta_i x_i = \theta^T x, \quad (x_0 = 1)$$

Find the value of θ that minimizes the cost function below, where i is the sample index and m is the total number of samples:

$$J(\theta) = \frac{1}{2} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2$$

Solving this by gradient descent gives the following update rule:

$$\theta_j := \theta_j - \alpha \frac{\partial}{\partial \theta_j} J(\theta)$$

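For a single training example (x, y), the partial derivative is:

$$\frac{\partial}{\partial \theta_j} J(\theta) = \left( h_\theta(x) - y \right) x_j$$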
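The gradient formula can be sanity-checked numerically. The sketch below compares the analytic gradient, summed over all samples, with a central finite-difference estimate of J(θ). The function names, the test point `theta`, and the labels `y = [3, 5, 7]` (taken from y = 1 + 2x) are assumptions made for illustration; it uses plain numpy arrays rather than `np.matrix`.

```python
import numpy as np

def cost(theta, x, y):
    """J(theta) = 1/2 * sum_i (theta^T x_i - y_i)^2."""
    residual = x @ theta - y
    return 0.5 * float(residual @ residual)

def analytic_gradient(theta, x, y):
    """Sum over all samples of (h_theta(x_i) - y_i) * x_i."""
    return x.T @ (x @ theta - y)

def numeric_gradient(theta, x, y, eps=1e-6):
    """Central finite differences, one coordinate of theta at a time."""
    grad = np.zeros_like(theta)
    for j in range(theta.size):
        step = np.zeros_like(theta)
        step[j] = eps
        grad[j] = (cost(theta + step, x, y) - cost(theta - step, x, y)) / (2 * eps)
    return grad

x = np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])  # first column is x0 = 1
y = np.array([3.0, 5.0, 7.0])                        # assumed labels, y = 1 + 2x
theta = np.array([0.5, -0.25])                       # arbitrary test point

print(analytic_gradient(theta, x, y))
print(numeric_gradient(theta, x, y))
```

If the two printed vectors agree to several decimal places, the formula and the code that uses it are at least consistent with each other.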

(1) Batch gradient descent

Pseudocode:

Repeat until convergence {

$\theta_j := \theta_j + \alpha \sum_{i=1}^{m} \left( y^{(i)} - h_\theta(x^{(i)}) \right) x_j^{(i)}$ (for every j)

}

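import numpy as np

# Hypothesis h(x) = theta * x^T, evaluated for every sample at once.
# theta is a 1 x n row matrix and example_x is m x n, so the result is 1 x m.
def h(theta, example_x):
    return theta * example_x.T

def batch_gradient_descent(x, y, theta0, alpha, iterator):
    example_x = np.matrix(x)                # m x n samples; first column is x0 = 1
    label_y = np.matrix(y)                  # 1 x m labels
    theta = np.matrix(theta0, dtype=float)  # 1 x n parameters
    # iterator is the number of iterations to run
    for i in range(iterator):
        error = (label_y - h(theta, example_x))  # 1 x m residuals y - h(x)
        sum_gradient = error * example_x         # 1 x n: sum over samples of error * x
        theta = theta + alpha * sum_gradient
    return theta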
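The pseudocode says "repeat until convergence", while the function above runs a fixed number of iterations. A minimal sketch of a tolerance-based stopping rule follows. The function name, the default `alpha`, `tol`, and `max_iter` values, and the labels `[3, 5, 7]` are assumptions chosen for illustration, and it uses plain numpy arrays instead of `np.matrix`.

```python
import numpy as np

def batch_gd_until_converged(x, y, alpha=0.05, tol=1e-10, max_iter=100000):
    """Batch gradient descent that stops once the update becomes tiny."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    theta = np.zeros(x.shape[1])
    for iteration in range(1, max_iter + 1):
        # alpha * sum_i (y_i - h(x_i)) * x_i, for all j at once
        step = alpha * (x.T @ (y - x @ theta))
        theta = theta + step
        if np.max(np.abs(step)) < tol:
            break
    return theta, iteration

theta, n_iter = batch_gd_until_converged([[1, 1], [1, 2], [1, 3]], [3, 5, 7])
print(theta, n_iter)
```

Because the update sums over all samples, the usable learning rate shrinks as the data set grows; if `alpha` is too large the iterates diverge instead of converging.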

(2) Stochastic gradient descent

Pseudocode:

loop {

  for i = 1 to m {

$\theta_j := \theta_j + \alpha \left( y^{(i)} - h_\theta(x^{(i)}) \right) x_j^{(i)}$ (for every j)

  }

}

Code:

def stochastic_gradient_descent(x, y, theta0, alpha, iterator):
    example_x = np.matrix(x)                # m x n samples
    label_y = np.matrix(y)                  # 1 x m labels
    theta = np.matrix(theta0, dtype=float)  # 1 x n parameters
    m, n = np.shape(example_x)
    for i in range(iterator):
        # One pass over the data, updating theta after every single sample.
        for j in range(m):
            gradient = (label_y[0, j] - h(theta, example_x[j])) * example_x[j]
            theta = theta + alpha * gradient
    return theta
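For comparison, here is a sketch of the same per-sample update written with plain numpy arrays. Visiting the samples in a random order on each pass is a common variant and is an addition here, not something the code above does; the function name, default values, and the labels `[3, 5, 7]` are likewise assumptions.

```python
import numpy as np

def sgd(x, y, alpha=0.05, epochs=3000, shuffle=True, seed=0):
    """Stochastic gradient descent: update theta after each sample."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    rng = np.random.default_rng(seed)
    theta = np.zeros(x.shape[1])
    for _ in range(epochs):
        # Optionally visit the samples in a fresh random order each pass.
        order = rng.permutation(len(y)) if shuffle else np.arange(len(y))
        for i in order:
            error = y[i] - x[i] @ theta          # scalar residual for sample i
            theta = theta + alpha * error * x[i]
    return theta

theta_sgd = sgd([[1, 1], [1, 2], [1, 3]], [3, 5, 7])
print(theta_sgd)
```

On this noise-free data every sample agrees on the same θ, so the per-sample updates settle down; on noisy data the iterates of a fixed-`alpha` run typically keep jittering around the minimum.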

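(3) Locally weighted linear regression

This is linear regression with per-sample weights; for lack of a better name, I'll call the solving method weighted gradient descent. I'm not sure my version is correct: the course notes don't give concrete steps, and I couldn't find the details elsewhere. It probably shouldn't be compared side by side with the two algorithms above either, since the use case is different, but I'll put it here anyway.

For the solving formula, I followed the formulas below and added a weight w to (2); the algorithm steps are otherwise the same as in (2).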

Fit $\theta$ to minimize $\sum_i w^{(i)} \left( y^{(i)} - \theta^T x^{(i)} \right)^2$

$$w^{(i)} = \exp\left( -\frac{(x^{(i)} - x)^2}{2\tau^2} \right)$$

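In the course notes, x is the query point at which a prediction is wanted. Here I take x to be the per-column mean of all samples X; I'll handle it this way for now.

Code: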

# Weight w = exp(-(xi - ex)^2 / (2 * t^2)), computed element-wise per feature.
def w(xi, ex, t):
    return np.exp(-np.multiply((xi - ex), (xi - ex)) / (2 * t * t))

# locally weighted linear regression algorithm
def locally_gradient_descent(x, y, theta0, alpha, iterator, t):
    example_x = np.matrix(x)                # m x n samples
    label_y = np.matrix(y)                  # 1 x m labels
    theta = np.matrix(theta0, dtype=float)  # 1 x n parameters
    m, n = np.shape(example_x)
    ex = np.mean(example_x, axis=0)         # 1 x n per-column mean, used as x
    for i in range(iterator):
        for j in range(m):
            wj = w(example_x[j], ex, t)     # 1 x n weights for sample j
            gradient = np.multiply(wj, (label_y[0, j] - h(theta, example_x[j])) * example_x[j])
            theta = theta + alpha * gradient
    return theta
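As a cross-check on the weighted objective, the minimizer for a given query point can also be computed in closed form from the weighted normal equations, (XᵀWX)θ = XᵀWy. The sketch below is an alternative to the gradient-descent code above, not the same method: it uses one scalar weight per sample, measured from an explicit query point rather than from the column mean. The function name, the query `[1, 2.5]`, `tau=1.0`, and the labels `[3, 5, 7]` are assumptions.

```python
import numpy as np

def lwr_predict(x, y, query, tau):
    """Fit theta around one query point and predict y there."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    query = np.asarray(query, dtype=float)
    # One scalar weight per sample, from its squared distance to the query.
    sq_dist = np.sum((x - query) ** 2, axis=1)
    weights = np.exp(-sq_dist / (2.0 * tau ** 2))
    # Weighted normal equations: (X^T W X) theta = X^T W y.
    xtw = x.T * weights                      # scales column i of X^T by weights[i]
    theta = np.linalg.solve(xtw @ x, xtw @ y)
    return float(query @ theta), theta

prediction, theta_local = lwr_predict(
    [[1, 1], [1, 2], [1, 3]], [3, 5, 7], query=[1, 2.5], tau=1.0
)
print(prediction, theta_local)
```

A small `tau` makes the fit depend almost entirely on samples near the query, while a large `tau` approaches ordinary least squares. On exactly linear data such as this, every choice of weights recovers the same line.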

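Note: for numpy's matrix type, the * operator is the same as dot: both perform matrix multiplication, so the columns of the left operand must match the rows of the right one. multiply multiplies element by element. Example, with both operands as np.matrix: [1,2]*[[1],[2]] = numpy.dot([1,2],[[1],[2]]) = [5], while numpy.multiply([1,2], [[1],[2]]) = [[1,2],[2,4]]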
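The difference is easy to see by running the three operations side by side. The sketch below also shows plain `ndarray` behaviour for contrast, which is an addition to the note above: for `ndarray`, `*` is element-wise and `@` is the matrix product. The variable names are illustrative.

```python
import numpy as np

row = np.matrix([1, 2])        # shape (1, 2)
col = np.matrix([[1], [2]])    # shape (2, 1)

print(row * col)               # matrix product: [[5]]
print(np.dot(row, col))        # same matrix product: [[5]]
print(np.multiply(row, col))   # element-wise with broadcasting: [[1 2] [2 4]]

a = np.array([1, 2])
b = np.array([[1], [2]])
print(a * b)                   # ndarray: * is element-wise, giving [[1 2] [2 4]]
print(a @ b)                   # ndarray: @ is the matrix product, giving [5]
```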

4. Test results

The data clearly follow y = 1 + 2x.

x = [[1, 1], [1, 2], [1, 3]]
y = [3, 5, 7]
theta0 = [, ]
print(batch_gradient_descent(x=x, y=y, theta0=theta0, alpha=, iterator=))
print(stochastic_gradient_descent(x=x, y=y, theta0=theta0, alpha=, iterator=))
print(locally_gradient_descent(x=x, y=y, theta0=theta0, alpha=, iterator=, t=))

The results are as follows (θ0 and θ1). Pretty close to 1 and 2, aren't they?

[[ 1.0748101 1.96709091]]

[[ 1.04498802 1.98500399]]

[[ 0.99706172 2.0013277 ]]
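One way to judge "close" is to compare against the exact least-squares solution. The sketch below uses `np.linalg.lstsq` on the same x, with labels assumed to be `[3, 5, 7]` from y = 1 + 2x, and prints the largest absolute deviation of each reported result from it. The `reported` array simply copies the three output rows above.

```python
import numpy as np

x = np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
y = np.array([3.0, 5.0, 7.0])  # assumed labels: y = 1 + 2x at x = 1, 2, 3

# Exact least-squares solution for comparison.
theta_exact, *_ = np.linalg.lstsq(x, y, rcond=None)

# The three results printed above: batch, stochastic, locally weighted.
reported = np.array([
    [1.0748101, 1.96709091],
    [1.04498802, 1.98500399],
    [0.99706172, 2.0013277],
])

max_error = np.abs(reported - theta_exact).max(axis=1)
print(theta_exact)
print(max_error)
```

The remaining gaps reflect the learning rate and iteration count used in each run, not a limit of the methods; more iterations would normally shrink them.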
