斯坦福大学CS231n——assignmentv1之KNN

开篇

前段时间在b站上看斯坦福大学cs231n计算机视觉的公开课，其实这门课并不是一门进阶课程，他只是一门基础的CV课程，如果大家想入门计算机视觉的可以去b站上搜一下这门课，简介中前序课程建议学习cs229和cs131,但本人亲测没学过这两门也关系不大，因为cs231n讲的还是很基础的(我学过cs229，就是吴恩达老师的机器学习，但是没上过cs131,但感觉影响和知识预备关联不是很大，cs231n实在很基础)。虽然课程很基础，但是它的三个大作业确实是备受好评，我之所以会每天再花费一个多小时去学一遍这门较为基础的入门课只是因为去做一遍它的官方作业而已，如果大家有兴趣可以和我一起做cs231n作业1。这是github上的作业要求和一些指导，我们先从作业1开始做起，我现在做到了作业2，未来会在博客上更新整套三个作业的所有代码。我们开始吧。

代码

import numpy as np


class KNearestNeighbor(object):
    """ a kNN classifier with L2 distance """

    def __init__(self):
        pass

    def train(self, X, y):
        """
    Train the classifier. For k-nearest neighbors this is just 
    memorizing the training data.
    Inputs:
    - X: A numpy array of shape (num_train, D) containing the training data
      consisting of num_train samples each of dimension D.
    - y: A numpy array of shape (N,) containing the training labels, where
         y[i] is the label for X[i].
    """
        self.X_train = X
        self.y_train = y

    def predict(self, X, k=1, num_loops=0):
        """
    Predict labels for test data using this classifier.
    Inputs:
    - X: A numpy array of shape (num_test, D) containing test data consisting
         of num_test samples each of dimension D.
    - k: The number of nearest neighbors that vote for the predicted labels.
    - num_loops: Determines which implementation to use to compute distances
      between training points and testing points.
    Returns:
    - y: A numpy array of shape (num_test,) containing predicted labels for the
      test data, where y[i] is the predicted label for the test point X[i].
    """
        # 表示的是计算距离的方法
        if num_loops == 0:
            dists = self.compute_distances_no_loops(X)
        elif num_loops == 1:
            dists = self.compute_distances_one_loop(X)
        elif num_loops == 2:
            dists = self.compute_distances_two_loops(X)
        else:
            raise ValueError('Invalid value %d for num_loops' % num_loops)
        return self.predict_labels(dists, k=k)

    def compute_distances_two_loops(self, X):
        """
    Compute the distance between each test point in X and each training point
    in self.X_train using a nested loop over both the training data and the
    test data.

    Inputs:
    - X: A numpy array of shape (num_test, D) containing test data.
    Returns:
    - dists: A numpy array of shape (num_test, num_train) where dists[i, j]
      is the Euclidean distance between the ith test point and the jth training
      point.
    """
        num_test = X.shape[0]
        num_train = self.X_train.shape[0]
        # 这是一个距离矩阵
        dists = np.zeros((num_test, num_train))
        for i in range(num_test):
            for j in range(num_train):
                #####################################################################
                # TODO:                                                             #
                # Compute the l2 distance between the ith test point and the jth    #
                # training point, and store the result in dists[i, j]. You should   #
                # not use a loop over dimension.                                    #
                #####################################################################
                dists[i][j] = np.linalg.norm((X[i] - self.X_train[j]), ord=2)
                #####################################################################
                #                       END OF YOUR CODE                            #
                #####################################################################
        return dists

    def compute_distances_one_loop(self, X):
        """
    Compute the distance between each test point in X and each training point
    in self.X_train using a single loop over the test data.
    Input / Output: Same as compute_distances_two_loops
    """
        num_test = X.shape[0]
        num_train = self.X_train.shape[0]
        dists = np.zeros((num_test, num_train))
        for i in range(num_test):
            # Compute the l2 distance between the ith test point and all training 
            # points, and store the result in dists[i, :].                        
            # 有点儿向量化的意思
            dists[i] = np.linalg.norm(X[i] - self.X_train, ord=2, axis=1)
        return dists
    def compute_distances_no_loops(self, X):
        """
    Compute the distance between each test point in X and each training point
    in self.X_train using no explicit loops.
    Input / Output: Same as compute_distances_two_loops
    """
        num_test = X.shape[0]
        num_train = self.X_train.shape[0]
        dists = np.zeros((num_test, num_train))
        #########################################################################
        # TODO:                                                                 #
        # Compute the l2 distance between all test points and all training      #
        # points without using any explicit loops, and store the result in      #
        # dists.                                                                #
        #                                                                       #
        # You should implement this function using only basic array operations; #
        # in particular you should not use functions from scipy.                #
        #                                                                       #
        # HINT: Try to formulate the l2 distance using matrix multiplication    #
        #       and two broadcast sums.                                         #
        #########################################################################
        # 广播特性进行相加
        # 由于我们的距离数组的行数是num_test,列数是num_train
        # 所以我们应该让X_train的维度是(1,num_train)
        # 而test的维度是(num_test,1)
        dists += np.sum(self.X_train ** 2, axis=1).reshape(1, num_train)
        dists += np.sum(X ** 2, axis=1).reshape(num_test, 1)  # reshape for broadcasting
        dists -= 2 * np.dot(X, self.X_train.T)
        dists = np.sqrt(dists)

        #########################################################################
        #                         END OF YOUR CODE                              #
        #########################################################################
        return dists
    def predict_labels(self, dists, k=1):
        """
    Given a matrix of distances between test points and training points,
    predict a label for each test point.
    Inputs:
    - dists: A numpy array of shape (num_test, num_train) where dists[i, j]
      gives the distance betwen the ith test point and the jth training point.
    Returns:
    - y: A numpy array of shape (num_test,) containing predicted labels for the
      test data, where y[i] is the predicted label for the test point X[i].
    """
        num_test = dists.shape[0]
        y_pred = np.zeros(num_test)
        for i in range(num_test):
            # A list of length k storing the labels of the k nearest neighbors to
            # the ith test point.
            closest_y = []
            #########################################################################
            # TODO:                                                                 #
            # Use the distance matrix to find the k nearest neighbors of the ith    #
            # testing point, and use self.y_train to find the labels of these       #
            # neighbors. Store these labels in closest_y.                           #
            # Hint: Look up the function numpy.argsort.                             #
            #########################################################################

            # 将距离从小到大进行排列,然后取前k的索引放入列表
            closest_y = self.y_train[np.argsort(dists[i])[: k]]

            #########################################################################
            # TODO:                                                                 #
            # Now that you have found the labels of the k nearest neighbors, you    #
            # need to find the most common label in the list closest_y of labels.   #
            # Store this label in y_pred[i]. Break ties by choosing the smaller     #
            # label.                                                                #
            #########################################################################
            # y_pred[i]是closest_y中出现最多的数字,即我们要找的那个类别
            y_pred[i] = np.argmax(np.bincount(closest_y))
            #########################################################################
            #                           END OF YOUR CODE                            #
            #########################################################################
        return y_pred

这里我们的主要代码就是这三个计算距离的函数，我们采取的计算公式都是欧氏距离，但是具体细节不同，一种是直接计算两个向量之间的距离，另一种是取测试集中的每一个样本，计算与训练集向量之间的距离；第三种是一个二维矩阵，分别计算测试集中每一个样本与训练集中每个样本的距离并填入矩阵中。

然后我们将距离从小到大排序，选取前k个最小距离，然后找出出现做多的类别标签即可。(y_train中存储着每个训练集的正确标签)。

这里的英文提示都是原版作业给出的提示，对我们理解算法以及填充代码很有帮助。建议大家做完作业以后再自己实现一遍，把除了核心算法的细节也弄清楚。

总结

KNN算法很简单所以实现起来很easy，明天我们说linear_classifier线性分类器，一种简单且基础的分类模型。

说明：这一系列博客和我介绍的tensorflow或者pytorch实现的文章是不同，用pytorch实现的模型是利用了框架，而这里是一种从底层开始的造轮子过程，可以看到，我们只引入numpy这个库。两种方法都是有效的，只不过从底层实现可以更好地理解算法。

大家可以先参考下Pytorch实现线性回归

斯坦福大学CS231n——assignmentv1之KNN

开篇

最近邻KNN算法

代码

总结

继续阅读

斯坦福大学CS231n——assignmentv1——softmax

CS231N-Lecture5 Training Neural Network

【cs231n】RNN LSTM

CS231n作业笔记2.6：卷积层以及池化层的实现CS231n简介作业笔记

CS231n assignment1 -- knn

CS231n作业笔记2.3：优化算法Momentum, RMSProp, AdamCS231n简介作业笔记

2020 cs231n 作业3 笔记 StyleTransfer-PyTorchStyleTransfer-PyTorc

2020 cs231n 作业2笔记 PyTorchPyTorch

2020 cs231n 作业2笔记 Convolutional NetworksConvolutional Networks

cs231n作业：Assignment2-Pytorch

CS231n:图像分类数据驱动方法K-最近邻算法线性分类器

CS231n knn python 课后作业knn程序predict_lable

cs231n课后作业（1）

cs231n作业——softmaxcs231n作业——softmax

cs231n斯坦福基于卷积神经网络的CV学习笔记（一）KNN和线性分类器/分类器损失/反向传播一，KNN图像分类算法二，线性分类器三，线性分类器损失四，反向传播五，神经网络