CS231n Course Assignment 1 (KNN)

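Preface:


This series follows the Python programming assignments of Stanford's CS231n course and uses them to work through the main course content and some of the mathematical derivations.
The accompanying notes are based on the Zhihu post "CS231n官方笔记授权翻译总集篇发布" (the authorized Chinese translation of the official CS231n notes).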

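Introduction to the k-Nearest Neighbor classifier:


k-Nearest Neighbor, KNN for short, classifies a test sample by computing its Euclidean distance to every sample in the labeled training set, taking the K training samples with the smallest distances, and assigning the label that occurs most often among those K. A larger k makes the classification smoother and makes the classifier more robust to outliers.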
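To make the effect of k concrete, here is a minimal sketch on made-up 2D points. The helper `knn_predict` and the toy data are assumptions for illustration only and are not part of the assignment code. One mislabeled point sits inside the class-0 cluster: with k=1 the query follows that outlier, while with k=5 the outlier is outvoted.

```python
import numpy as np

def knn_predict(train_x, train_y, query, k):
    """Label `query` by majority vote among its k nearest training points."""
    dists = np.sqrt(np.sum((train_x - query) ** 2, axis=1))  # Euclidean distances
    nearest = np.argsort(dists)[:k]                          # indices of the k closest
    return int(np.argmax(np.bincount(train_y[nearest])))     # most frequent label

# Hypothetical toy data: class 0 clusters near the origin, class 1 near (5, 5),
# plus one outlier labeled 1 that sits inside the class-0 cluster.
train_x = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0],
                    [5.0, 5.0], [5.0, 6.0], [6.0, 5.0],
                    [0.5, 0.4]])
train_y = np.array([0, 0, 0, 0, 1, 1, 1, 1])

query = np.array([0.5, 0.5])
pred_k1 = knn_predict(train_x, train_y, query, k=1)  # nearest point is the outlier
pred_k5 = knn_predict(train_x, train_y, query, k=5)  # four class-0 votes beat one
print(pred_k1, pred_k5)
```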

Figure: How KNN works


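Image classification dataset: CIFAR-10. This dataset contains 60,000 small 32x32 images. Each image carries one of 10 class labels. The 60,000 images are split into a training set of 50,000 images and a test set of 10,000 images. The figure below shows 10 random images from each of the 10 classes.

Figure: CIFAR-10 sample images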
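Before any distance can be computed, each image has to become a single vector. The sketch below uses a random stand-in array rather than the real dataset, and assumes the images arrive as a (N, 32, 32, 3) array of RGB values; the actual layout depends on the loader you use.

```python
import numpy as np

# Synthetic stand-in for a small CIFAR-10 batch: 4 RGB images of 32x32 pixels.
# The values are random; real data would come from the dataset files.
rng = np.random.RandomState(0)
images = rng.randint(0, 256, size=(4, 32, 32, 3)).astype(np.float64)

# Flatten each image into one row; -1 lets NumPy infer 32 * 32 * 3 = 3072.
rows = images.reshape(images.shape[0], -1)
print(rows.shape)  # (4, 3072)
```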

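The simplest way to measure how different two images are is to subtract them pixel by pixel and sum the squared differences; taking the square root of that sum gives the Euclidean (L2) distance. If the amplifying effect of squaring is not wanted, the absolute differences can be summed directly instead (the L1 distance). In other words, flatten the two images into two vectors I1 and I2 and compute the distance d between them, for example d(I1, I2) = sum over all pixels p of |I1[p] - I2[p]|: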

The process looks like this:
Figure: Computing the difference between two images
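The two distances described above can be sketched as follows. The function names and the tiny 2x2 "images" are made up for illustration; the numbers are chosen so the results are easy to check by hand.

```python
import numpy as np

def l1_distance(a, b):
    """Sum of absolute pixel differences (L1 / Manhattan distance)."""
    return np.sum(np.abs(a - b))

def l2_distance(a, b):
    """Square root of the sum of squared pixel differences (L2 / Euclidean)."""
    return np.sqrt(np.sum((a - b) ** 2))

# Two hypothetical 2x2 single-channel images, flattened to vectors.
img1 = np.array([[10.0, 20.0], [30.0, 40.0]]).ravel()
img2 = np.array([[13.0, 16.0], [30.0, 40.0]]).ravel()

d1 = l1_distance(img1, img2)  # |-3| + |4| + 0 + 0 = 7
d2 = l2_distance(img1, img2)  # sqrt(9 + 16) = 5
print(d1, d2)
```

Squaring weights the larger difference more heavily, which is why the two metrics can rank neighbors differently.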

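Comparing the test image against the known data gives the distance from the current test image to every training image. This is hard to visualize in high dimensions, so picture each image as a point im(x, y) in a 2D plane. We then find the K training images closest to the test image; the label that appears most often among them becomes the label of the test image.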

KNN.jpg

Python implementation


k_nearest_neighbor.py:

#coding: utf-8
import numpy as np

class KNearestNeighbor(object):
  def __init__(self):
    pass

  def train(self, X, y):
    """
    Train the classifier. For k-nearest neighbors this is just 
    memorizing the training data.

    Inputs:
    - X: A numpy array of shape (num_train, D) containing the training data
      consisting of num_train samples each of dimension D.
    - y: A numpy array of shape (num_train,) containing the training labels,
         where y[i] is the label for X[i].
    """
    self.X_train = X
    self.y_train = y
    
  def predict(self, X, k=1, num_loops=0):
    """
    Predict labels for test data using this classifier.

    Inputs:
    - X: A numpy array of shape (num_test, D) containing test data consisting
         of num_test samples each of dimension D.
    - k: The number of nearest neighbors that vote for the predicted labels.
    - num_loops: Determines which implementation to use to compute distances
      between training points and testing points.

    Returns:
    - y: A numpy array of shape (num_test,) containing predicted labels for the
      test data, where y[i] is the predicted label for the test point X[i].  
    """
    if num_loops == 0:
      dists = self.compute_distances_no_loops(X)
    elif num_loops == 1:
      dists = self.compute_distances_one_loop(X)
    elif num_loops == 2:
      dists = self.compute_distances_two_loops(X)
    else:
      raise ValueError('Invalid value %d for num_loops' % num_loops)

    return self.predict_labels(dists, k=k)

  def compute_distances_two_loops(self, X):
    """
    Compute the distance between each test point in X and each training point
    in self.X_train using a nested loop over both the training data and the 
    test data.

    Inputs:
    - X: A numpy array of shape (num_test, D) containing test data.

    Returns:
    - dists: A numpy array of shape (num_test, num_train) where dists[i, j]
      is the Euclidean distance between the ith test point and the jth training
      point.
    """
    num_test = X.shape[0]
    num_train = self.X_train.shape[0]
    dists = np.zeros((num_test, num_train))
    for i in range(num_test):
      for j in range(num_train):
        train = self.X_train[j, :]
        test = X[i, :]
        # Euclidean (L2) distance between test point i and training point j.
        dists[i, j] = np.sqrt(np.sum((test - train) ** 2))
    return dists

  def compute_distances_one_loop(self, X):
    """
    Compute the distance between each test point in X and each training point
    in self.X_train using a single loop over the test data.

    Input / Output: Same as compute_distances_two_loops
    """
    num_test = X.shape[0]
    num_train = self.X_train.shape[0]
    dists = np.zeros((num_test, num_train))
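    for i in range(num_test):
      # Broadcasting subtracts every training row from test point i at once.
      diff = X[i, :] - self.X_train
      # Sum over the feature axis so each training point gets its own distance.
      dists[i, :] = np.sqrt(np.sum(diff ** 2, axis=1))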
    return dists

  def compute_distances_no_loops(self, X):
    """
    Compute the distance between each test point in X and each training point
    in self.X_train using no explicit loops.

    Input / Output: Same as compute_distances_two_loops
    """
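    # Expand ||x - y||^2 = ||x||^2 - 2 x.y + ||y||^2 and rely on broadcasting.
    cross = np.dot(X, self.X_train.T)                    # (num_test, num_train)
    test_sq = np.sum(np.square(X), axis=1)               # (num_test,)
    train_sq = np.sum(np.square(self.X_train), axis=1)   # (num_train,)
    sq_dists = test_sq[:, np.newaxis] - 2 * cross + train_sq
    # Rounding can push tiny values below zero; clip before the square root.
    dists = np.sqrt(np.maximum(sq_dists, 0))
    return dists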

  def predict_labels(self, dists, k=1):
    """
    Given a matrix of distances between test points and training points,
    predict a label for each test point.

    Inputs:
    - dists: A numpy array of shape (num_test, num_train) where dists[i, j]
      gives the distance between the ith test point and the jth training point.

    Returns:
    - y: A numpy array of shape (num_test,) containing predicted labels for the
      test data, where y[i] is the predicted label for the test point X[i].  
    """
    num_test = dists.shape[0]
    y_pred = np.zeros(num_test)
    for i in range(num_test):
      # Indices of the training points sorted by distance to test point i.
      order = np.argsort(dists[i, :])
      # Labels of the k nearest neighbors of the ith test point.
      closest_y = self.y_train[order[:k]]
      # Majority vote; np.argmax breaks ties in favor of the smaller label.
      # Assumes the labels are non-negative integers, as in CIFAR-10.
      y_pred[i] = np.argmax(np.bincount(closest_y))
    return y_pred
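
The loop-free distance computation relies on the identity ||x - y||^2 = ||x||^2 - 2 x.y + ||y||^2. The following standalone sketch, on small random arrays whose names and sizes are assumptions for illustration, checks that this expansion matches an explicit double loop.

```python
import numpy as np

rng = np.random.RandomState(0)
X_test = rng.randn(5, 8)    # 5 hypothetical test points with 8 features each
X_train = rng.randn(7, 8)   # 7 hypothetical training points

# Reference result: explicit double loop.
ref = np.zeros((5, 7))
for i in range(5):
    for j in range(7):
        ref[i, j] = np.sqrt(np.sum((X_test[i] - X_train[j]) ** 2))

# Vectorized result: ||x||^2 - 2 x.y + ||y||^2 via broadcasting.
cross = X_test.dot(X_train.T)                            # (5, 7)
test_sq = np.sum(X_test ** 2, axis=1)[:, np.newaxis]     # (5, 1)
train_sq = np.sum(X_train ** 2, axis=1)[np.newaxis, :]   # (1, 7)
sq = np.maximum(test_sq - 2.0 * cross + train_sq, 0.0)   # clip rounding noise
fast = np.sqrt(sq)

max_diff = np.max(np.abs(ref - fast))
print(max_diff)
```

The difference should be at the level of floating-point rounding. Running the same kind of comparison against the one-loop version is a cheap way to catch a missing `axis` argument.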

Detailed test script:


TryKNN.py:

# coding:utf-8

import random
import numpy as np
from data_utils import load_CIFAR10
import matplotlib.pyplot as plt
plt.rcParams['figure.figsize'] = (10.0, 8.0) # set default size of plots
plt.rcParams['image.interpolation'] = 'nearest'
plt.rcParams['image.cmap'] = 'gray'

cifar10_dir = 'datasets/cifar-10-batches-py'#data_path
X_train, y_train, X_test, y_test = load_CIFAR10(cifar10_dir)
print('Training data shape: %s' % (X_train.shape,))
print('Training labels shape: %s' % (y_train.shape,))
print('Test data shape: %s' % (X_test.shape,))
print('Test labels shape: %s' % (y_test.shape,))

num_training = 5000  # number of training samples to keep
mask = list(range(num_training))  # indices of the first num_training samples
X_train = X_train[mask]
y_train = y_train[mask]

num_test = 500  # number of test samples to keep
mask = list(range(num_test))
X_test = X_test[mask]
y_test = y_test[mask]

# Reshape the image data into rows
X_train = np.reshape(X_train, (X_train.shape[0], -1))  # -1 lets NumPy infer the row length
X_test = np.reshape(X_test, (X_test.shape[0], -1))
print('%s %s' % (X_train.shape, X_test.shape))

from classifiers import KNearestNeighbor  # the class listed above

classifier = KNearestNeighbor()
#classifier.train(X_train, y_train)  # data and labels
#dists = classifier.compute_distances_no_loops(X_test)
#print(dists.shape)

# classify the test data and compare with the true labels
#y_test_pred = classifier.predict_labels(dists, k=7)
#num_correct = np.sum(y_test_pred == y_test)
#accuracy = float(num_correct) / num_test
#print('Got %d / %d correct => accuracy: %f' % (num_correct, num_test, accuracy))

# compare the running time of the different implementations
def time_function(f, *args):
  """
  Call a function f with args and return the time (in seconds) that it took to execute.
  """
  import time
  tic = time.time()
  f(*args)
  toc = time.time()
  return toc - tic

#two_loop_time = time_function(classifier.compute_distances_two_loops, X_test)
#print('Two loop version took %f seconds' % two_loop_time)

#one_loop_time = time_function(classifier.compute_distances_one_loop, X_test)
#print('One loop version took %f seconds' % one_loop_time)
# the fully vectorized version is the fastest of the three
#no_loop_time = time_function(classifier.compute_distances_no_loops, X_test)
#print('No loop version took %f seconds' % no_loop_time)


num_folds = 5
k_choices = [1, 3, 5, 8, 10, 12, 15, 20, 50, 100]

# Split the training data and labels into num_folds folds.
X_train_folds = np.array_split(X_train, num_folds)
y_train_folds = np.array_split(y_train, num_folds)

k_to_accuracies = {}
for k in k_choices:
    k_to_accuracies[k] = []

for k in k_choices:#find the best k-value
    for i in range(num_folds):
        X_train_cv = np.vstack(X_train_folds[:i]+X_train_folds[i+1:])
        X_test_cv = X_train_folds[i]

        y_train_cv = np.hstack(y_train_folds[:i]+y_train_folds[i+1:])  #size:4000
        y_test_cv = y_train_folds[i]

        classifier.train(X_train_cv, y_train_cv)
        dists_cv = classifier.compute_distances_no_loops(X_test_cv)
    
        y_test_pred = classifier.predict_labels(dists_cv, k)
        num_correct = np.sum(y_test_pred == y_test_cv)
        accuracy = float(num_correct) / y_test_cv.shape[0]

        k_to_accuracies[k].append(accuracy)
for k in sorted(k_to_accuracies):
    for accuracy in k_to_accuracies[k]:
        print('k = %d, accuracy = %f' % (k, accuracy))

# plot the raw observations
for k in k_choices:
  accuracies = k_to_accuracies[k]
  plt.scatter([k] * len(accuracies), accuracies)

# plot the trend line with error bars that correspond to standard deviation
accuracies_mean = np.array([np.mean(v) for k,v in sorted(k_to_accuracies.items())])
accuracies_std = np.array([np.std(v) for k,v in sorted(k_to_accuracies.items())])
plt.errorbar(k_choices, accuracies_mean, yerr=accuracies_std)
plt.title('Cross-validation on k')
plt.xlabel('k')
plt.ylabel('Cross-validation accuracy')
plt.show()

# Based on the cross-validation results above, choose the best value for k,   
# retrain the classifier using all the training data, and test it on the test
# data. You should be able to get above 28% accuracy on the test data.
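# Pick the k with the highest mean cross-validation accuracy
# (k_choices is in ascending order, matching accuracies_mean).
best_k = k_choices[int(np.argmax(accuracies_mean))]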

classifier = KNearestNeighbor()
classifier.train(X_train, y_train)
y_test_pred = classifier.predict(X_test, k=best_k)

# Compute and display the accuracy
num_correct = np.sum(y_test_pred == y_test)
accuracy = float(num_correct) / num_test
print('Got %d / %d correct => accuracy: %f' % (num_correct, num_test, accuracy))

Figure: Effect of k on accuracy
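
The fold handling in the cross-validation loop is easy to get wrong, so here is a minimal sketch of just that step on a tiny made-up dataset. The arrays `X` and `y` are hypothetical stand-ins for the real training data. Each fold takes one turn as the validation set while the remaining folds are stacked into the training set.

```python
import numpy as np

X = np.arange(20).reshape(10, 2)   # 10 hypothetical samples with 2 features
y = np.arange(10) % 2              # hypothetical labels

num_folds = 5
X_folds = np.array_split(X, num_folds)   # list of 5 arrays, 2 rows each
y_folds = np.array_split(y, num_folds)

fold_sizes = []
for i in range(num_folds):
    # Fold i is held out for validation.
    X_val, y_val = X_folds[i], y_folds[i]
    # All other folds are concatenated to form the training split.
    X_tr = np.vstack(X_folds[:i] + X_folds[i + 1:])
    y_tr = np.hstack(y_folds[:i] + y_folds[i + 1:])
    fold_sizes.append((X_tr.shape[0], X_val.shape[0]))

print(fold_sizes)
```

With 5,000 training samples and 5 folds, the same logic gives 4,000 training rows and 1,000 validation rows per round.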

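Strengths and weaknesses of the KNN classifier:


First, the Nearest Neighbor classifier is easy to understand and simple to implement. Second, training takes no time, because training only stores the training data.
Prediction, however, is expensive, because every test image has to be compared against all stored training images. This is a clear drawback.
Overall, the KNN classifier is very cheap to train but very costly at prediction time, which makes it hard to use in practice without some form of feature extraction.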
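
A rough back-of-the-envelope count shows where the prediction cost comes from. The sizes below are the subsample sizes used in the test script (5,000 training and 500 test images); the count of "value operations" is a simplification that ignores constant factors.

```python
num_train = 5000      # training images kept by the test script
num_test = 500        # test images kept by the test script
dim = 32 * 32 * 3     # values per flattened CIFAR-10 image

# Every test image is compared against every training image,
# and each comparison touches all `dim` values.
comparisons = num_test * num_train
value_ops = comparisons * dim

# Size of the full distance matrix stored as float64 (8 bytes per entry).
dists_megabytes = comparisons * 8 / 1e6

print(comparisons)      # 2500000
print(value_ops)        # 7680000000
print(dists_megabytes)  # 20.0
```

The cost grows linearly with the size of the training set, which is the opposite of what is usually wanted from a deployed classifier.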
