
1. Causes of outliers


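2. Outlier detection/removal algorithm

a. Train on all of the data
b. Remove the points with the largest error, typically about 10% of the data
c. Retrain on the reduced data set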
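
The three steps above can be sketched in plain Python. This is only an illustration under stated assumptions: the helper names fit_line and fit_drop_refit and the drop_fraction parameter are hypothetical, the fit is an ordinary least-squares line on a single feature, and "error" is taken to be the absolute residual.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one feature; returns (slope, intercept)."""
    n = float(len(xs))
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fit_drop_refit(xs, ys, drop_fraction=0.1):
    """Fit, drop the points with the largest residuals, then refit."""
    # Step a: train on all of the data.
    slope, intercept = fit_line(xs, ys)

    # Step b: rank points by absolute residual and keep the best-fitting ones.
    def residual(point):
        x, y = point
        return abs(y - (slope * x + intercept))

    ranked = sorted(zip(xs, ys), key=residual)
    n_keep = int(round(len(ranked) * (1 - drop_fraction)))
    kept = ranked[:n_keep]

    # Step c: retrain on the reduced data set.
    kept_xs = [x for x, _ in kept]
    kept_ys = [y for _, y in kept]
    return fit_line(kept_xs, kept_ys), kept
```

With ten points on a straight line and one of them pushed far off the line, the second fit should recover the slope of the remaining nine points, because the shifted point has the largest residual in the first fit and is the one dropped.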

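3. Mini-project

This project has two parts. In the first part, you run a regression, then identify and remove the 10% of points with the largest residuals. Then, as Sebastian suggests in the course videos, you remove those outliers from the data set and refit the regression.

In the second part, you get familiar with some of the outliers in the Enron financial data and learn whether and how to remove them.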

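Slope of the regression with outliers

Sebastian described an algorithm for improving a regression, and you will implement it in this project. You will use it in the next few quizzes. In short, you fit the regression on all training points, then discard the 10% of points with the largest error between the actual y value and the y value predicted by the regression.

Start by running the starter code (outliers/outlier_removal_regression.py) and visualizing the points. A few outliers should jump out. Deploy a linear regression in which net worth is the target and the feature used to predict it is a person's age (remember to train on the training data!).

The correct slope for the main body of the data points is 6.25 (we know this because we used that value to generate the data). What is the slope of your regression?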

#!/usr/bin/python

import random
import numpy
import matplotlib.pyplot as plt
import pickle

from outlier_cleaner import outlierCleaner


### load up some practice data with outliers in it
ages = pickle.load( open("practice_outliers_ages.pkl", "rb") )
net_worths = pickle.load( open("practice_outliers_net_worths.pkl", "rb") )



### ages and net_worths need to be reshaped into 2D numpy arrays
### second argument of reshape command is a tuple of integers: (n_rows, n_columns)
### by convention, n_rows is the number of data points
### and n_columns is the number of features
ages       = numpy.reshape( numpy.array(ages), (len(ages), 1))
net_worths = numpy.reshape( numpy.array(net_worths), (len(net_worths), 1))
### sklearn.cross_validation was removed in scikit-learn 0.20; use model_selection
from sklearn.model_selection import train_test_split
ages_train, ages_test, net_worths_train, net_worths_test = train_test_split(ages, net_worths, test_size=0.1, random_state=42)

### fill in a regression here!  Name the regression object reg so that
### the plotting code below works, and you can see what your regression looks like


from sklearn.linear_model import LinearRegression
reg = LinearRegression()
reg.fit(ages_train,net_worths_train)
y_pred = reg.predict(ages_test)
### coef_ has shape (1, 1) here because the target array is 2D
print('Slope %.2f' % reg.coef_[0][0])
print('Score %.2f' % reg.score(ages_test, net_worths_test))








try:
    plt.plot(ages, reg.predict(ages), color="blue")
except NameError:
    pass
plt.scatter(ages, net_worths)
plt.show()


### identify and remove the most outlier-y points
cleaned_data = []
try:
    predictions = reg.predict(ages_train)
    cleaned_data = outlierCleaner( predictions, ages_train, net_worths_train )
except NameError:
    print("your regression object doesn't exist, or isn't named reg")
    print("can't make predictions to use in identifying outliers")







### only run this code if cleaned_data is returning data
if len(cleaned_data) > 0:
    ages, net_worths, errors = zip(*cleaned_data)
    ages       = numpy.reshape( numpy.array(ages), (len(ages), 1))
    net_worths = numpy.reshape( numpy.array(net_worths), (len(net_worths), 1))

    ### refit your cleaned data!
    try:
        reg.fit(ages, net_worths)
        plt.plot(ages, reg.predict(ages), color="blue")
    except NameError:
        print("you don't seem to have regression imported/created,")
        print("   or else your regression object isn't named reg")
        print("   either way, only draw the scatter plot of the cleaned data")
    plt.scatter(ages, net_worths)
    plt.xlabel("ages")
    plt.ylabel("net worths")
    plt.show()


else:
    print("outlierCleaner() is returning an empty list, no refitting to be done")


Slope after cleaning

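You will find the skeleton of the outlierCleaner() function in outliers/outlier_cleaner.py; fill it in with the cleaning algorithm. It takes three arguments: predictions is a list of the targets predicted by the regression; ages is a list of the ages in the training set; net_worths is a list of the actual net worth values in the training set. Each list should contain 90 elements (because the training set has 90 points). Your job is to return a list named cleaned_data that contains only 81 elements: the 81 training points whose predicted and actual (net_worths) values have the smallest errors (90 * 0.9 = 81). cleaned_data should be a list of tuples, each of the form (age, net_worth, error).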
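
One possible way to fill in the skeleton is sketched below. It is not the course's reference solution. The assumptions are: the error is the absolute difference between prediction and actual net worth (the description does not say whether it is absolute or squared, and the ranking is the same either way), the keep_fraction parameter is an addition for illustration, and the hypothetical helper _as_float exists only so the function accepts both plain numbers and the one-element rows produced by the reshaped arrays in the starter script.

```python
def _as_float(value):
    """Accept a plain number or a 1-element sequence/array (hypothetical helper)."""
    try:
        return float(value[0])
    except (TypeError, IndexError):
        return float(value)


def outlierCleaner(predictions, ages, net_worths, keep_fraction=0.9):
    """Return the keep_fraction of points with the smallest error.

    Each returned item is a tuple of the form (age, net_worth, error).
    """
    rows = []
    for pred, age, net_worth in zip(predictions, ages, net_worths):
        actual = _as_float(net_worth)
        # Assumption: error is the absolute prediction error.
        error = abs(_as_float(pred) - actual)
        rows.append((_as_float(age), actual, error))

    # Smallest errors first, then keep only the leading 90% by default.
    rows.sort(key=lambda row: row[2])
    n_keep = int(round(len(rows) * keep_fraction))
    cleaned_data = rows[:n_keep]
    return cleaned_data
```

Called with 90 training points and the default keep_fraction, this returns 81 tuples, which the starter script can unpack with zip(*cleaned_data) before refitting.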

Once the cleaning function is working, you should see the regression result change. What is the new slope? Is it closer to the "correct" result of 6.25?


Udacity - Outliers
