Multiple Linear Regression
The task for Day 3 is multiple linear regression. Let's get started.

Step1 Data Preprocessing

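First we import numpy, pandas, and matplotlib. We use pandas to read the dataset and sklearn to split it into a training set and a test set, with test_size set to one fifth (0.2). Note that, where necessary, we need to encode categorical features as dummy variables and take care to avoid the dummy variable trap.
The code is as follows: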
# Step1 Data Preprocessing
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
# Import Datasets
dataset = pd.read_csv('../datasets/50_Startups.csv')
X = dataset.iloc[ : , :-1].values
Y = dataset.iloc[ : , 4 ].values
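# Encode the categorical column (index 3) as dummy variables.
# OneHotEncoder(categorical_features=...) was removed in scikit-learn 0.22, so ColumnTransformer is used here.
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder
ct = ColumnTransformer([('onehot', OneHotEncoder(), [3])], remainder = 'passthrough', sparse_threshold = 0)
X = ct.fit_transform(X).astype(float)
# Avoid the dummy variable trap: the dummy columns come first in the output, so drop the first one
X = X[: , 1:]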
from sklearn.model_selection import train_test_split
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size = 0.2, random_state = 0)
print("X_train")
print(X_train)
print("X_test")
print(X_test)
print("Y_train")
print(Y_train)
print("Y_test")
print(Y_test)
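To illustrate why one dummy column is dropped, here is a minimal sketch using a small made-up categorical column (the values are hypothetical and not read from 50_Startups.csv). It assumes the model also fits an intercept, which is the default for LinearRegression. With every dummy column kept, the columns sum to the intercept column, so the design matrix loses rank. Dropping one dummy column removes that redundancy.

```python
import numpy as np

# Hypothetical toy column, not taken from 50_Startups.csv
states = np.array(["New York", "California", "Florida",
                   "California", "New York", "Florida"])

# Full one-hot encoding: one 0/1 column per category
categories, codes = np.unique(states, return_inverse=True)
full_dummies = np.eye(len(categories))[codes]
# Drop the first dummy column to avoid the dummy variable trap
reduced_dummies = full_dummies[:, 1:]

intercept = np.ones((len(states), 1))
full_design = np.hstack([intercept, full_dummies])        # intercept + 3 dummies
reduced_design = np.hstack([intercept, reduced_dummies])  # intercept + 2 dummies

# The full dummy columns sum to the intercept column, so one column is redundant
full_rank = np.linalg.matrix_rank(full_design)
reduced_rank = np.linalg.matrix_rank(reduced_design)
print("full:", full_design.shape[1], "columns, rank", full_rank)
print("reduced:", reduced_design.shape[1], "columns, rank", reduced_rank)
```

In the full encoding the rank is lower than the number of columns, while in the reduced encoding the two match.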
Step2 Train the model using multiple linear regression

We use linear regression to fit a model on our training set.
The code is as follows:
# Step2 train by using multiple linear regression
from sklearn.linear_model import LinearRegression
regressor = LinearRegression()
regressor.fit(X_train, Y_train)
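After fitting, the learned parameters can be read from the intercept_ and coef_ attributes of the regressor. As a rough sketch of what the fit computes, the example below uses synthetic, noise-free data (X_demo, y_demo, and true_coef are made up for illustration) and compares LinearRegression with an ordinary least squares solve on a design matrix that has an explicit intercept column.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic data for illustration only: y = 4 + 2*x1 - 1*x2 + 0.5*x3
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(50, 3))
true_coef = np.array([2.0, -1.0, 0.5])
y_demo = 4.0 + X_demo @ true_coef

# Fit with scikit-learn
model = LinearRegression().fit(X_demo, y_demo)

# Same problem solved as ordinary least squares with an explicit intercept column
design = np.hstack([np.ones((X_demo.shape[0], 1)), X_demo])
beta, *_ = np.linalg.lstsq(design, y_demo, rcond=None)

print("sklearn intercept:", model.intercept_, "coefficients:", model.coef_)
print("lstsq   intercept:", beta[0], "coefficients:", beta[1:])
```

Both approaches should recover the same intercept and coefficients on this data, up to floating point error.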
Step3 Prediction Outcome

We can use predict to compute the outputs for the test set, save them in Y_pred, and then print them.
The code is as follows:
#Step 3: Prediction Outcome
Y_pred = regressor.predict(X_test)
print('Y_pred')
print(Y_pred)
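Printing Y_pred alone does not show how close the predictions are to Y_test. One possible way to quantify this is with the mean squared error and the R² score. The sketch below computes both with numpy on small made-up arrays (y_true_demo and y_pred_demo are hypothetical, and the helper names are ours). scikit-learn offers equivalent functions, mean_squared_error and r2_score, in sklearn.metrics.

```python
import numpy as np

def mean_squared_error_np(y_true, y_pred):
    """Average of the squared differences between actual and predicted values."""
    y_true, y_pred = np.asarray(y_true, dtype=float), np.asarray(y_pred, dtype=float)
    return np.mean((y_true - y_pred) ** 2)

def r2_score_np(y_true, y_pred):
    """1 minus the ratio of residual sum of squares to total sum of squares."""
    y_true, y_pred = np.asarray(y_true, dtype=float), np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

# Hypothetical values for illustration only
y_true_demo = np.array([3.0, 5.0, 7.0, 9.0])
y_pred_demo = np.array([2.5, 5.5, 7.0, 9.0])

mse = mean_squared_error_np(y_true_demo, y_pred_demo)
r2 = r2_score_np(y_true_demo, y_pred_demo)
print("MSE:", mse)
print("R^2:", r2)
```

With the variables from this post, the same helpers could be called as mean_squared_error_np(Y_test, Y_pred) and r2_score_np(Y_test, Y_pred).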
Step4 Visualization
As the last step, we use matplotlib to visualize our results. Here we can plot the predictions for X_train and X_test.

Figure: Day3_1.png

Figure: Day3_2.png
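The code is as follows:
#Step 4: Visualization
# X_train and X_test have several feature columns, so plt.plot draws one line per column against the predictions
plt.plot(X_train, regressor.predict(X_train), color = 'blue')
plt.show()
plt.plot(X_test, regressor.predict(X_test), color = 'blue')
plt.show()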
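Because the model has several input features, plotting the features against the predictions can be hard to read. An alternative that may be easier to interpret is a scatter plot of predicted against actual values, where points near the diagonal are good predictions. The sketch below is one way to do this. The helper plot_predicted_vs_actual and the demo arrays are hypothetical and not part of the original code.

```python
import numpy as np
import matplotlib.pyplot as plt

def plot_predicted_vs_actual(y_true, y_pred, title):
    """Scatter predicted against actual values with a y = x reference line."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    fig, ax = plt.subplots()
    ax.scatter(y_true, y_pred, color='blue')
    # Reference line: a perfect prediction would fall exactly on it
    low = min(y_true.min(), y_pred.min())
    high = max(y_true.max(), y_pred.max())
    ax.plot([low, high], [low, high], color='red')
    ax.set_xlabel('Actual')
    ax.set_ylabel('Predicted')
    ax.set_title(title)
    return ax

# Hypothetical values for illustration only
y_true_demo = np.array([3.0, 5.0, 7.0, 9.0])
y_pred_demo = np.array([2.5, 5.5, 7.0, 9.0])
ax = plot_predicted_vs_actual(y_true_demo, y_pred_demo, 'Demo data')
```

With the variables from this post, this could be called as plot_predicted_vs_actual(Y_test, Y_pred, 'Test set'), followed by plt.show() to display the figure.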