
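The Problem of Overfitting

As shown in the figure above, the first model fits the dataset with univariate linear regression, but the fit is poor; we call this underfitting, or high bias. The second fits the dataset with linear regression on a quadratic polynomial, and the fit is about right; we call this "just right". The third fits the dataset with linear regression on a fourth-degree polynomial; it fits the dataset very well, but the curve swings up and down and is hard to use for predicting new data, so we call this overfitting, or high variance.

Logistic regression models show the same behavior, as in the figure below:

By the same analysis as for linear regression, the first model underfits, the second fits best, and the third overfits.

Now let's look at the definition of overfitting:

If the dataset has many features and we fit it with a high-degree polynomial, the hypothesis may appear to fit every example in the dataset very well yet perform poorly on new data. In other words, it generalizes poorly (generalization is the ability of a hypothesis to apply to new examples). We call this overfitting.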

Question:
Consider the medical diagnosis problem of classifying tumors as malignant or benign. If a hypothesis h_θ(x) has overfit the training set, it means that:
A. It makes accurate predictions for examples in the training set and generalizes well to make accurate predictions on new, previously unseen examples.
B. It does not make accurate predictions for examples in the training set, but it does generalize well to make accurate predictions on new, previously unseen examples.
C. It makes accurate predictions for examples in the training set, but it does not generalize well to make accurate predictions on new, previously unseen examples.
D. It does not make accurate predictions for examples in the training set and does not generalize well to make accurate predictions on new, previously unseen examples.

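By the definition of overfitting, C is the correct answer.

We can address overfitting in the following ways:

Reduce the number of features:
- Select features manually
- Use a model selection algorithm to select features automatically
Regularization: keep all the features, but reduce the values of the parameters θ_j

Supplementary Notes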

The Problem of Overfitting

Consider the problem of predicting y from x ∈ R. The leftmost figure below shows the result of fitting y = θ₀ + θ₁x to a dataset. We see that the data doesn't really lie on a straight line, and so the fit is not very good.

Underfitting, or high bias, is when the form of our hypothesis function h maps poorly to the trend of the data. It is usually caused by a function that is too simple or uses too few features. At the other extreme, overfitting, or high variance, is caused by a hypothesis function that fits the available data but does not generalize well to predict new data. It is usually caused by a complicated function that creates a lot of unnecessary curves and angles unrelated to the data.

This terminology is applied to both linear and logistic regression. There are two main options to address the issue of overfitting:

Reduce the number of features:
- Manually select which features to keep.
- Use a model selection algorithm (studied later in the course).
Regularization
- Keep all the features, but reduce the magnitude of parameters θ_j.
- Regularization works well when we have a lot of slightly useful features.

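Cost Function

If the hypothesis is h_θ(x) = θ₀ + θ₁x + θ₂x² + θ₃x³ + θ₄x⁴, it overfits the dataset in the figure below.

Now suppose every feature is important, so we cannot discard any of them. To solve this problem, we use regularization to shrink the values of the parameters θ_j.

To do so, we modify the cost function J(θ) as shown in the figure below:

When gradient descent or another advanced algorithm finds the θ that minimizes J(θ), θ₃ and θ₄ influence predictions on new data less than before. Why?

Because with regularization, minimizing J(θ) drives θ₃ and θ₄ close to 0. The hypothesis is then approximately h_θ(x) = θ₀ + θ₁x + θ₂x².

If a dataset has a great many features and every one of them is important, we can avoid overfitting by changing the cost function J(θ) to: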
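A sketch of this modified cost function, assuming the usual squared-error cost for linear regression with m training examples and n features (the exact form in the original figure is an assumption here):

```latex
J(\theta) = \frac{1}{2m}\left[\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)^2 + \lambda\sum_{j=1}^{n}\theta_j^2\right]
```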

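Here λ is called the regularization parameter, and the method is called regularization.

Note: θ₀ is left out of the penalty.

The regularization parameter λ must also be chosen with care. If it is too large, θ₁, θ₂, θ₃, and θ₄ are all driven close to 0, and the hypothesis is approximately h_θ(x) = θ₀.

The result is the red line in the figure, and the model now underfits.

Supplementary Notes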

Cost Function

If we have overfitting from our hypothesis function, we can reduce the weight that some of the terms in our function carry by increasing their cost.

Say we wanted to make the following function more quadratic:

We'll want to eliminate the influence of θ₃x³ and θ₄x⁴. Without actually getting rid of these features or changing the form of our hypothesis, we can instead modify our cost function:
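As a sketch of the two formulas this passage refers to: the hypothesis is assumed to be the fourth-degree polynomial implied by the terms θ₃x³ and θ₄x⁴, and the penalty weight 1000 is an illustrative assumption rather than a value stated in the text.

```latex
h_\theta(x) = \theta_0 + \theta_1 x + \theta_2 x^2 + \theta_3 x^3 + \theta_4 x^4

\min_\theta \; \frac{1}{2m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)^2 + 1000\,\theta_3^2 + 1000\,\theta_4^2
```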

We've added two extra terms at the end to inflate the cost of θ₃ and θ₄. Now, in order for the cost function to get close to zero, we will have to reduce the values of θ₃ and θ₄ to near zero. This will in turn greatly reduce the values of θ₃x³ and θ₄x⁴ in our hypothesis function. As a result, we see that the new hypothesis (depicted by the pink curve) looks like a quadratic function but fits the data better due to the extra small terms θ₃x³ and θ₄x⁴.

We could also regularize all of our theta parameters in a single summation as:

The λ, or lambda, is the regularization parameter. It determines how much the costs of our theta parameters are inflated.

Using the above cost function with the extra summation, we can smooth the output of our hypothesis function to reduce overfitting. If lambda is chosen to be too large, it may smooth out the function too much and cause underfitting.
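To make the effect of λ concrete, here is a minimal sketch of a regularized cost for linear regression. It assumes the squared-error cost with an L2 penalty on every parameter except θ₀; the function names and the toy data are made up for illustration.

```python
def hypothesis(theta, x):
    """Linear hypothesis h_theta(x) = theta . x, where x[0] is the bias feature 1."""
    return sum(t * xi for t, xi in zip(theta, x))


def regularized_cost(theta, X, y, lam):
    """Squared-error cost plus an L2 penalty on theta[1:]; theta[0] is not penalized."""
    m = len(y)
    squared_error = sum((hypothesis(theta, xi) - yi) ** 2 for xi, yi in zip(X, y))
    penalty = lam * sum(t ** 2 for t in theta[1:])
    return (squared_error + penalty) / (2 * m)


# Toy data (assumed for illustration): each row is [1, x, x^2].
X = [[1.0, 1.0, 1.0], [1.0, 2.0, 4.0], [1.0, 3.0, 9.0]]
y = [2.0, 3.0, 4.0]
theta = [1.0, 1.0, 0.0]  # fits the toy data exactly: y = 1 + x

print(regularized_cost(theta, X, y, lam=0.0))  # 0.0: perfect fit, no penalty
print(regularized_cost(theta, X, y, lam=6.0))  # 1.0: penalty 6 * 1^2 divided by 2m = 6
```

Even with a perfect fit, a nonzero λ makes large parameters costly, which is what pushes the minimizer toward smaller θ_j.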

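Regularized Linear Regression

The regularized cost function J(θ) is:

We now use gradient descent and the normal equation, both covered earlier, to find the θ that minimizes J(θ).

Gradient Descent

Because regularization leaves θ₀ untouched, the gradient descent update is:

For j = 1, 2, 3, ..., the update can be rewritten as:

where 1 - αλ/m < 1 always holds, since α, λ, and m are all positive.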
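The shrink factor can be seen directly in code. The following is a minimal sketch of one regularized gradient descent step, assuming the squared-error cost with an L2 penalty on θ₁ and above; the function names, the learning rate, and the toy data are illustrative assumptions.

```python
def gradient_descent_step(theta, X, y, alpha, lam):
    """One simultaneous update of all parameters; theta[0] is not shrunk."""
    m = len(y)
    errors = [sum(t * xi for t, xi in zip(theta, row)) - yi for row, yi in zip(X, y)]
    new_theta = []
    for j in range(len(theta)):
        grad_j = sum(e * row[j] for e, row in zip(errors, X)) / m
        if j == 0:
            new_theta.append(theta[j] - alpha * grad_j)
        else:
            shrink = 1 - alpha * lam / m  # slightly below 1 for positive alpha and lam
            new_theta.append(theta[j] * shrink - alpha * grad_j)
    return new_theta


def fit(X, y, alpha, lam, iterations):
    """Run a fixed number of steps starting from theta = 0."""
    theta = [0.0] * len(X[0])
    for _ in range(iterations):
        theta = gradient_descent_step(theta, X, y, alpha, lam)
    return theta


# Toy data (assumed): y = 2x, with a bias column of ones.
X = [[1.0, 1.0], [1.0, 2.0], [1.0, 3.0], [1.0, 4.0]]
y = [2.0, 4.0, 6.0, 8.0]
print(fit(X, y, alpha=0.05, lam=0.0, iterations=5000))   # slope near 2
print(fit(X, y, alpha=0.05, lam=10.0, iterations=5000))  # slope shrunk toward 0
```

With λ = 0 the step reduces to ordinary gradient descent; with λ > 0 each step first scales θ_j down and then applies the usual gradient term.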

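Normal Equation

The regularized normal equation is:

where L is an (n+1)×(n+1) matrix.

When the number of examples m is smaller than the number of features n, X^TX is non-invertible (singular). In Octave, pinv() still returns its pseudoinverse, but inv() cannot return a valid inverse.

Note: when m equals n, X^TX may also be non-invertible (singular).

With a regularization parameter λ > 0, X^TX + λL is invertible even when m ≤ n and X^TX itself is not, so inv() can be used.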
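The invertibility claim can be checked on a tiny example. This is a rough sketch in plain Python rather than Octave: it assumes L is the identity matrix with its top-left entry set to 0, and it uses a small hand-written Gaussian elimination in place of inv(). The helper names and the data (two examples, two features plus a bias column) are made up for illustration.

```python
def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular to working precision")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [a - factor * c for a, c in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


def regularized_normal_equation(X, y, lam):
    """Solve (X^T X + lam * L) theta = X^T y, where L is the identity with L[0][0] = 0."""
    cols = len(X[0])
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(cols)] for i in range(cols)]
    Xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(cols)]
    for j in range(1, cols):
        XtX[j][j] += lam  # add lam on the diagonal, skipping the bias entry
    return solve(XtX, Xty)


# Fewer examples (2) than columns (3), so X^T X is singular.
X = [[1.0, 1.0, 2.0], [1.0, 2.0, 4.0]]
y = [3.0, 5.0]
try:
    regularized_normal_equation(X, y, lam=0.0)
except ValueError as err:
    print("lam = 0:", err)
print("lam = 1:", regularized_normal_equation(X, y, lam=1.0))
```

Solving the linear system directly avoids forming an explicit inverse, but the point is the same: the system has no unique solution at λ = 0 and has one once λ > 0.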

Supplementary Notes

Regularized Linear Regression

We can apply regularization to both linear regression and logistic regression. We will approach linear regression first.

Gradient Descent

We will modify our gradient descent function to separate out θ₀ from the rest of the parameters because we do not want to penalize θ₀.
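A sketch of the resulting update rule, assuming the regularized squared-error cost with the penalty applied to θ₁ through θ_n only:

```latex
\begin{aligned}
&\text{Repeat}\ \{ \\
&\quad \theta_0 := \theta_0 - \alpha\,\frac{1}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)x_0^{(i)} \\
&\quad \theta_j := \theta_j - \alpha\left[\frac{1}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)x_j^{(i)} + \frac{\lambda}{m}\theta_j\right], \qquad j \in \{1, 2, \dots, n\} \\
&\}
\end{aligned}
```

Collecting the θ_j terms gives the equivalent form with an explicit shrink factor:

```latex
\theta_j := \theta_j\left(1 - \alpha\frac{\lambda}{m}\right) - \alpha\,\frac{1}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)x_j^{(i)}
```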

Normal Equation

Now let's approach regularization using the alternate method of the non-iterative normal equation.

To add in regularization, the equation is the same as our original, except that we add another term inside the parentheses:
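A sketch of the equation, assuming the standard form in which the added term is λ times a matrix L:

```latex
\theta = \left(X^{T}X + \lambda \cdot L\right)^{-1} X^{T}y,
\qquad
L = \begin{bmatrix} 0 & & & \\ & 1 & & \\ & & \ddots & \\ & & & 1 \end{bmatrix}
```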

L is a matrix with 0 at the top left and 1's down the diagonal, with 0's everywhere else. It should have dimension (n+1)×(n+1). Intuitively, this is the identity matrix (though we are not including x₀), multiplied with a single real number λ.

Recall that if m < n, then X^TX is non-invertible. However, when we add the term λ⋅L, then X^TX + λ⋅L becomes invertible.

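Regularized Logistic Regression

The cost function J(θ) of the regularized logistic regression model is:

Gradient Descent

where h_θ(x) = g(θ^Tx).

Advanced Optimization Algorithms

First, create a costFunction.m file and write the function code in it as shown in the figure below:

Then call fminunc() in Octave, as described in the earlier article Logistic Regression (Part 2); see that article for the details.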
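As a rough Python analogue of such a cost function (not the Octave code from the original figure), the sketch below returns both the regularized cost and its gradient, which is the pair an optimizer like fminunc() expects. It assumes the log-loss cost with an L2 penalty that skips θ₀; the names and the toy data are illustrative.

```python
import math


def sigmoid(z):
    """Logistic function g(z) = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


def cost_function(theta, X, y, lam):
    """Return (cost, gradient) for regularized logistic regression."""
    m = len(y)
    h = [sigmoid(sum(t * xi for t, xi in zip(theta, row))) for row in X]
    log_loss = -sum(
        yi * math.log(hi) + (1 - yi) * math.log(1 - hi) for hi, yi in zip(h, y)
    ) / m
    cost = log_loss + lam / (2 * m) * sum(t ** 2 for t in theta[1:])
    grad = []
    for j in range(len(theta)):
        g = sum((hi - yi) * row[j] for hi, yi, row in zip(h, y, X)) / m
        if j > 0:
            g += lam / m * theta[j]  # the penalty term skips theta[0]
        grad.append(g)
    return cost, grad


# Toy data (assumed): a bias column plus one feature.
X = [[1.0, 0.5], [1.0, -1.5]]
y = [1, 0]
cost, grad = cost_function([0.0, 0.0], X, y, lam=1.0)
print(cost)  # equals log(2) at theta = 0, since every prediction is 0.5
print(grad)
```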

Supplementary Notes

Regularized Logistic Regression

We can regularize logistic regression in a similar way that we regularize linear regression. As a result, we can avoid overfitting. The following image shows how the regularized function, displayed by the pink line, is less likely to overfit than the non-regularized function represented by the blue line:

Cost Function

Recall that our cost function for logistic regression was:

We can regularize this equation by adding a term to the end:
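A sketch of both formulas, assuming the standard log-loss cost for logistic regression. The unregularized cost is:

```latex
J(\theta) = -\frac{1}{m}\sum_{i=1}^{m}\left[y^{(i)}\log\left(h_\theta(x^{(i)})\right) + \left(1 - y^{(i)}\right)\log\left(1 - h_\theta(x^{(i)})\right)\right]
```

and the regularized version adds a penalty on θ₁ through θ_n, leaving θ₀ out:

```latex
J(\theta) = -\frac{1}{m}\sum_{i=1}^{m}\left[y^{(i)}\log\left(h_\theta(x^{(i)})\right) + \left(1 - y^{(i)}\right)\log\left(1 - h_\theta(x^{(i)})\right)\right] + \frac{\lambda}{2m}\sum_{j=1}^{n}\theta_j^2
```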
