Miscellaneous recent notes on data mining

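Laplace smoothing:

The simplest example:
Suppose China's men's football team has a record of 0:5 over its previous 5 matches against South Korea. When predicting the probability that China wins the sixth match, an estimate of 0/5 = 0 is clearly unacceptable. Add-one smoothing adds 1 to the count of every outcome, so with two outcomes (win or not win) the estimate becomes (0+1)/(5+2) = 1/7.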
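The estimate above can be checked with a few lines of Python. This is a minimal sketch of additive smoothing; the function name `laplace_estimate` and the choice of two outcomes (win / not win) are assumptions made for illustration. Standard add-one smoothing adds 1 to every outcome's count, so the denominator grows by the number of outcomes and the smoothed probabilities still sum to 1.

```python
from fractions import Fraction

def laplace_estimate(count, total, num_outcomes, alpha=1):
    """Additive smoothing: (count + alpha) / (total + alpha * num_outcomes)."""
    return Fraction(count + alpha, total + alpha * num_outcomes)

# 0 wins in 5 matches, treating each match as win / not win (2 outcomes).
p_win = laplace_estimate(count=0, total=5, num_outcomes=2)
p_not_win = laplace_estimate(count=5, total=5, num_outcomes=2)
print(p_win, p_not_win)  # 1/7 6/7
```

If draws were counted as a third outcome, `num_outcomes=3` would give 1/8 instead.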

Bayesian networks:

[Figure: Bayesian network]

p(a,b,c) = p(c|a,b)p(b|a)p(a)
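To make the factorization concrete, here is a small sketch with three binary variables. All probability values in the tables are made up for illustration; only the structure p(c|a,b)p(b|a)p(a) comes from the formula above.

```python
from itertools import product

# Hypothetical conditional probability tables for binary variables a, b, c.
p_a = {0: 0.6, 1: 0.4}                      # p(a)
p_b_given_a = {0: {0: 0.7, 1: 0.3},         # p(b | a), indexed as [a][b]
               1: {0: 0.2, 1: 0.8}}
p_c_given_ab = {(0, 0): {0: 0.9, 1: 0.1},   # p(c | a, b), indexed as [(a, b)][c]
                (0, 1): {0: 0.5, 1: 0.5},
                (1, 0): {0: 0.4, 1: 0.6},
                (1, 1): {0: 0.1, 1: 0.9}}

def joint(a, b, c):
    """p(a, b, c) = p(c | a, b) * p(b | a) * p(a)."""
    return p_c_given_ab[(a, b)][c] * p_b_given_a[a][b] * p_a[a]

# A valid factorization must sum to 1 over all assignments.
total = sum(joint(a, b, c) for a, b, c in product((0, 1), repeat=3))
print(round(total, 10))
```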

Markov chains:

Stretch a Bayesian network out into a single line, and assume that the probability of the current node depends only on the node immediately before it.
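Under that assumption the joint probability of a sequence is the probability of the first state times a product of one-step transition probabilities. The sketch below uses a made-up two-state weather chain; the state names and all numbers are hypothetical.

```python
# Hypothetical two-state chain; the numbers are for illustration only.
initial = {"sunny": 0.5, "rainy": 0.5}
transition = {"sunny": {"sunny": 0.8, "rainy": 0.2},
              "rainy": {"sunny": 0.4, "rainy": 0.6}}

def sequence_probability(states):
    """p(x1, ..., xn) = p(x1) * product over t of p(x_t | x_{t-1})."""
    prob = initial[states[0]]
    for prev, cur in zip(states, states[1:]):
        prob *= transition[prev][cur]
    return prob

# 0.5 * 0.8 * 0.2, i.e. about 0.08
p = sequence_probability(["sunny", "sunny", "rainy"])
print(p)
```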

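Time series:

Put simply, a time series is the sequence of values observed at successive points in time, and time series analysis predicts future values from the historical data. One point worth emphasizing: time series analysis is not a regression on time. It mainly studies the patterns of change within the series itself (time series with exogenous variables are not considered here).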
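One way to see the difference from "regression on time" is a first-order autoregression, where each value is regressed on the previous value of the same series rather than on the time index. The AR(1) model is not named in the text; it is used here only as an assumed, minimal example, and `fit_ar1` is a hypothetical helper.

```python
def fit_ar1(series):
    """Least-squares fit of x_t = c + phi * x_{t-1}."""
    prev, cur = series[:-1], series[1:]
    n = len(prev)
    mean_prev = sum(prev) / n
    mean_cur = sum(cur) / n
    cov = sum((p - mean_prev) * (q - mean_cur) for p, q in zip(prev, cur))
    var = sum((p - mean_prev) ** 2 for p in prev)
    phi = cov / var
    c = mean_cur - phi * mean_prev
    return c, phi

# Noise-free series generated by x_t = 1 + 0.5 * x_{t-1}, starting from 10.
series = [10.0]
for _ in range(9):
    series.append(1.0 + 0.5 * series[-1])

c, phi = fit_ar1(series)
forecast = c + phi * series[-1]  # one-step-ahead prediction
print(c, phi, forecast)
```

The regressor is the series' own past, so the fit recovers the rule that generated the data; the time index never appears.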

Decision trees

SLIQ:

introduction:
SLIQ stands for Supervised Learning In Quest, where Quest is the Data Mining project at the IBM Almaden Research Center.
SLIQ is a decision tree classifier that can handle both numeric and categorical attributes.
advantages:
SLIQ uses the novel techniques of pre-sorting, breadth first growth, and MDL-based pruning.
pre-sorting:
SLIQ uses a pre-sorting technique in the tree-growth phase to reduce the cost of evaluating numeric attributes.
MDL-based pruning:
SLIQ also uses a new tree-pruning algorithm based on the Minimum Description Length principle [11]. This algorithm is inexpensive, and results in compact and accurate trees.

Best-first decision trees:

difference:
The only difference is that standard decision tree learning expands nodes in depth-first order, while best-first decision tree learning expands the "best" node first.
The "best" node is the one whose split gives the maximal reduction of impurity.

[Figure: difference between best-first and depth-first expansion]

In this example, considering the fully-expanded best-first decision tree, the benefit of expanding node N2 is greater than the benefit of expanding N3.
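The two expansion orders can be compared on a toy tree. In this sketch the tree shape and the impurity reductions are invented, and the reductions are given up front rather than computed from data; a real learner would compute each node's best split when the node is created.

```python
import heapq

# Hypothetical tree: node -> (impurity reduction of its best split, child nodes).
tree = {
    "N1": (0.40, ["N2", "N3"]),
    "N2": (0.30, ["N4", "N5"]),
    "N3": (0.10, ["N6", "N7"]),
    "N4": (0.05, []),
    "N5": (0.20, []),
    "N6": (0.02, []),
    "N7": (0.01, []),
}

def depth_first_order(root):
    order, stack = [], [root]
    while stack:
        node = stack.pop()
        order.append(node)
        # Push children reversed so the left child is expanded first.
        stack.extend(reversed(tree[node][1]))
    return order

def best_first_order(root):
    # Min-heap on negated reduction, so the largest reduction pops first.
    order, heap = [], [(-tree[root][0], root)]
    while heap:
        _, node = heapq.heappop(heap)
        order.append(node)
        for child in tree[node][1]:
            heapq.heappush(heap, (-tree[child][0], child))
    return order

print(depth_first_order("N1"))  # ['N1', 'N2', 'N4', 'N5', 'N3', 'N6', 'N7']
print(best_first_order("N1"))   # ['N1', 'N2', 'N5', 'N3', 'N4', 'N6', 'N7']
```

Both orders build the same tree when every node is expanded; the order only matters when the number of expansions is limited, because best-first spends the budget on the most useful splits first.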

two splitting criteria to measure impurity:
Gini gain, information gain
Both measure how much a split reduces the impurity of a node: Gini gain uses the Gini index as the impurity measure, and information gain uses entropy.
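Both criteria follow the same pattern: the impurity of the parent node minus the size-weighted impurity of the child nodes. A minimal sketch, with made-up class labels and hypothetical function names:

```python
from collections import Counter
from math import log2

def gini(labels):
    """Gini index: 1 - sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((k / n) ** 2 for k in Counter(labels).values())

def entropy(labels):
    """Shannon entropy in bits."""
    n = len(labels)
    return -sum((k / n) * log2(k / n) for k in Counter(labels).values())

def gain(impurity, parent, children):
    """Parent impurity minus the size-weighted impurity of the children."""
    n = len(parent)
    return impurity(parent) - sum(len(ch) / n * impurity(ch) for ch in children)

parent = ["yes"] * 5 + ["no"] * 5
left = ["yes"] * 4 + ["no"]
right = ["yes"] + ["no"] * 4

gini_gain = gain(gini, parent, [left, right])     # 0.5 - 0.32, about 0.18
info_gain = gain(entropy, parent, [left, right])  # 1 - 0.7219, about 0.278
print(gini_gain, info_gain)
```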

splitting rules:
Choose the split with the maximal reduction of impurity.
For continuous (numeric) attributes, pre-sort the attribute values and then search for the best split point.
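A sketch of the sort-then-scan idea for one numeric attribute, using Gini gain as the criterion. This is a simplified illustration, not SLIQ's actual attribute-list and class-list data structures; the function names and the midpoint threshold convention are assumptions.

```python
from collections import Counter

def gini_from_counts(counts, n):
    return 1.0 - sum((k / n) ** 2 for k in counts.values())

def best_numeric_split(values, labels):
    """Sort once, then scan candidate thresholds between distinct neighbours."""
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    right = Counter(label for _, label in pairs)  # class counts right of the cut
    left = Counter()                              # class counts left of the cut
    parent_impurity = gini_from_counts(right, n)
    best_gain, best_threshold = 0.0, None
    for i in range(n - 1):
        value, label = pairs[i]
        left[label] += 1    # move one instance from right to left
        right[label] -= 1
        next_value = pairs[i + 1][0]
        if value == next_value:
            continue  # no threshold fits between equal values
        n_left, n_right = i + 1, n - i - 1
        weighted = ((n_left / n) * gini_from_counts(left, n_left)
                    + (n_right / n) * gini_from_counts(right, n_right))
        gain = parent_impurity - weighted
        if gain > best_gain:
            best_gain, best_threshold = gain, (value + next_value) / 2
    return best_threshold, best_gain

values = [2.0, 7.0, 3.0, 8.0, 1.0, 9.0]
labels = ["a", "b", "a", "b", "a", "b"]
threshold, split_gain = best_numeric_split(values, labels)
print(threshold, split_gain)  # 5.0 0.5
```

Because the class counts are updated incrementally, each candidate threshold costs constant work per class after the single sort.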

the method of dealing with missing values:
An instance with a missing value is passed down the different branches with different weights.
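The text does not say how the weights are chosen. One common convention, assumed here, is to split the instance's weight in proportion to how many training instances with known values went down each branch. The helper `distribute` and the branch counts are hypothetical.

```python
def distribute(instance_weight, branch_counts):
    """Split an instance's weight across branches in proportion to the
    number of known-value training instances in each branch (an assumption)."""
    total = sum(branch_counts.values())
    return {branch: instance_weight * count / total
            for branch, count in branch_counts.items()}

# 30 known-value instances went left and 10 went right.
weights = distribute(1.0, {"left": 30, "right": 10})
print(weights)  # {'left': 0.75, 'right': 0.25}
```

The fractional weights then replace whole-instance counts when impurity is computed further down the tree.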

pruning methods: pre-pruning and post-pruning:
Pre-pruning stops splitting when a further split cannot improve predictive performance.
Post-pruning grows the tree first and then prunes off branches that do not improve accuracy.

Gradient boosting (GB)

AnyBoost

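Let C be the loss function; C is a function of F.
F is the current combination of weak learners, and ~F is the set of weak learners.
C'(F) denotes the derivative of C at F. We want to find an f in ~F
that maximizes <-C'(F), f>, where <,> denotes the inner product; a large inner product means f is closely aligned with the negative gradient.
When the inner product is no longer positive, we stop iterating.
The inner product, the loss function, and the step size are chosen to suit the specific problem.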
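The loop described above can be sketched generically. In this sketch F is the current weighted combination of weak learners, the inner product is taken as an average over the training points, the cost is the exponential margin cost, and the step size is a fixed constant. All of these are choices made for illustration (the notes leave them problem-specific), and `anyboost`, `stump`, and `exp_cost_grad` are hypothetical names.

```python
import math

def anyboost(xs, ys, weak_learners, cost_grad, step, rounds):
    """Generic sketch: the ensemble is a list of (weight, learner) pairs."""
    ensemble = []
    m = len(xs)

    def F(x):
        return sum(w * f(x) for w, f in ensemble)

    for _ in range(rounds):
        # Negative gradient of the cost at F, evaluated at each training point.
        neg_grad = [-cost_grad(y, F(x)) for x, y in zip(xs, ys)]

        def inner(f):
            # Inner product of the negative gradient with candidate f.
            return sum(g * f(x) for g, x in zip(neg_grad, xs)) / m

        best = max(weak_learners, key=inner)
        if inner(best) <= 0:
            break  # no weak learner points in a descent direction
        ensemble.append((step, best))
    return F

def stump(t, sign):
    """Threshold classifier on a 1-D input, returning +sign or -sign."""
    return lambda x: sign if x > t else -sign

def exp_cost_grad(y, fx):
    """Derivative of exp(-y * F(x)) with respect to F(x)."""
    return -y * math.exp(-y * fx)

xs = [1, 2, 3, 4, 5, 6]
ys = [-1, -1, -1, 1, 1, 1]
weak_learners = [stump(t, s) for t in (1.5, 2.5, 3.5, 4.5, 5.5) for s in (1, -1)]
F = anyboost(xs, ys, weak_learners, exp_cost_grad, step=0.5, rounds=10)
print([1 if F(x) > 0 else -1 for x in xs])  # [-1, -1, -1, 1, 1, 1]
```

Swapping `exp_cost_grad` for the derivative of another margin cost changes which boosting variant the loop behaves like, which is the sense in which the framework is generic.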

A gradient descent view of voting methods

Definition of the inner product:

[Figure: inner product 1, where G and F are weak learners]

Formula for the negative gradient:

[Figure: gradient]
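For reference, a sketch of the definitions this section appears to rely on, assuming the usual margin-cost setting with a training set (x_1, y_1), ..., (x_m, y_m), labels y_i in {-1, +1}, and a differentiable margin cost c. These formulas are stated as an assumption about that setting, not as a transcription of the figures.

```latex
% Cost functional on the training sample (assumed margin-cost form)
C(F) = \frac{1}{m} \sum_{i=1}^{m} c\bigl(y_i F(x_i)\bigr)

% Inner product of two combinations F and G of weak learners
\langle F, G \rangle = \frac{1}{m} \sum_{i=1}^{m} F(x_i)\, G(x_i)

% Gradient of C at F, nonzero only at the training points
\nabla C(F)(x) =
\begin{cases}
  \dfrac{1}{m}\, y_i\, c'\bigl(y_i F(x_i)\bigr) & \text{if } x = x_i, \\
  0 & \text{otherwise}
\end{cases}

% Quantity maximized when choosing the next weak learner f
-\langle \nabla C(F), f \rangle
  = -\frac{1}{m^2} \sum_{i=1}^{m} y_i\, f(x_i)\, c'\bigl(y_i F(x_i)\bigr)
```

Since c is typically decreasing in the margin, c' is negative, so the last expression rewards learners f that agree with the labels on the points where the current margin is poor.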
Existing voting methods viewed as AnyBoost on margin cost functions.

No Free Lunch theorem (NFL)

Without assumptions about the actual problem, no algorithm does better than random guessing.
It is hopeless to dream for a learning algorithm which is consistently better than other learning algorithms.

Ensemble methods

Boosting

example:

[Figure: data]

[Figure: steps]

[Figure: base algorithm]

Bagging

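Boosting is a sequential ensemble method.
Bagging is a parallel ensemble method: the base learners are generated in parallel, exploiting their independence.
Bagging: Bootstrap AGGregatING
It uses bootstrap sampling of the training data, i.e. sampling with replacement.
The most common strategies: voting for classification, averaging for regression.
Bagging has a large variance-reduction effect and is very effective for unstable learners (a common example of a stable learner: the k-nearest neighbor classifier).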
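A minimal sketch of bagging for classification: draw bootstrap samples with replacement, train one base learner per sample, and combine by majority vote. The base learner (a 1-D threshold stump), the toy data, and the names `train_stump` and `bagging` are assumptions chosen to keep the example short.

```python
import random
from collections import Counter

def train_stump(sample):
    """Base learner: the 1-D threshold rule with the fewest training errors."""
    best = None
    for t in sorted({x for x, _ in sample}):
        for sign in (1, -1):
            errors = sum((sign if x > t else -sign) != y for x, y in sample)
            if best is None or errors < best[0]:
                best = (errors, t, sign)
    _, t, sign = best
    return lambda x: sign if x > t else -sign

def bagging(data, n_learners, rng):
    learners = []
    for _ in range(n_learners):
        # Bootstrap sample: len(data) draws with replacement.
        sample = [rng.choice(data) for _ in data]
        learners.append(train_stump(sample))

    def predict(x):
        # Majority vote over the base learners.
        votes = Counter(f(x) for f in learners)
        return votes.most_common(1)[0][0]

    return predict

data = [(1, -1), (2, -1), (3, -1), (4, 1), (5, 1), (6, 1)]
predict = bagging(data, n_learners=25, rng=random.Random(0))
print(predict(0), predict(7))
```

Each base learner is trained independently of the others, so the loop over `n_learners` could run in parallel. For regression, the vote would be replaced by an average of the base predictions.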
