Here is the data.
There are two groups of people, A and B.
The four rows below are the case counts for four different diseases.

[Image: the data table]

# First, build a data frame
dat <- data.frame(low=c(13,7,21,6),
                  high=c(77,22,21,71))
# Group A has 66 samples in total and group B has 128
total_no <- c(66,128)

# Build a 2 x 2 table from the first row of dat
#  low high
#  13   77
#  53   51
tmp <- chisq.test(rbind(dat[1,], total_no-dat[1,]))
# Extract the chi-square statistic and the p-value
tmp$statistic
tmp$p.value

[Screenshot: chi-square statistic and p-value for the first row]

# The other three rows could be computed by hand, but I wanted to try a loop
# First create two empty vectors
k <- rep(NA, 4)
p <- rep(NA, 4)
# Then run the loop
for (i in 1:4) {
  a <- chisq.test(rbind(dat[i,], total_no-dat[i,]))
  k[i] <- a$statistic
  p[i] <- a$p.value
}

results <- rbind(k,p)
results
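
The same per-row calculation can also be sketched without an explicit loop. The helper name `row_test` and its `totals` argument are made up for illustration and are not part of the original code; the sketch assumes, as above, that `low` and `high` are counted against totals of 66 and 128.

```r
dat <- data.frame(low = c(13, 7, 21, 6),
                  high = c(77, 22, 21, 71))
total_no <- c(66, 128)

# Hypothetical helper: test one row of case counts against the group totals
# and return the chi-square statistic and the p-value.
row_test <- function(cases, totals) {
  tab <- rbind(cases, totals - cases)  # 2 x 2 table: cases vs. non-cases
  res <- chisq.test(tab)               # Yates correction is on by default
  c(chisq = unname(res$statistic), p = res$p.value)
}

# Apply the helper to every row; the result is a 2 x 4 matrix
# laid out like rbind(k, p) from the loop.
results <- apply(as.matrix(dat), 1, row_test, totals = total_no)
results
```

Both versions call `chisq.test()` with its default settings, so they should give the same numbers; the `apply()` form just avoids pre-allocating `k` and `p`.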

This gives the final results:

[Screenshot: chi-square statistics and p-values for all four rows]

But the story doesn't end there.
The results from SPSS differ from the results from R.

The chi-square value that R produced is:

[Screenshot: R output]

Why?

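Looking for the cause

Was the data entered incorrectly in R?

So I re-entered the data to mimic the SPSS layout,
using the t() function to transpose the table.

[Screenshot: data layout in SPSS]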

dat <- data.frame(low=c(13,7,21,6),
                  high=c(77,22,21,71))
total_no <- c(66,128)

# Add the t() transpose at this step
tmp <- chisq.test(t(rbind(dat[1,], total_no-dat[1,])))
tmp$statistic
tmp$p.value

But the result is still the same:

[Screenshot: R output after transposing]

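Do R and SPSS use different parameters?

Reading R's help page for chisq.test turned up a clue:

[Screenshot: R help page for chisq.test]

It turns out that something called the Yates correction was behind the difference (mostly my statistics knowledge is too weak).
Running R again:

[Screenshot: R output without the Yates correction]

Bingo! The chi-square value now matches the one from SPSS.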
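
The rerun most likely switched the correction off through the `correct` argument of `chisq.test()`. That is an assumption based on the help page, not something the code above shows. A minimal sketch for the first row, with the 2 x 2 table typed in directly:

```r
# First row of the data: 13 and 77 cases out of 66 and 128 samples
cases <- c(13, 77)
total_no <- c(66, 128)
tab <- rbind(cases, total_no - cases)

# Default behaviour: continuity (Yates) correction applied to a 2 x 2 table
corrected <- chisq.test(tab)

# Same test with the correction switched off
uncorrected <- chisq.test(tab, correct = FALSE)

corrected$statistic
uncorrected$statistic
```

The uncorrected statistic is the larger of the two, since the correction shrinks each |observed - expected| difference before squaring.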

What is the Yates correction?

The following is quoted from:
https://www.statisticshowto.datasciencecentral.com/what-is-the-yates-correction/

Why use the Yates correction?

The Yates correction is a correction made to account for the fact that both Pearson’s chi-square test and McNemar’s chi-square test are biased upwards for a 2 x 2 contingency table. An upwards bias tends to make results larger than they should be. If you are creating a 2 x 2 contingency table that uses either of these two tests, the Yates correction is usually recommended, especially if the expected cell frequencies are below 10 (some authors put that figure at 5).

Chi2 tests are biased upwards when used on 2 x 2 contingency tables. The reason is that the statistical Chi2 distribution is continuous and the 2 x 2 contingency table is dichotomous (in other words, it isn’t continuous, there are two variables). All you really need to know is that if your expected cell frequencies are below 10, you probably should be using the Yates correction.

R's chisq.test() applies the Yates correction by default for 2 x 2 tables (correct = TRUE), which is what led to the story above.
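
For reference, the two statistics in their standard textbook form, where O_i are the observed counts and E_i the expected counts in the four cells. The arithmetic for the first row (13, 77, 53, 51) is worked by hand here rather than taken from the screenshots:

```latex
\chi^2 = \sum_{i=1}^{4} \frac{(O_i - E_i)^2}{E_i},
\qquad
\chi^2_{\text{Yates}} = \sum_{i=1}^{4} \frac{\left(|O_i - E_i| - 0.5\right)^2}{E_i}

% First row: N = 194, ad - bc = 13 \cdot 51 - 77 \cdot 53 = -3418
\chi^2 = \frac{194 \cdot 3418^2}{90 \cdot 104 \cdot 66 \cdot 128} \approx 28.66,
\qquad
\chi^2_{\text{Yates}} = \frac{194 \cdot (3418 - 97)^2}{90 \cdot 104 \cdot 66 \cdot 128} \approx 27.06
```

The second line uses the 2 x 2 shortcut formula, in which subtracting 0.5 from every |O_i - E_i| is equivalent to subtracting N/2 = 97 from |ad - bc|.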
