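Part 1: Bubble charts
A bubble chart is a variant of the scatter plot. An ordinary scatter plot shows the relationship between two continuous variables, whereas a bubble chart can usually show the relationship among three variables, with the third variable mapped to bubble size. If the bubbles are also given different colors, the chart can show the relationship among four variables.
In practice, bubble charts are often used to present the results of gene enrichment analysis. This post uses the dataset that ships with the R package gapminder and builds bubble charts with ggplot2.
Part 2: Plots and code
After loading the data and applying a simple filter, a basic bubble chart is easy to make: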
# Load the required packages
library(ggplot2)
library(dplyr)
# install.packages("gapminder")
library(gapminder)
# Simple filtering: keep only the rows for year 2007, then drop the "year" column
data <- gapminder %>%
  filter(year == 2007) %>%
  dplyr::select(-year)
# Basic bubble chart
# size = pop maps the pop column to bubble size
bp1 <- ggplot(data, aes(x = gdpPercap, y = lifeExp, size = pop)) +
  geom_point(alpha = 0.7)
bp1  # print the plot
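How a value turns into a bubble size is worth a quick check. The sketch below uses a hypothetical three-row toy data frame (not data from this post) and scale_size_area(), which scales point area in proportion to the value, so the drawn size grows with the square root of the value. The names toy, p_toy, and built_sizes are made up for illustration.

```r
library(ggplot2)

# Hypothetical toy data (not from the post): values 1, 4 and 9
toy <- data.frame(x = 1:3, y = 1:3, value = c(1, 4, 9))

# scale_size_area() maps the value to point area, with 0 mapped to area 0
p_toy <- ggplot(toy, aes(x = x, y = y, size = value)) +
  geom_point(alpha = 0.7) +
  scale_size_area(max_size = 12)

# layer_data() returns the aesthetics ggplot2 computed for the first layer
built_sizes <- layer_data(p_toy)$size

# Sizes are proportional to sqrt(value), so areas are proportional to value
print(built_sizes / built_sizes[1])
```

If a bubble should look twice as large when the value doubles, an area-based scale like this is usually the less misleading choice.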
Building on this, we only need to add "a few" more details to get the following plot:
# Load the required packages
library(ggplot2)
library(dplyr)
library(hrbrthemes)
library(viridis)
library(ggrepel)
# Light preprocessing: scale pop down to millions; sorting the rows by pop
# (bubble size) in descending order draws the large bubbles first, so they
# do not end up on top of the smaller ones
tmp_data <- data %>%
  mutate(pop = pop / 1000000) %>%
  arrange(desc(pop)) %>%
  mutate(country = factor(country, country))
bp2 <- ggplot(tmp_data, aes(x = gdpPercap, y = lifeExp, size = pop, color = continent)) +
  geom_point(alpha = 0.5) +
  scale_size(range = c(1.5, 20), name = "Population (M)") +
  scale_color_viridis(discrete = TRUE) +
  theme_ipsum() +
  # Note: in ggplot2 >= 3.5.0 a numeric legend.position is deprecated; use
  # legend.position = "inside" with legend.position.inside = c(1, 0) instead
  theme(
    legend.position = c(1, 0),
    legend.justification = c(1, 0)) +
  # Label every bubble with its country
  geom_text_repel(data = tmp_data, aes(label = country), size = 3)
In the plot above every bubble is labeled, which does not look great. We can instead label only the bubbles we are interested in. If we change the code
tmp_data <- data %>%
  mutate(pop = pop / 1000000) %>%
  arrange(desc(pop)) %>%
  mutate(country = factor(country, country))
# and
geom_text_repel(data = tmp_data, aes(label = country), size = 3)
to:
# Flag the rows of interest so that only they get a label
# (rows matching none of the conditions get NA and are dropped by the filter below)
tmp_data <- data %>%
  mutate(
    annotation = case_when(
      gdpPercap > 5000 & lifeExp < 60 ~ "yes",
      lifeExp < 30 ~ "yes",
      gdpPercap > 40000 ~ "yes")
  ) %>%
  mutate(pop = pop / 1000000) %>%
  arrange(desc(pop)) %>%
  mutate(country = factor(country, country))
# and
geom_text_repel(data = tmp_data %>% filter(annotation == "yes"), aes(label = country), size = 3)
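we get the following plot:
From here, the bubble sizes, the color scheme, and other details can be adjusted as needed to produce the bubble chart you want.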
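As one possible adjustment, the sketch below shrinks the size range and swaps the viridis colors for a ColorBrewer palette. The range c(1, 12), the "Set2" palette, theme_minimal(), and the names plot_data and bp_custom are arbitrary choices for illustration, not settings from the plots above; it assumes the gapminder package is installed.

```r
library(ggplot2)
library(dplyr)
library(gapminder)

# Same preprocessing idea as before: 2007 only, pop in millions, large bubbles first
plot_data <- gapminder %>%
  filter(year == 2007) %>%
  mutate(pop = pop / 1e6) %>%
  arrange(desc(pop))

bp_custom <- ggplot(plot_data,
                    aes(x = gdpPercap, y = lifeExp, size = pop, color = continent)) +
  geom_point(alpha = 0.6) +
  # Smaller maximum bubble size than the range = c(1.5, 20) used earlier
  scale_size(range = c(1, 12), name = "Population (M)") +
  # A qualitative ColorBrewer palette instead of viridis
  scale_color_brewer(palette = "Set2", name = "Continent") +
  theme_minimal()

bp_custom
```

A narrower size range reduces overlap between bubbles, at the cost of making population differences harder to read.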
In the previous post, Scatter Plots (1): Basic Scatter Plots, we reproduced a scatter plot from a Nature Communications article and gave the complete code. The article also contains the scatter plots below, so here we add the code to reproduce them.
library(ggplot2)
# Color palette shared by both plots
cols <- c("#E69F00", "#56B4E9", "#009E73", "#F0E442", "#0072B2", "#D55E00", "#CC79A7")
# fig2:
crass_impact <- read.table("crass_impact.txt")
p <- ggplot(crass_impact, aes(x = rel_crAss, y = rel_res, color = country)) +
  geom_smooth(method = "lm") +
  geom_point(aes(shape = crAss_detection), size = 9) +
  scale_x_log10() +
  scale_y_log10() +
  theme_classic() +
  labs(y = "Normalized ARG abundance", x = "Normalized crAssphage abundance",
       color = "Study", shape = "crAssphage detection") +
  scale_colour_manual(values = cols)
# Second plot, using the WWTP data
crass_wwtp <- read.table("crass_wwtp.txt")
p4 <- ggplot(crass_wwtp, aes(rel_crAss, rel_res, color = country_wwtp)) +
  geom_smooth(method = "lm") +
  geom_point(size = 8) +
  scale_x_log10() +
  scale_y_log10() +
  theme_classic() +
  scale_colour_manual(values = cols) +
  labs(y = "Normalized ARG abundance", x = "Normalized crAssphage abundance",
       color = "Country:WWTP") +
  # Note: these values are not in the units of the plot's axes. Think of the
  # plotting area as a 1 x 1 coordinate system whose center is (0.5, 0.5).
  theme(
    legend.position = c(0.1, 1),
    legend.justification = c(0.1, 1))
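The relative legend coordinates can be tried out without the article's data. This is a minimal sketch on the built-in mtcars dataset (not data from this post); the names anchor, legend_theme, and p_legend are made up for illustration. The version check is an addition: in ggplot2 3.5.0 and later, a numeric legend.position is deprecated in favor of legend.position = "inside" together with legend.position.inside.

```r
library(ggplot2)

# Relative coordinates in the plotting area: c(0, 0) is bottom-left, c(1, 1) is top-right
anchor <- c(1, 0)  # bottom-right corner

# Pick the theme syntax that matches the installed ggplot2 version
if (utils::packageVersion("ggplot2") >= "3.5.0") {
  legend_theme <- theme(
    legend.position = "inside",
    legend.position.inside = anchor,
    legend.justification = anchor)
} else {
  legend_theme <- theme(
    legend.position = anchor,
    legend.justification = anchor)
}

p_legend <- ggplot(mtcars, aes(x = wt, y = mpg, color = factor(cyl))) +
  geom_point(size = 3) +
  theme_classic() +
  legend_theme

p_legend
```

Setting legend.justification to the same pair as the position pins that corner of the legend box to that corner of the plotting area, so the legend stays inside the panel.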
All datasets used above can be obtained from the link provided in Scatter Plots (1): Basic Scatter Plots.
Follow the WeChat official account 生信小書生 for regular posts on bioinformatics knowledge and skills.