Preview:
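# Introduction: in the previous tutorial, we learned how to make bar charts and histograms look better
For example:
# Today we will learn how to add information to a plot, edit the text elements in the legend, and change the theme
# Use geom_text() to add information to a plot
# Example: add the count above each bar of a bar chart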
ggplot(data = mpg, aes(x = class)) + geom_bar() + geom_text(stat = 'count', aes(label = ..count..), vjust = -0.5)
# Use theme() to edit the text elements in the legend and to change the theme
# Example: change the size and angle of the axis text
ggplot(data = mpg, aes(x = class)) + geom_bar() + theme(axis.text.x = element_text(angle = 45, size = 10))
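# Understand the guiding principles of data visualization
# For example: balance, emphasis, movement, pattern, repetition, rhythm, and variety
# Use scatter plots to visualize two continuous variables
# Use bar charts to visualize two categorical variables, and learn new customization settings
# Visualize one continuous variable together with one categorical variable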
Main Content
Add info in the plots:
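First, let's look at how to add information to a plot. In R, we can use the geom_text() function to do this. For example, to display the count above each bar of a bar chart, we can write: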
ggplot(data = mpg, aes(x = class)) + geom_bar() + geom_text(stat = 'count', aes(label = ..count..), vjust = -0.5)
- ggplot(data = mpg, aes(x = class)): This sets up the basic plot using the mpg dataset and specifies that the class variable should be mapped to the x-axis.
- geom_bar(): This adds a bar plot layer to the plot, creating a bar for each unique value of the class variable.
- geom_text(stat = 'count', aes(label = ..count..), vjust = -0.5): This adds text labels to the plot. The stat = 'count' argument tells geom_text to calculate the count of observations for each class. The aes(label = ..count..) mapping specifies that the count should be used as the label for each bar. The vjust = -0.5 argument adjusts the vertical position of the labels to place them above the bars.
If vjust = 0.5 is used instead, each label is vertically centered on the top edge of its bar.
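Next, let's discuss how to edit the text elements in the legend and change the theme of a plot. In R, we can use the theme() function to do this. For example, to change the size and angle of the axis text, we can write: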
ggplot(data = mpg, aes(x = class)) + geom_bar() + theme(axis.text.x = element_text(angle = 45, size = 10))
Changing the text size and angle on the x or y axis
+ theme(axis.text.x = element_text(angle = 45, size = 10))
+ theme(axis.text.x = element_text(angle = 45, size = 7))
- family: Specifies the font family to be used for the axis text. For example, setting family = "Arial" would use the Arial font for the axis text.
- face: Specifies the font style to be used for the axis text. This can be used to make the text bold, italic, or bold italic. For example, setting face = "bold" would make the axis text bold.
- colour: Specifies the color of the axis text. For example, setting colour = "red" would make the axis text red.
- size: Specifies the size of the axis text. For example, setting size = 12 would make the axis text 12 points in size.
- angle: Specifies the angle at which the axis text is displayed. For example, setting angle = 45 would rotate the axis text 45 degrees counterclockwise.
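These arguments can be combined in a single element_text() call. The following is a minimal sketch using the built-in mpg data. The particular values are only illustrative, and hjust = 1 is an extra assumption added here so that the rotated labels end at their tick marks.

```r
library(ggplot2)

# Combine several element_text() arguments in one theme() call.
# The values below (bold, red, size 12, 45 degrees) are illustrative only.
p_axis <- ggplot(data = mpg, aes(x = class)) +
  geom_bar() +
  theme(
    axis.text.x = element_text(
      face = "bold",   # bold labels
      colour = "red",  # red label text
      size = 12,       # 12-point text
      angle = 45,      # rotate labels by 45 degrees
      hjust = 1        # assumption: right-align the rotated labels
    )
  )

p_axis
```

The family argument is left out of the sketch because the fonts available depend on the machine and graphics device.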
Remove axis ticks and labels
You can remove axis ticks and labels using element_blank() or size = 0 in theme() in ggplot2. Here's how you can do it:
library(ggplot2)
# Create a basic plot
p <- ggplot(data = mpg, aes(x = class)) + geom_bar() + geom_text(stat = 'count', aes(label = ..count..), vjust = -0.5)
# Remove x-axis ticks and labels
p + theme(axis.text.x = element_blank(), axis.ticks.x = element_blank())
# Remove y-axis ticks and labels
p + theme(axis.text.y = element_blank(), axis.ticks.y = element_blank())
Add the headcount to each bar in a graph that shows proportions
ggplot(CUHKSZ_employment_survey_1, aes(fct_infreq(Occupation), y = (..count..)/sum(..count..), fill = Occupation)) +
  geom_bar() +
  geom_text(stat = 'count', aes(label = ..count..), vjust = +1.5)
ggplot(CUHKSZ_employment_survey_1, aes(x = fct_infreq(Occupation), fill = Occupation)) +
  geom_bar(aes(y = (..count..)/sum(..count..))) +
  geom_text(stat = 'count', aes(label = ..count.., y = (..count..)/sum(..count..)), vjust = +1.5) +
  labs(title = "Occupation of CUHK Shenzhen students after graduation", x = NULL, y = "Proportion")
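Setting x = NULL in labs() removes the x-axis label. Alternatively, you can use x = "" instead, which replaces the label with an empty string: the text disappears, but the space for the label is still reserved.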
ggplot(CUHKSZ_employment_survey_1, aes(x = fct_infreq(Occupation), fill = Occupation)) +
  geom_bar(aes(y = (..count..)/sum(..count..))) +
  geom_text(stat = 'count', aes(label = ..count.., y = (..count..)/sum(..count..)), vjust = +1.5) +
  labs(title = "Occupation of CUHK Shenzhen students after graduation", x = "", y = "Proportion")
If I want to emphasize that students are more likely to become "Professional and technician" or "Clerical personnel", I might use the same color for those categories:
scale_fill_manual(values = c("color1", "color2", ...))
# Define custom colors
custom_colors <- c("Professional and technician" = "Red", "Clerical personnel" = "Red", "Other" = "grey")
# Create the plot with custom colors
ggplot(CUHKSZ_employment_survey_1, aes(x = fct_infreq(Occupation), fill = Occupation)) +
  geom_bar(aes(y = (..count..)/sum(..count..))) +
  geom_text(stat = 'count', aes(label = ..count.., y = (..count..)/sum(..count..)), vjust = +1.5) +
  labs(title = "Occupation of CUHK Shenzhen students after graduation", x = "", y = "Proportion") +
  scale_fill_manual(values = custom_colors)
If I want to emphasize that more than 10% of the students become "Professional and technician", "Clerical personnel", or "Managerial personnel", the colour of those bars should be different and I should add a horizontal line at 10%:
+ geom_hline(yintercept = 0.1)
ggplot(CUHKSZ_employment_survey_1, aes(x = fct_infreq(Occupation), fill = Occupation)) +
  geom_bar(aes(y = (..count..)/sum(..count..))) +
  theme(axis.text.x = element_text(angle = 45, vjust = 0.6)) +
  geom_text(stat = 'count', aes(label = ..count.., y = (..count..)/sum(..count..)), vjust = +1.5) +
  labs(title = "Occupation of CUHK Shenzhen students after graduation", x = "", y = "Proportion") +
  scale_fill_manual(values = custom_colors) +
  geom_hline(yintercept = 0.1)
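Rather than typing the highlighted categories by hand, the colours could also be derived from the proportions themselves. The sketch below is only an illustration under stated assumptions: occupation_demo is a small made-up data frame (not the survey data), highlight_colors is a hypothetical helper name, and after_stat() requires ggplot2 3.3.0 or later.

```r
library(ggplot2)

# Hypothetical stand-in data: 12 made-up respondents, NOT the survey data.
occupation_demo <- data.frame(
  Occupation = c(rep("Professional and technician", 6),
                 rep("Clerical personnel", 3),
                 rep("Managerial personnel", 2),
                 rep("Other", 1))
)

# Proportion of each category as a plain named numeric vector.
props <- prop.table(table(occupation_demo$Occupation))
prop_vec <- setNames(as.numeric(props), names(props))

# Red for categories above 10%, grey for the rest (names are kept by ifelse).
highlight_colors <- ifelse(prop_vec > 0.1, "red", "grey")

p_highlight <- ggplot(occupation_demo, aes(x = Occupation, fill = Occupation)) +
  geom_bar(aes(y = after_stat(count / sum(count)))) +
  geom_hline(yintercept = 0.1) +  # the 10% threshold
  scale_fill_manual(values = highlight_colors) +
  labs(x = "", y = "Proportion")

p_highlight
```

With this approach the colours follow the data, so a category that crosses the 10% line is recoloured without editing the colour vector.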
Demonstrate that your data are normally distributed by overplotting a Gaussian curve on your histogram
ggplot(CUHKSZ_employment_survey_1, aes(x = Monthly_salary_19, y = after_stat(density))) +
  geom_histogram(binwidth = 500, fill = "blue", colour = "black", alpha = 0.5, boundary = 8000) +
  geom_density(color = "red") +
  labs(title = "Histogram of Monthly Salary with Density Curve Overlay", x = "Monthly Salary", y = "Density")
Note that after_stat(density) is used here. Writing ...density... instead reports an error, and the older stat(density) form, although it still runs, produces the following warning in recent versions of ggplot2:
Warning message: `stat(density)` was deprecated in ggplot2 3.4.0. ℹ Please use `after_stat(density)` instead. This warning is displayed once every 8 hours. Call `lifecycle::last_lifecycle_warnings()` to see where this warning was generated.
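Note that geom_density() draws a kernel density estimate of the data rather than a fitted normal curve. To overlay an actual Gaussian curve, one option is stat_function() with dnorm, using the sample mean and standard deviation. The following is a sketch under stated assumptions: salary_demo is simulated data standing in for the survey, and the variable names salary_mean and salary_sd are introduced only for illustration.

```r
library(ggplot2)

# Hypothetical stand-in data: simulated salaries, NOT the survey data.
set.seed(1)
salary_demo <- data.frame(
  Monthly_salary_19 = rnorm(200, mean = 10000, sd = 1500)
)

# Parameters of the normal curve, estimated from the sample.
salary_mean <- mean(salary_demo$Monthly_salary_19)
salary_sd <- sd(salary_demo$Monthly_salary_19)

p_normal <- ggplot(salary_demo, aes(x = Monthly_salary_19)) +
  # Histogram scaled to density so it shares the y-axis with the curve.
  geom_histogram(aes(y = after_stat(density)), binwidth = 500,
                 fill = "blue", colour = "black", alpha = 0.5, boundary = 8000) +
  # Normal density with the sample mean and standard deviation.
  stat_function(fun = dnorm,
                args = list(mean = salary_mean, sd = salary_sd),
                colour = "red") +
  labs(title = "Histogram of Monthly Salary with Normal Curve Overlay",
       x = "Monthly Salary", y = "Density")

p_normal
```

A close match between the bars and the red curve suggests, but does not prove, that the data are approximately normal.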
Highlight the individuals who are overweight in the BMI histogram, that is, change the colour of some bars in a histogram
Split the histogram into two layers using the subset() function
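ggplot(SEE_students_data_2, aes(x = BMI)) +
  # BMI below 25: blue bars
  geom_histogram(data = subset(SEE_students_data_2, BMI < 25), fill = "Blue", alpha = 0.5, binwidth = 1, color = "Black") +
  # BMI of 25 or more: red bars (>= so that values of exactly 25 are not dropped)
  geom_histogram(data = subset(SEE_students_data_2, BMI >= 25), fill = "Red", alpha = 0.5, binwidth = 1, color = "Black")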
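An alternative to splitting the data with subset() is to map the fill colour to the condition itself, so that a single histogram layer is enough. This is a sketch, not the method used above: bmi_demo is simulated data standing in for SEE_students_data_2, and boundary = 25 is an added assumption so that no bin straddles the threshold.

```r
library(ggplot2)

# Hypothetical stand-in data: simulated BMI values, NOT the student data.
set.seed(2)
bmi_demo <- data.frame(BMI = rnorm(300, mean = 23, sd = 3))

p_bmi <- ggplot(bmi_demo, aes(x = BMI, fill = BMI >= 25)) +
  # boundary = 25 puts a bin edge exactly at the threshold.
  geom_histogram(binwidth = 1, boundary = 25, colour = "black", alpha = 0.5) +
  # FALSE (BMI < 25) is drawn in blue, TRUE (BMI >= 25) in red.
  scale_fill_manual(values = c("FALSE" = "blue", "TRUE" = "red"),
                    labels = c("BMI < 25", "BMI >= 25"),
                    name = NULL)

p_bmi
```

Unlike the two-layer version, this one also produces a legend that explains the two colours.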