DPI R Bootcamp
Jared Knowles
In this lesson we hope to learn:
names(object)
hist(df$readSS)
plot(df$readSS, df$mathSS)
plot(df$readSS, df$mathSS)
lines(lowess(df$readSS, df$mathSS), col = "red")
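The scatterplot-plus-smoother pattern above can be tried without the tutorial's df. This is a minimal sketch using the built-in mtcars data as a stand-in (an assumption, not the tutorial's data); the point is that lowess() should receive x and y in the same order as plot().

```r
# mtcars ships with base R; wt = weight, mpg = miles per gallon
plot(mtcars$wt, mtcars$mpg)

# lowess(x, y) takes x first, matching the plot() call above
fit <- lowess(mtcars$wt, mtcars$mpg)

# Overlay the smoothed curve on the existing scatterplot
lines(fit, col = "red")
```

If x and y are swapped in lowess(), the curve is drawn against the wrong axes, which is easy to miss when both variables share a similar scale.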
ggplot2 is pretty much the new standard for graphics in R
library(ggplot2)
qplot(readSS, mathSS, data = df)
ggplot2, an R package, does just this by breaking plots into a few basic components
qplot(readSS, mathSS, data = df, alpha = I(0.3)) + theme_dpi()
readSS is the x coordinate and mathSS is the y coordinate for each observation in our data
Here is df$mathSS using 3 separate geoms:
qplot(mathSS, readSS, data = df) + theme_dpi()
qplot(mathSS, data = df) + theme_dpi()
qplot(factor(grade), mathSS, data = df, geom = "line", group = stuid, alpha = I(0.2)) +
theme_dpi()
ggplot2 has an extended syntax that makes this obvious
ggplot(df, aes(x = readSS, y = mathSS)) + geom_point()
# Identical to: qplot(readSS,mathSS,data=df)
aes says we are specifying aesthetics; here we specified x and y to make a two-dimensional graphic
data(mpg)
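To see how the extended syntax composes, here is a small sketch that builds a plot layer by layer. It uses the mpg data set bundled with ggplot2 rather than df, so it runs on its own; the choice of displ and cty and the added linear fit are illustrative assumptions, not part of the original lesson.

```r
library(ggplot2)

# mpg ships with ggplot2; displ = engine displacement, cty = city miles per gallon
p <- ggplot(mpg, aes(x = displ, y = cty)) +  # data plus aesthetic mappings
  geom_point(alpha = 0.3) +                  # first layer: points
  geom_smooth(method = "lm", se = FALSE)     # second layer: linear fit

print(p)
```

Each `+` adds one component, so the mapping in aes() is written once and shared by both layers.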
qplot(displ, cty, data = mpg) + theme_dpi()
qplot(displ, cty, data = mpg, size = cyl) + theme_dpi()
qplot(displ, cty, data = mpg, shape = drv, size = I(3)) + theme_dpi()
qplot(displ, cty, data = mpg, color = class) + theme_dpi()
qplot(mathSS, readSS, data = df[1:100, ], size = race, alpha = I(0.8)) + theme_dpi()
df$proflvl2 <- factor(df$proflvl, levels = c("advanced", "basic", "proficient",
"below basic"))
df$proflvl2 <- ordered(df$proflvl2)
qplot(mathSS, readSS, data = df[1:100, ], color = proflvl2, size = I(3)) + scale_color_brewer(type = "seq") +
theme_dpi()
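A sequential palette follows the order of the factor levels, so the order passed to levels matters. The sketch below uses a toy vector in place of df$proflvl and assumes the intended ranking is below basic < basic < proficient < advanced; both the toy values and that ranking are assumptions made for illustration.

```r
# Toy vector standing in for df$proflvl (hypothetical values)
proflvl <- c("basic", "advanced", "below basic", "proficient", "basic")

# Assumed ordering from lowest to highest performance
lvl_order <- c("below basic", "basic", "proficient", "advanced")
proflvl_ord <- factor(proflvl, levels = lvl_order, ordered = TRUE)

levels(proflvl_ord)              # levels in the order given above
proflvl_ord[1] < proflvl_ord[2]  # compares "basic" with "advanced"
table(proflvl_ord)               # counts are reported in level order
```

With the levels in rank order, a sequential color scale runs from the lowest to the highest category.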
What can we map continuous characteristics like mathSS to, and what can we map discrete characteristics like race to?
qplot(factor(grade), readSS, data = df[1:100, ], color = mathSS, geom = "jitter",
size = I(3.2)) + theme_dpi()
qplot(factor(grade), readSS, data = df[1:100, ], color = dist, geom = "jitter",
size = I(3.2)) + theme_dpi()
Aesthetic | Discrete | Continuous |
---|---|---|
Color | Disparate colors | Sequential or divergent colors |
Size | Unique size for each value | Linear or logarithmic mapping to radius of value |
Shape | A shape for each value | Does not make sense |

Aesthetic | Ordered | Unordered |
---|---|---|
Color | Sequential or divergent colors | Rainbow |
Size | Increasing or decreasing radius | Does not make sense |
Shape | **Does not make sense** | A shape for each value |
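The guidance in the tables can be checked with two quick plots. This sketch uses the mpg data bundled with ggplot2 (an assumption made so it runs without df): the continuous variable hwy is mapped to color, and the discrete variable drv is mapped to shape.

```r
library(ggplot2)

# Continuous variable (hwy) mapped to color: ggplot2 uses a sequential gradient
p_cont <- ggplot(mpg, aes(x = displ, y = cty, color = hwy)) +
  geom_point(size = 3)

# Discrete variable (drv) mapped to shape: one shape per value
p_disc <- ggplot(mpg, aes(x = displ, y = cty, shape = drv)) +
  geom_point(size = 3)

print(p_cont)
print(p_disc)
```

Mapping a continuous variable to shape instead makes ggplot2 stop with an error, which matches the "does not make sense" cells.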
qplot(readSS, mathSS, data = df) + facet_wrap(~grade) + theme_dpi(base_size = 12) +
geom_smooth(method = "lm", se = FALSE, size = I(1.2))
qplot(readSS, mathSS, data = df) + facet_grid(ell ~ grade) + theme_dpi(base_size = 12) +
geom_smooth(method = "lm", se = FALSE, size = I(1.2))
colwheel <- "https://dl.dropbox.com/u/1811289/colorwheel.R"
dropbox_source(colwheel)
col.wheel("magenta", nearby = 2)
## [1] "plum" "violet" "darkmagenta" "magenta4" "magenta3"
## [6] "magenta2" "magenta" "magenta1" "orchid4" "orchid"
col.wheel("orange", nearby = 2)
## [1] "salmon1" "darksalmon" "orangered4" "orangered3"
## [5] "coral" "orangered2" "orangered" "orangered1"
## [9] "lightsalmon2" "lightsalmon" "peru" "tan3"
## [13] "darkorange2" "darkorange4" "darkorange3" "darkorange1"
## [17] "linen" "bisque3" "bisque1" "bisque2"
## [21] "darkorange" "antiquewhite3" "antiquewhite1" "papayawhip"
## [25] "moccasin" "orange2" "orange" "orange1"
## [29] "orange4" "wheat4" "orange3" "wheat"
## [33] "oldlace"
col.wheel("brown", nearby = 2)
## [1] "snow1" "snow2" "rosybrown" "rosybrown1" "rosybrown2"
## [6] "rosybrown3" "rosybrown4" "lightcoral" "indianred" "indianred1"
## [11] "indianred3" "brown" "brown4" "brown1" "brown3"
## [16] "brown2" "firebrick" "firebrick1" "chocolate" "chocolate4"
## [21] "saddlebrown" "seashell3" "seashell2" "seashell4" "sandybrown"
## [26] "peachpuff2" "peachpuff3"
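The col.wheel() helper above comes from an external script. As a rough stand-in that needs only base R, the sketch below lists built-in color names whose hue is close to a target color. nearby_colors() and its tol and min_sat parameters are hypothetical names introduced here, and the result will not match col.wheel() output exactly.

```r
# Hypothetical helper (not the col.wheel() from the sourced script)
nearby_colors <- function(target, tol = 0.02, min_sat = 0.2) {
  all_cols <- colors()                    # built-in R color names
  hsv_all <- rgb2hsv(col2rgb(all_cols))   # 3 x n matrix with rows h, s, v
  hsv_target <- rgb2hsv(col2rgb(target))
  # Hue is circular on [0, 1], so measure the shorter way around
  d <- abs(hsv_all["h", ] - hsv_target["h", 1])
  d <- pmin(d, 1 - d)
  # Skip near-grey colors, whose hue carries little meaning
  keep <- d <= tol & hsv_all["s", ] > min_sat
  all_cols[keep]
}

nearby_colors("magenta")
```

Widening tol returns more neighbors, at the cost of drifting further from the target hue.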
+ scale_color_brewer(palette = "X")
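As a concrete instance of the palette call, this sketch fills in the "X" placeholder with "Set1", one of the qualitative ColorBrewer palettes. The mpg data and the choice of palette are assumptions made for illustration.

```r
library(ggplot2)

# "Set1" stands in for the "X" placeholder; drv has three levels
p <- ggplot(mpg, aes(x = displ, y = cty, color = drv)) +
  geom_point(size = 3) +
  scale_color_brewer(palette = "Set1")

print(p)

# RColorBrewer lists the available palette names and their types
head(RColorBrewer::brewer.pal.info)
```

The category column of brewer.pal.info says whether a palette is sequential (seq), diverging (div), or qualitative (qual), which helps in matching a palette to the kind of variable being mapped.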
library(grid)
p1 <- qplot(readSS, ..density.., data = df, fill = race, position = "fill",
geom = "density") + scale_fill_brewer(type = "qual", palette = 2)
p2 <- qplot(readSS, ..fill.., data = df, fill = race, position = "fill", geom = "density") +
scale_fill_brewer(type = "qual", palette = 2) + ylim(c(0, 1)) + theme_bw() +
theme(legend.position = "none", axis.text.x = element_blank(), axis.text.y = element_blank(),
axis.ticks = element_blank(), panel.margin = unit(0, "lines")) + ylab("") +
xlab("")
vp <- viewport(x = unit(0.65, "npc"), y = unit(0.73, "npc"), width = unit(0.2,
"npc"), height = unit(0.2, "npc"))
print(p1)
print(p2, vp = vp)
Embed one plot in another plot in R using two different data elements from our data set. For example, plot a histogram of readSS inside a scatterplot of readSS and mathSS.
Explore some examples on the ggplot2 website. What are some ways to overlay more than 3 dimensions of data in a single plot?
What types of data work best for what types of visualizations?
It is good to include the session info, e.g. this document is produced with knitr version 0.8. Here is my session info:
print(sessionInfo(), locale = FALSE)
## R version 2.15.2 (2012-10-26)
## Platform: i386-w64-mingw32/i386 (32-bit)
##
## attached base packages:
## [1] splines grid stats graphics grDevices utils datasets
## [8] methods base
##
## other attached packages:
## [1] Hmisc_3.10-1 snow_0.3-10 gbm_1.6-3.2 survival_2.36-14
## [5] caret_5.15-044 foreach_1.4.0 cluster_1.14.3 reshape_0.8.4
## [9] lme4_0.999999-0 Matrix_1.0-10 lattice_0.20-10 xtable_1.7-0
## [13] gridExtra_0.9.1 sandwich_2.2-9 quantreg_4.91 SparseM_0.96
## [17] mgcv_1.7-22 eeptools_0.1 mapproj_1.1-8.3 maps_2.2-6
## [21] proto_0.3-9.2 stringr_0.6.1 plyr_1.7.1 ggplot2_0.9.2.1
## [25] lmtest_0.9-30 zoo_1.7-9 knitr_0.8
##
## loaded via a namespace (and not attached):
## [1] codetools_0.2-8 colorspace_1.2-0 compiler_2.15.2
## [4] dichromat_1.2-4 digest_0.5.2 evaluate_0.4.2
## [7] formatR_0.6 gtable_0.1.1 iterators_1.0.6
## [10] labeling_0.1 markdown_0.5.3 MASS_7.3-22
## [13] memoise_0.1 munsell_0.4 nlme_3.1-105
## [16] RColorBrewer_1.0-5 reshape2_1.2.1 scales_0.2.2
## [19] stats4_2.15.1 tools_2.15.1
This work (R Tutorial for Education, by Jared E. Knowles), in service of the Wisconsin Department of Public Instruction, is free of known copyright restrictions.