DPI R Bootcamp
Jared Knowles
In this lesson we hope to learn:
Why do we do this?
It isn’t much harder than the basic analysis itself
install.packages(c('knitr', 'markdown'))
Sweave is part of base R already
myscript.R
#' This is some text
#'
#+ myplot, dev='svg', out.width='500px', out.height='400px'
library(ggplot2)
data(diamonds)
qplot(carat, price, data = diamonds, alpha = I(0.3), color = clarity)
#' Diamond size is clearly related to price, but not in a linear fashion.
#'
o <- spin("C:/Path/To/myscript.R", knit = FALSE)
knit2html(o, envir = new.env())
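To see what spin produces before rendering anything, the following minimal sketch writes a throwaway script to a temporary file and converts it to R Markdown. The script contents, the chunk label demo-chunk, and the temporary path are illustrative assumptions, not files from this lesson.

```r
library(knitr)

# Hypothetical script: roxygen-style text, a chunk-option line, then code
script <- tempfile(fileext = ".R")
writeLines(c(
  "#' Some explanatory text",
  "#+ demo-chunk, echo=TRUE",
  "x <- 1:10",
  "mean(x)"
), script)

# knit = FALSE stops after writing the .Rmd so it can be inspected
rmd <- spin(script, knit = FALSE)
rmd_lines <- readLines(rmd)
print(rmd_lines)
```

Lines starting with #' become ordinary text, and the #+ line becomes the chunk header, so reading the generated file is a quick way to check that the comments were parsed as intended.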
myscript2.R
#' This is some text that I want to explain
#' For example, this plot is important, let's look below
#+ myplot, dev='svg', out.width='500px', out.height='400px', warning=FALSE, message=FALSE
library(ggplot2)
load("PATH/TO/MY/DATA.rda")
qplot(readSS, mathSS, data = df, alpha = I(0.2)) + geom_smooth()
#' There is not a linear relationship, but it sure is close.
#' Let's do some regression
#'
test <- lm(mathSS ~ readSS + factor(grade), data = df)
summary(test)
#' It's all statistically significant
o <- spin("C:/Path/To/myscript2.R", knit = FALSE)
knit2html(o, envir = new.env())
# We specify that new environment is used to carry out the analysis, not
# the current environment
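The effect of a separate environment can be shown with base R alone. This is a small sketch that mimics the idea rather than knitr's internals; the object names and values are made up for illustration.

```r
# Evaluate some "analysis" code in a fresh environment
report_env <- new.env()
evalq({
  scores <- c(480, 510, 530)      # made-up values for illustration
  report_mean <- mean(scores)
}, envir = report_env)

# The result lives in report_env only, not in the global workspace
in_report <- exists("report_mean", envir = report_env, inherits = FALSE)
in_global <- exists("report_mean", envir = globalenv(), inherits = FALSE)
print(c(in_report = in_report, in_global = in_global))
```

Keeping the report's objects out of the global workspace means the report cannot silently depend on something left over from an interactive session.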
spin
## title: My Super Report
## author: Mr. Data
##
# A plot and some text
library(ggplot2)
load("PATH/TO/MY/DATA")
qplot(readSS, mathSS, data = df, alpha = I(0.2)) + geom_smooth()
# Now a linear model
test <- lm(mathSS ~ readSS + factor(grade), data = df)
summary(test)
##
## Call:
## lm(formula = mathSS ~ readSS + factor(grade), data = df)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -170.47  -43.35    1.21   45.45  194.45
##
## Coefficients:
##                Estimate Std. Error t value Pr(>|t|)
## (Intercept)     198.343      7.252   27.35  < 2e-16 ***
## readSS            0.478      0.015   31.96  < 2e-16 ***
## factor(grade)4   30.837      4.324    7.13  1.3e-12 ***
## factor(grade)5   34.225      4.112    8.32  < 2e-16 ***
## factor(grade)6   62.517      4.418   14.15  < 2e-16 ***
## factor(grade)7   72.468      4.265   16.99  < 2e-16 ***
## factor(grade)8   96.530      4.650   20.76  < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 64.4 on 2693 degrees of freedom
## Multiple R-squared: 0.497, Adjusted R-squared: 0.496
## F-statistic: 444 on 6 and 2693 DF, p-value: <2e-16
# Ok!
# Markdown
stitch("PATH/TO/MY/SCRIPT", system.file("misc", "knitr-template.Rmd", package = "knitr"))
knit2html("Path/To/My/Markdown.md")
# Direct to HTML
stitch("PATH/TO/MY/SCRIPT", system.file("misc", "knitr-template.html", package = "knitr"))
# Direct to PDF (requires LaTeX)
stitch("PATH/TO/MY/SCRIPT")
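Template file names inside the knitr package may differ between knitr versions, so it can help to list what is actually installed before passing a path to stitch. A minimal sketch, assuming only that knitr is installed:

```r
# Locate the directory that holds knitr's bundled templates
template_dir <- system.file("misc", package = "knitr")

# List the template files shipped with this knitr installation
templates <- list.files(template_dir, pattern = "^knitr-template")
print(templates)
```

If a name used above is missing from this list, pick the closest match from the printed files rather than guessing.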
The knitr package gives us in-depth control over R HTML files
# Start .Rmd file on next line
My Super Report on Student Testing
------------------------------------
Dr. Debateman
==============
In this report I plan to show you all the results of student testing in Myoming.
#```{r chunksetup, include=FALSE} (remove # in actual document)
load("PATH/TO/MY/DATA.rda")
source("myscript.R")
library(ggplot2)
library(xtable)  # needed for xtable() in the mystatmodel2 chunk
#```
The most important thing to look at is this plot:
#```{r plot1,dev='png',fig.width=9,fig.height=6}
qplot(readSS,mathSS,data=df)
#```
And my model output can be included a few ways because it is so great.
#```{r mystatmodel,results='markup'}
mymod<-lm(readSS~mathSS+factor(grade),data=df)
summary(mymod)
#```
#```{r mystatmodel2,results='asis'}
mymod<-lm(readSS~mathSS+factor(grade),data=df)
print(xtable(summary(mymod)),type='html')
#```
And because I am awesome, I am done.
knit("PATH/TO/myscript.Rmd", envir = new.env())
knit2html("Path/To/Myscript.md")
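For quick experiments, a document can also be knitted from a character vector instead of a file. The sketch below is illustrative: the chunk labels and contents are made up, and it only shows how include=FALSE hides a chunk while its objects remain available to later chunks.

```r
library(knitr)

# A two-chunk R Markdown document held in memory (contents are illustrative)
rmd_text <- c(
  "A tiny report",
  "",
  "```{r setup-demo, include=FALSE}",
  "x <- c(2, 4, 6)",
  "```",
  "",
  "```{r show-mean}",
  "mean(x)",
  "```"
)

# Knit in a fresh environment; the result is markdown text
md <- knit(text = rmd_text, quiet = TRUE, envir = new.env())
cat(md)
```

The setup chunk does not appear in the markdown output, but the second chunk can still use x, which is the same pattern as the chunksetup chunk in the report above.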
<style type="text/css">
body, td {
  font-size: 14px;
}
code.r {
  font-size: 10px;
}
pre {
  font-size: 10px;
}
</style>
Use the dev argument, where dev is equivalent to the graphic type we want to export
# PDF
pdf(file = "PATH/TO/MYPLOT.PDF", width = 10, height = 8)
print(qplot(readSS, mathSS, data = df, alpha = I(0.2)))
dev.off()
# PNG
png(file = "PATH/TO/MYPLOT.png", width = 1200, height = 900)
print(qplot(readSS, mathSS, data = df, alpha = I(0.2)))
dev.off()
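The same open, draw, close pattern works with base graphics. Here is a minimal sketch that writes to a temporary file and checks the result; the file location and the mtcars data are stand-ins for illustration. Note that pdf() sizes are in inches, while png() sizes default to pixels.

```r
# Write a base-graphics scatterplot to a temporary PDF (path is illustrative)
plot_file <- tempfile(fileext = ".pdf")
pdf(file = plot_file, width = 10, height = 8)   # size in inches
plot(mtcars$wt, mtcars$mpg, xlab = "Weight", ylab = "Miles per gallon")
invisible(dev.off())                            # close the device to flush the file

# Confirm that a non-empty file was written
plot_written <- file.exists(plot_file) && file.size(plot_file) > 0
print(plot_written)
```

Forgetting dev.off() is the usual cause of empty or unreadable plot files, because the device only finishes writing when it is closed.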
Tools for exporting data: the foreign library, save, write.csv, and write.dta
library(foreign)  # provides write.dta
write.csv(df, file = "PATH/TO/MY.csv")
write.dta(df, file = "PATH/TO/MY.dta")
# Save in R's native format
save(df, file = "PATH/TO/MY.rda", compress = "xz")
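A quick round trip is a reasonable way to confirm that an export kept the data intact. The sketch below uses a small made-up data frame and temporary files; none of the names come from the lesson's data.

```r
# Small made-up data frame standing in for df
scores <- data.frame(id = 1:3, mathSS = c(480.5, 510, 530.25))

csv_file <- tempfile(fileext = ".csv")
rda_file <- tempfile(fileext = ".rda")

# row.names = FALSE avoids an extra unnamed index column in the CSV
write.csv(scores, file = csv_file, row.names = FALSE)
from_csv <- read.csv(csv_file)

# save() stores the object under its own name; load() restores that name
save(scores, file = rda_file, compress = "xz")
restored <- new.env()
load(rda_file, envir = restored)

print(all.equal(scores, from_csv))
print(identical(scores, restored$scores))
```

Loading into a separate environment avoids overwriting an object of the same name that may already exist in the workspace.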
table(df$female, df$schid)
##
##       6  15  45  66  75 105
##   0 219 222 225 225 234 243
##   1 231 228 225 225 216 207
The xtable package provides a good way to produce this output
|   | 6 | 15 | 45 | 66 | 75 | 105 |
|---|---|---|---|---|---|---|
| 0 | 369 | 387 | 384 | 378 | 375 | 390 |
| 1 | 81 | 63 | 66 | 72 | 75 | 60 |
|   | schoolhigh | schoolavg | schoollow | readSS | mathSS |
|---|---|---|---|---|---|
| schoolhigh | 1.00 | -0.52 | -0.23 | -0.03 | 0.02 |
| schoolavg | -0.52 | 1.00 | -0.71 | 0.04 | -0.07 |
| schoollow | -0.23 | -0.71 | 1.00 | -0.02 | 0.06 |
| readSS | -0.03 | 0.04 | -0.02 | 1.00 | 0.63 |
| mathSS | 0.02 | -0.07 | 0.06 | 0.63 | 1.00 |
require(xtable)
print(xtable(table(df$ell, df$schid)), type = "html")
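The same call can be tried on a built-in dataset, capturing the HTML as a string instead of printing it. This is a minimal sketch: mtcars stands in for df, and it assumes the installed xtable supports the print.results argument of print.xtable.

```r
library(xtable)

# Cross-tabulate two columns of a built-in dataset (stand-in for df)
counts <- table(mtcars$cyl, mtcars$gear)

# print.results = FALSE returns the HTML as a string instead of printing it
html <- print(xtable(counts), type = "html", print.results = FALSE)
cat(html)
```

Holding the markup in a variable makes it easy to write it to a file or paste it into a larger HTML page by hand.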
xtable reformats the table to an xtable object
print exports the xtable object to the screen so whatever document processing system we use can grab it
xtable objects, which helps
xtable can be HTML or LaTeX output, giving flexibility for how we build our document
apsrtable for beautiful looking regression tables
Hmisc has some nice output functions as well, for HTML in particular
pandoc or slidify to create HTML5 slides in R
It is good to include the session info, e.g. this document is produced with knitr version 0.8. Here is my session info:
print(sessionInfo(), locale = FALSE)
## R version 2.15.2 (2012-10-26)
## Platform: i386-w64-mingw32/i386 (32-bit)
##
## attached base packages:
## [1] splines grid stats graphics grDevices utils datasets
## [8] methods base
##
## other attached packages:
## [1] snow_0.3-10 gbm_1.6-3.2 survival_2.36-14 caret_5.15-044
## [5] foreach_1.4.0 cluster_1.14.3 reshape_0.8.4 lme4_0.999999-0
## [9] Matrix_1.0-10 lattice_0.20-10 xtable_1.7-0 gridExtra_0.9.1
## [13] sandwich_2.2-9 quantreg_4.91 SparseM_0.96 mgcv_1.7-22
## [17] eeptools_0.1 mapproj_1.1-8.3 maps_2.2-6 proto_0.3-9.2
## [21] stringr_0.6.1 plyr_1.7.1 ggplot2_0.9.2.1 lmtest_0.9-30
## [25] zoo_1.7-9 knitr_0.8
##
## loaded via a namespace (and not attached):
## [1] codetools_0.2-8 colorspace_1.2-0 compiler_2.15.2
## [4] dichromat_1.2-4 digest_0.5.2 evaluate_0.4.2
## [7] formatR_0.6 gtable_0.1.1 iterators_1.0.6
## [10] labeling_0.1 markdown_0.5.3 MASS_7.3-22
## [13] memoise_0.1 munsell_0.4 nlme_3.1-105
## [16] RColorBrewer_1.0-5 reshape2_1.2.1 scales_0.2.2
## [19] stats4_2.15.1 tools_2.15.1
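When a report is archived, the session info can also be written to a text file alongside it. A minimal sketch in base R, with a temporary file standing in for wherever the report is stored:

```r
# Capture the session info as text and write it to a file
info_file <- tempfile(fileext = ".txt")   # illustrative location
info_lines <- capture.output(print(sessionInfo(), locale = FALSE))
writeLines(info_lines, info_file)

# Show the first few lines that were saved
print(head(info_lines, 3))
```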
This work (R Tutorial for Education, by Jared E. Knowles), in service of the Wisconsin Department of Public Instruction, is free of known copyright restrictions.