The goal of reproducible research is to tie specific instructions to data analysis
and experimental data so that scholarship can be recreated, better understood and
verified.
R largely facilitates reproducible research through literate programming, in which a
single document combines content and data analysis code. The Sweave function (in the
base R utils package) can be used to blend the subject matter and R code
so that a single document defines the content and the algorithms.
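As a minimal sketch of that workflow, the following uses only base R to write a small Sweave source file and weave it into LaTeX. The file name `example.Rnw`, the chunk label and the chunk contents are illustrative assumptions; a LaTeX installation is needed only to compile the resulting `.tex` file, not to generate it.

```r
# Write a minimal Sweave (.Rnw) source file into a temporary directory.
# The file name and chunk contents are made up for illustration.
rnw_file <- file.path(tempdir(), "example.Rnw")
writeLines(c(
  "\\documentclass{article}",
  "\\begin{document}",
  "The mean of the first ten integers is \\Sexpr{mean(1:10)}.",
  "<<summary-chunk, echo=TRUE>>=",
  "x <- 1:10",
  "summary(x)",
  "@",
  "\\end{document}"
), rnw_file)

# Sweave() writes its output to the working directory, so switch to the
# temporary directory while weaving and restore the old one afterwards.
old_wd <- setwd(tempdir())
Sweave(rnw_file, quiet = TRUE)
setwd(old_wd)

# The woven LaTeX file contains the prose, the echoed code and its output.
tex_path <- file.path(tempdir(), "example.tex")
tex_lines <- readLines(tex_path)
cat(tex_lines, sep = "\n")
```

The inline `\Sexpr{}` expression and the code chunk are both evaluated during weaving, so the text and the numbers it reports cannot drift apart.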
Basic packages can be structured into the following groups:
- LaTeX Markup: The Hmisc, xtable and tables packages contain functions to write
  R objects into LaTeX representations. Hmisc also includes methods for
  translating strings to proper LaTeX markup (e.g., ">=" to "$\geq$"). The
  animation package can insert animations into LaTeX documents that are
  converted to PDF. The tikzDevice and pgfSweave packages can convert R graphics
  to native LaTeX code, while the pictex function in the base grDevices package
  is a PicTeX graphics driver. The makesweave package for Linux streamlines the
  generation of Sweave files using make.
- HTML Markup: The R2HTML package has drivers that allow Sweave to process HTML
  documents. Both R2HTML and hwriter can be used to build HTML pages
  sequentially. R2HTML, xtable and hwriter can also convert some R objects into
  HTML representations.
- ODF Markup: The odfWeave package extends Sweave to the Open Document Format.
  Word processing tools, such as OpenOffice.org, can then be used to blend
  content and programs. Many word processors can be used to translate the ODF
  document to other formats (e.g., Word, PDF or HTML).
- Microsoft Formats: The R2wd and R2PPT packages for Windows can be used to
  communicate between R and Word or PowerPoint via the COM interface. Document
  elements (e.g., sections, text and images) that are created in R can be
  inserted into the document from R. Commercial R products that work with RTF
  and/or Word are RTFGen, Inference for R and SWord (installed using the package
  SWordInstaller). The output from other packages (odfWeave and R2HTML) can also
  be opened by Word. RExcel (installed using the package RExcelInstaller) can
  integrate code with Microsoft Excel.
- Plain Text Formats: R code and output in Sweave files can be converted into
  AsciiDoc and other structured text formats using the ascii package.
- Syntax Highlighting: The highlight package can render R code with more control
  over the results (e.g., syntax coloring) in LaTeX and HTML. The
  SweaveListingUtils package can also provide enhanced control over how R code
  chunks and their output are rendered in LaTeX.
- Caching of R Objects: The cacheSweave and weaver packages allow caching of
  specific code chunks. The cacher and R.cache packages can also be used but are
  not integrated with Sweave. pgfSweave can also cache graphics. The SRPM
  package (for shared reproducibility package management) creates an R package
  that organizes the results of a Sweave document into different directories
  (e.g., article and figures).
- Others: The brew and R.rsp packages contain alternative approaches to
  embedding R code into various markups. knitr is a comprehensive package
  derived from Sweave that includes code formatting, highlighting, caching, fine
  control of graphics, conditional evaluation, multiple markup formats and other
  features.
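To illustrate the conditional evaluation mentioned for knitr, here is a small sketch that weaves two Markdown code chunks from an in-memory character vector. It assumes the contributed knitr package is installed (the code is skipped otherwise); the chunk labels and contents are hypothetical.

```r
# knitr is a contributed package, so run the example only if it is available.
if (requireNamespace("knitr", quietly = TRUE)) {
  # A tiny Markdown source with two chunks (labels and code are illustrative).
  src <- c(
    "Evaluated chunk:",
    "",
    "```{r evaluated}",
    "1 + 1",
    "```",
    "",
    "Chunk shown but not run:",
    "",
    "```{r skipped, eval=FALSE}",
    "6 * 7",
    "```"
  )

  # knit() accepts source lines via the `text` argument and returns the
  # woven document as character data instead of writing a file.
  md <- knitr::knit(text = src, quiet = TRUE)
  cat(md, sep = "\n")
}
```

The first chunk's result appears in the woven Markdown, while the second chunk is echoed without being evaluated because of the `eval=FALSE` chunk option.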
An incomplete list of packages which facilitate literate programming for specific
types of analysis or objects:
- The base R utils package has generic functions to convert objects to LaTeX
  (via toLatex) and BibTeX (via toBibtex).
- Functions for creating LaTeX representations of summary statistics and
  visualizations can be found in the Hmisc, reporttools, and r2lh packages.
  Hmisc also has functions for marking up data frames, and the quantreg and
  memisc packages can mark up matrices.
- Cross-tabulations can be converted to LaTeX code using the Hmisc and memisc
  packages.
- The xtable and rms packages provide LaTeX representations of some common
  models (e.g., the Cox proportional hazards model). For example, processing an
  aov object with the xtable function will generate LaTeX markup of the ANOVA
  table. Similarly, methods exist for glm, prcomp, ts and other types of
  objects.
- The quantreg package contains LaTeX markup functions for quantile regression
  fit summaries.
- Standardized exams can be created using the exams package.
- The odfWeave.survey and TeachingSampling packages provide ODF and LaTeX
  functions, respectively, for survey sampling objects.
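Two of the conversions listed above can be sketched briefly. The first part uses only base R (toLatex and toBibtex from utils); the second part assumes the contributed xtable package is installed and is skipped otherwise. The PlantGrowth data set and the model formula are chosen purely for illustration.

```r
# Base R (utils): generic converters to LaTeX and BibTeX.
session_tex <- toLatex(sessionInfo())  # LaTeX itemize list describing the session
r_bib <- toBibtex(citation())          # BibTeX entry for citing R itself

print(session_tex)
print(r_bib)

# xtable is a contributed package, so run this part only if it is installed.
if (requireNamespace("xtable", quietly = TRUE)) {
  # One-way ANOVA on the built-in PlantGrowth data (an arbitrary example).
  fit <- aov(weight ~ group, data = PlantGrowth)

  # xtable() has a method for aov objects; printing the result emits the
  # LaTeX tabular markup for the ANOVA table.
  anova_tex <- xtable::xtable(fit)
  print(anova_tex)
}
```

The printed markup can be pasted into a LaTeX document, or generated in place from a Sweave chunk that uses the `results=tex` option.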