Structural Equation Modeling with the sem package for R, Writing R Packages, Teaching Statistics Using R and the R Commander
John Fox
(McMaster University)
IQS Barcelona
January 2016
Short URL: tinyurl.com/IQS-R-Course
Instructions for Installing R, RStudio, and R Packages
The R statistical programming language and computing environment has become the de facto standard for writing statistical software among statisticians and has made substantial inroads in the social sciences -- it is now possibly the most widely used statistical software in the world. R is a free, open-source implementation of the S language, and is available for Windows, Mac OS X, and Unix/Linux systems.
The basic R system is developed and maintained by the R Core group, comprising 20 members, many of them eminent in the field of statistical computing. The R Project for Statistical Computing is a project of the R Foundation, whose membership includes the R Core group and several other individuals, and is also associated with the Free Software Foundation.
I'll cover three R-related topics that are essentially independent of each other: an introduction to structural-equation modeling that uses the sem package for R; an introduction to writing R packages; and a discussion of using R and the R Commander graphical user interface for teaching statistics courses.
Resources
Dates | Topics | Related Readings | Materials |
Jan. 19-21 | structural-equation models with the sem package | on-line appendix to Fox and Weisberg, An R Companion to Applied Regression, Second Edition, on structural-equation models | script, notes, problems; answers; data files: Lincoln.R, Rindfuss.R, Wheaton.R |
Jan. 22 (IQS) & Jan. 26 (Univ. de Barcelona) | writing and building R packages | Writing R Extensions manual | script, notes, matrixDemos.R, matrixDemos_1.0-5.tar.gz, matrixDemos_1.0-5.zip |
Jan. 25 | teaching with R and the R Commander | | notes, data file: States.txt, sample problem |
Selected (English) Bibliography
Publishers of statistical texts have been producing a steady stream of books on R. Of particular note is Springer's Use R! series of brief paperbacks on various R-related topics. Similarly, Chapman and Hall/CRC Press has The R Series.
Also see the package listing on CRAN and the various CRAN "task views."
R Manuals
R is distributed with a set of manuals, which are also available at the CRAN web site.
A manual for S-PLUS Trellis Graphics (also useful for the lattice package in R) is also available on the web.
A great deal of information about using the RStudio interactive development environment is available on the RStudio website.
Programming in R (and S)
R. A. Becker, J. M. Chambers, and A. R. Wilks, The New S Language: A Programming Environment for Data Analysis and Statistics.
J. M. Chambers, Programming with Data: A Guide to the S Language.
J. M. Chambers, Software for Data Analysis: Programming with R. New York: Springer, 2008. Chambers’s newest book ranges quite widely, and emphasizes a deep understanding of the R language, along with object-oriented programming, and links between R and other software. Some topics are unusual, such as processing text data in R.
J. M. Chambers and T. J. Hastie, eds., Statistical Models in
D. Eddelbuettel, Seamless R and C++ Integration with Rcpp. New York: Springer, 2013. Judicious use of compiled code written in C, C++, or Fortran can substantially improve the efficiency of some R programs. The Rcpp package and its cousins simplify the process of integrating C++ code in R. I recommend this book to those who already know C++.
R. Gentleman, R Programming for Bioinformatics, Boca Raton: Chapman and Hall, 2009. A thorough, though at points relatively difficult, treatment of programming in R, by one of the original co-developers of R and a founder of the related Bioconductor Project (which develops computing tools for the analysis of genomic data). Don’t let the title fool you: Most of the book is of general interest to R programmers.
G. Grolemund, Hands-On Programming with R, Sebastopol CA: O'Reilly, 2014. A readable, easy-to-follow, basic introduction to R programming, which also introduces RStudio.
R. Ihaka and R. Gentleman, “R: A language for data analysis and graphics.” Journal of Computational and Graphical Statistics, 5:299-314, 1996. The original published description of the R project, now quite out of date but still worth looking at.
W. N. Venables and B. D. Ripley, S Programming.
H. Wickham, Advanced R. Boca Raton FL: Chapman and Hall/CRC, 2015. Hadley Wickham has contributed a number of widely used R packages (such as ggplot2 for graphics and plyr for data manipulation) and is associated with RStudio. As the name implies, you may (and should!) be interested in reading this book after you’ve learned the basics of R programming. A related volume by Wickham, R Packages, Sebastopol CA: O'Reilly, 2015, is (as its name implies) about how to write R packages. Wickham's approach to R programming is sometimes idiosyncratic but always carefully considered and interesting. The websites for the books provide access to the text. Hadley Wickham is a member of the R Foundation.
Y. Xie, Dynamic Documents with R and knitr. Boca Raton FL: Chapman and Hall/CRC, 2013. Yihui Xie describes the use of his knitr package for creating LaTeX documents with embedded executable R code. This package also provides the basis for R Markdown in RStudio.
Other R Sources (Some Free)
See the publications list on the R web site. The R Journal, the journal of the R Project for Statistical Computing, and its predecessor R News, are also good sources of information, as is the Journal of Statistical Software, an on-line American Statistical Association journal dominated by coverage of R packages.
A Source in Spanish
Maribel Peró Cebollero, David Leiva Ureña, Joan Guàrdia Olmos, Antonio Solanas Pérez, Estadística aplicada a las ciencias sociales mediante R y R-Commander, Garceta grupo editorial (2012).
Some Readings on Structural Equation Models
K. A. Bollen, "Latent Variables in Psychology and the Social Sciences", Annual Review of Psychology, 2002, 53: 605-634. Provides a good brief overview of latent-variable models.
K. A. Bollen, Structural Equations with Latent Variables (Wiley, 1989). Although it is now a bit dated, it's still my favourite book-length treatment of SEMs.
J. Fox, "Linear Structural-Equation Models" (Chapter 4 of Linear Statistical Models and Related Methods, Wiley, 1984). This is the basis of much of my lecture-slide material.