Data Analysis and Graphics Using R - An Example-Based Approach

John Maindonald and John Braun. 3rd edn, Cambridge University Press, May 2010.

Additional Notes

Be aware that the notes on linear computations, on generalized linear models, and on classification are technically demanding.
Overheads -- Multilevel models: Overheads for a talk on multilevel models. Note that the later slides canvass fairly technical issues.
Chapter 10, using lme(): R code and output, with brief commentary, for using lme() (package nlme) in place of lmer() (package lme4).
Computations - Linear and GAM models: These notes describe the computational methods used by lm() and by the gam() function in the mgcv package. They show how to make direct use of R's suite of functions (qr() and friends) for working with the QR matrix decomposition, and they give technical detail that underpins the calculations of Chapters 6-9. [Lund]
Generalized Linear Models: Brief notes on the theory of generalized linear models, and on the comparison with linear models. [Lund]
Regression in practice: Issues for the practical use of regression methods, supplementing the discussion in the text. [Lund]
Smoothing terms in GAM models: Automated choice of smoothing parameter for smoothing terms in models with independent normal errors, in logistic regression models, and in Poisson regression models. This expands the very brief discussion at the end of Chapter 7.
Classification: Notes on the theory that underpins the functions lda() and qda() in R's MASS package. For the case of two outcome classes, comparisons are made with logistic regression using glm(). Additionally, there are comparisons with randomForest(), from the randomForest package. In a range of methodologies from parametric to nonparametric, lda() and qda() are at the extreme parametric end, while randomForest() is about as nonparametric as is possible.
Ordination -- Low-Dimensional Representation: These notes describe distance measures, representation in a Euclidean space (metric scaling), and multidimensional scaling.
Analysis of microarray data: The package DAAGbio has a vignette, with associated files and datasets, that demonstrates the analysis of the two-channel microarray data described in Section 4.4.1.
Spatial methods in R: Notes and slides that may be helpful in making a start on the use of R packages for spatial methods, a topic that is not covered in the book. Code is provided for the graphs that are included in the overheads.
R talks to LaTeX: Use knitr or Sweave to process a document that includes R code within Sweave-type markup, to generate a LaTeX document that may include any or all of R code, output, tables and graphs. The more flexible knitr markup may now be preferred. For knitr, producing an HTML document from Markdown is an alternative.

[Lund] These notes were developed for use with a PhD course, conducted under the STINT program, in the Centre for Mathematical Statistics at the University of Lund (Sweden) in May-June 2007.
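
As a small illustration of the kind of calculation that the notes on linear computations discuss, the following sketch uses base R's qr() family to obtain least squares coefficients and checks them against lm(). It is not taken from the notes themselves: the simulated data and the names x, y, X, qrX, b_qr, b_back and b_lm are assumptions made purely for this example.

```r
# Illustrative sketch only: simulated data and hypothetical variable names.
set.seed(1)
n <- 50
x <- runif(n)
y <- 2 + 3 * x + rnorm(n, sd = 0.5)

# Model matrix with an intercept column
X <- cbind(Intercept = 1, x = x)

# QR decomposition of the model matrix
qrX <- qr(X)

# Least squares coefficients directly from the decomposition
b_qr <- qr.coef(qrX, y)

# The same coefficients by solving R b = Q'y with back-substitution
R <- qr.R(qrX)                              # upper triangular factor
Qty <- qr.qty(qrX, y)[seq_len(ncol(X))]     # first p elements of Q'y
b_back <- backsolve(R, Qty)

# Reference fit using lm()
b_lm <- coef(lm(y ~ x))

# Both routes should agree with lm() up to numerical tolerance
all.equal(unname(b_qr), unname(b_lm))
all.equal(b_back, unname(b_lm))
```

Working from the triangular factor rather than forming X'X and inverting it is the usual reason for preferring the QR route, since it is less sensitive to ill-conditioning in the model matrix.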