I was determined to *exactly* reproduce a composite box-and-whisker plot that I had seen in the book *An Introduction to Statistical Learning*. The data are from a set of 1,250 observations called **Smarket**, found in `ISLR`, the R package that goes with the book. It’s a pretty simple plot:

First, after loading the package with `library(ISLR)`, one may want to take a cursory look at the data frame, which has 9 variable columns, using `View(head(Smarket))`.

I was able to successfully put together the plot with this code:

    # Labels and colours shared by all three panels
    ylabel <- "Percentage change in S&P"
    xlabel <- "Today's Direction"
    valnames <- c("Down", "Up")
    hue <- c("blue", "red")
    # Three panels side by side
    layout(matrix(c(1, 2, 3), nrow = 1, ncol = 3, byrow = TRUE))
    boxplot(Lag1 ~ Direction, data = Smarket,
            ylab = ylabel, xlab = xlabel,
            names = valnames,
            col = hue,
            main = "Yesterday")
    boxplot(Lag2 ~ Direction, data = Smarket,
            ylab = ylabel, xlab = xlabel,
            names = valnames,
            col = hue,
            main = "Two Days Previous")
    boxplot(Lag3 ~ Direction, data = Smarket,
            ylab = ylabel, xlab = xlabel,
            names = valnames,
            col = hue,
            main = "Three Days Previous")
    # Back to a single-panel layout
    layout(1)

I realised, however, that I was seriously violating the **DRY principle**, so I tried to come up with a function instead. I struggled a bit with this because I didn’t know how to supply a character vector argument and place it into the “*formula*” bit that is required as the first argument of the version of the `boxplot()` function I had used, i.e. the `y ~ x` part.

After scouring the documentation a bit (I tried out `as.name()` and messed around with `sQuote()` and `dQuote()`, all to no avail), I discovered the documentation behind `?formula`, which also covers `as.formula()`, and BAMMM!, I got it.
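For anyone stuck at the same point, here is a minimal sketch of the idea on its own, with no plotting involved. A formula is not evaluated the way an ordinary function argument is, so a helper call typed inside `y ~ x` is kept as written rather than replaced by the column name. Building the whole formula from a string avoids that. The `reformulate()` call is an alternative from the `stats` package that I did not use in this post; it is shown only for comparison.

```r
# The column name arrives as a character string
column <- "Lag1"

# Option 1: paste the pieces into one string, then convert it to a formula
f1 <- as.formula(paste(column, "~ Direction"))

# Option 2 (not used in this post): reformulate() takes the right-hand
# term labels and the response and assembles the formula for us
f2 <- reformulate("Direction", response = column)

print(f1)
print(class(f1))
print(all.vars(f1))  # the variable names the formula refers to
```

Either object can be passed as the first argument of `boxplot()` together with `data = Smarket`.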

This is how I put it to get the exact same plot:

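    # Draw one box plot per column in `column`, side by side
    box.stock <- function(dat, column, tt) {
      op <- par(mfrow = c(1, 3))
      on.exit(par(op))  # restore the previous layout when the function exits
      for (i in seq_along(column)) {
        # Build the formula (e.g. Lag1 ~ Direction) from the column name
        boxplot(as.formula(paste(column[i], "~ Direction")),
                data = dat,
                ylab = "Percentage change in S&P",
                xlab = "Today's Direction",
                names = c("Down", "Up"),
                col = c("blue", "red"),
                main = tt[i])
      }
    }
    choice <- c("Lag1", "Lag2", "Lag3")
    title <- c("Yesterday", "Two Days Previous", "Three Days Previous")
    box.stock(Smarket, choice, title)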

Now how’s that?

This experience helped me to better appreciate the value of using functions in general. Of course, with a little tweaking, the above code could be used to draw many more plots, as many as the available computing resources allow.
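As a rough illustration of that kind of tweaking, here is one possible sketch that accepts any number of columns and works out the grid of panels from their count. The function name `box_panels` and its arguments (`group`, `n_col`) are made up for this example and are not part of any package. Extra arguments such as `col` or `xlab` are simply passed through to `boxplot()`.

```r
# Hypothetical generalisation: one box plot per column, arranged in a grid
box_panels <- function(dat, columns, titles = columns,
                       group = "Direction", n_col = 3, ...) {
  stopifnot(length(columns) == length(titles),
            all(c(columns, group) %in% names(dat)))
  n_row <- ceiling(length(columns) / n_col)
  op <- par(mfrow = c(n_row, n_col))
  on.exit(par(op))  # restore the previous layout when the function exits
  for (i in seq_along(columns)) {
    # Build e.g. Lag1 ~ Direction from the two column names
    boxplot(reformulate(group, response = columns[i]),
            data = dat, main = titles[i], ...)
  }
  invisible(length(columns))  # number of panels drawn
}

# Usage, assuming the ISLR package is installed: all five lag columns
if (requireNamespace("ISLR", quietly = TRUE)) {
  box_panels(ISLR::Smarket, paste0("Lag", 1:5),
             titles = paste("Lag", 1:5),
             xlab = "Today's Direction",
             ylab = "Percentage change in S&P",
             col = c("blue", "red"))
}
```

With five columns and `n_col = 3`, this lays the panels out in a 2 × 3 grid and leaves the last cell empty.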
