# 5 Shortcuts in R You Need To Know (Part 1)

Part 1: magrittr, ifelse(), ggthemes, mapply() and anonymous functions.

#### 1. Magrittr Operator

The syntax in R has lots of nested function calls and parentheses. This can lead to readability issues.

Let’s say you have data on people’s hair and eye colors. You want to 1) create a two-way contingency table, 2) express the frequencies as proportions, and 3) round the numbers to one decimal place.

In vanilla R, you need to do the following:

    round(prop.table(table(eye_color, hair_color)), 1)

The pipe operator *%>%* from a library called *magrittr* (also re-exported by *dplyr*) allows you to do the same task by chaining operations **in the order they are performed**:

    table(eye_color, hair_color) %>% prop.table() %>% round(1)

The code just got much more readable. And you’ll feel the difference while coding. No need to match parentheses. Instead of thinking inside out, think sequentially.
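To convince yourself that both forms give the same result, here’s a small self-contained sketch. The *eye_color* and *hair_color* vectors are made-up data for illustration only, and the sketch assumes the *magrittr* package is installed.

```r
library(magrittr)

# Made-up data for illustration (not a real dataset)
eye_color  <- c("brown", "blue", "brown", "green", "blue", "brown")
hair_color <- c("black", "blond", "brown", "red", "blond", "black")

# Nested version: read from the inside out
nested <- round(prop.table(table(eye_color, hair_color)), 1)

# Piped version: read from left to right
piped <- table(eye_color, hair_color) %>% prop.table() %>% round(1)

# Compare the two results
identical(nested, piped)
```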

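There are nuances when you pipe something into a function with multiple arguments: by default, the left-hand side becomes the first argument, and you can use the *.* placeholder to pass it somewhere else. Check out the manual.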
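Here’s a minimal sketch of the most common nuance, using made-up numbers and assuming *magrittr* is installed. It shows where the piped value lands by default and how the *.* placeholder moves it to another argument.

```r
library(magrittr)

x <- c(3.14159, 2.71828)

# By default the left-hand side becomes the FIRST argument:
# x %>% round(1) is the same as round(x, 1)
first_arg <- x %>% round(1)

# With the `.` placeholder the left-hand side goes wherever the dot is:
# 2 %>% round(x, .) is the same as round(x, 2)
second_arg <- 2 %>% round(x, .)
```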

#### 2. ifelse( )

Let’s say you want to give free tickets to anyone under 13. You can do the following:

    free <- character(length(age)) # pre-allocate the result vector
    for (i in seq_along(age)) {
      if (age[i] < 13) {
        free[i] <- "Yes"
      } else {
        free[i] <- "No"
      }
    }

The R-ight way to do it is:

    free <- ifelse(age < 13, "Yes", "No")

*ifelse* is vectorized: it tests every element of *age* at once and returns a new vector of the same length, all in one line of code.
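Here’s a quick sketch with made-up ages. It also shows one thing worth knowing: a missing age stays missing in the result instead of being silently labeled.

```r
# Made-up ages for illustration; the last one is missing
age <- c(5, 12, 13, 40, NA)

# One call tests every element and builds the whole result vector
free <- ifelse(age < 13, "Yes", "No")
free
# "Yes" "Yes" "No" "No" NA
```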

#### 3. ggthemes

*ggplot* allows you to create nice-looking, publication-quality graphics. But you may need to write many lines of code.

The *theme* function in *ggplot* is used to modify the overall look of a graphic. An example might be:

    ggplot() + ... +
      theme(legend.position = c(1, 1), legend.justification = c(1, 1))

A library called *ggthemes* allows you to skip the coding and use templates inspired by the best graphic producers, like *The Wall Street Journal*, *FiveThirtyEight*, etc.

Here’s a default *ggplot*:

    ggplot() + geom_point(aes(hp, wt), data = mtcars)

Let’s apply a *Wall Street Journal* template:

    ... + theme_wsj()

You can try a *FiveThirtyEight* template:

    ... + theme_fivethirtyeight()

**Here’s a list of available templates.**

Note that while this package solves the issue of creating a template, you still have to customize other elements like labels, scale, color, type of chart, etc.
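Putting the pieces together, a minimal sketch might look like this. It assumes the *ggplot2* and *ggthemes* packages are installed, and the title text is made up for illustration.

```r
library(ggplot2)
library(ggthemes)  # provides theme_wsj() and theme_fivethirtyeight()

# Build the base plot once so different templates can be tried on it
p <- ggplot(mtcars, aes(hp, wt)) +
  geom_point()

# The template sets the overall look; labels are still up to you
p_wsj <- p +
  theme_wsj() +
  labs(title = "Weight vs. horsepower")  # hypothetical title

p_538 <- p +
  theme_fivethirtyeight() +
  labs(title = "Weight vs. horsepower")  # hypothetical title
```

Print *p_wsj* or *p_538* at the console to draw the plot.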

#### 4. mapply( )

Let’s say you created a function to calculate the power of a hypothesis test under a certain condition. The function is called *estimatePower* and takes two arguments: a real mean and a sample size.

What you really want to do is to see how powerful the hypothesis test is under *multiple* real means and *multiple* sample sizes:

    real_means <- c(100, 150, 200, 250, 300, 500)
    sample_sizes <- c(10, 30, 60, 100, 200, 500, 2000)

To do so, you want to call the function many times.

You can do the following:

    results <- numeric(6 * 7) # 42 possible conditions
    iter <- 1
    for (i in 1:6) {
      for (j in 1:7) {
        results[iter] <- estimatePower(real_means[i], sample_sizes[j])
        iter <- iter + 1
      }
    }

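A more concise way is to use *mapply*. Keep in mind that *mapply* walks through its arguments in parallel (first with first, second with second, and so on), so passing the two vectors directly would not cover all 42 conditions. Build every combination first with *expand.grid*:

    conditions <- expand.grid(sample_size = sample_sizes, real_mean = real_means)
    results <- mapply(estimatePower, conditions$real_mean, conditions$sample_size)

No need to initialize a vector to store the results and no need to write a nested for loop.

**It works like magic when you need to call a function using many combinations of conditions.**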
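Since *estimatePower* isn’t shown here, the sketch below uses a hypothetical stand-in (the power of a one-sided z-test with an assumed null mean of 100 and standard deviation of 200) just to make the calls runnable. It shows that *mapply* pairs its arguments element by element, so covering all 42 conditions means building the combinations first, for example with *expand.grid*. If you’d rather have the results as a matrix, *outer* is another option.

```r
# Hypothetical stand-in for estimatePower(): power of a one-sided z-test.
# null_mean, sd and alpha are assumed values, not taken from the article.
estimatePower <- function(real_mean, sample_size,
                          null_mean = 100, sd = 200, alpha = 0.05) {
  z <- (real_mean - null_mean) / (sd / sqrt(sample_size))
  pnorm(z - qnorm(1 - alpha))
}

real_means   <- c(100, 150, 200, 250, 300, 500)
sample_sizes <- c(10, 30, 60, 100, 200, 500, 2000)

# mapply() pairs arguments element by element: f(a[1], b[1]), f(a[2], b[2])
paired <- mapply(estimatePower, c(150, 200), c(30, 60))
length(paired)  # 2 results, one per pair

# Build all 6 x 7 combinations; sample_size varies fastest, like the inner loop
conditions <- expand.grid(sample_size = sample_sizes, real_mean = real_means)
results <- mapply(estimatePower, conditions$real_mean, conditions$sample_size)
length(results)  # 42

# Alternative: a matrix with one row per mean and one column per sample size
power_matrix <- outer(real_means, sample_sizes, Vectorize(estimatePower))
dim(power_matrix)  # 6 7
```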

#### 5. Anonymous Function

The family of *apply* functions is one of the best shortcuts in R, allowing you to perform an operation across multiple rows or columns of data at once. A common usage is to compute a summary statistic for each column variable:

    apply(data, 2, mean)

The following is equivalent:

    apply(data, 2, FUN = mean)

Here’s the takeaway: *you can define any custom function* right after *FUN*.

Let’s say I want to know which column variables show little-to-no variance — perhaps to select the variables that are useful for a later task. I can do the following:

    apply(data, 2, FUN = function(x) {
      ifelse(var(x) < 0.5,
             "Low-variance",
             "Non-low-variance")
    })

In the summary output, all the variables that have variance of less than 0.5 are labeled “Low-variance” and all others are labeled “Non-low-variance”.
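To try this out, here’s a sketch on a small made-up data frame. The column names and values are invented for illustration.

```r
# Made-up data: column b varies a lot, columns a and c barely vary
data <- data.frame(
  a = c(1.0, 1.1, 0.9, 1.0),
  b = c(1, 5, 10, 20),
  c = c(2, 2, 2, 2)
)

# The anonymous function receives one column at a time as x
flag_variance <- apply(data, 2, FUN = function(x) {
  ifelse(var(x) < 0.5, "Low-variance", "Non-low-variance")
})
flag_variance
# a: "Low-variance", b: "Non-low-variance", c: "Low-variance"
```

The result is a named character vector with one label per column, so you can pick out the useful variables with something like *names(flag_variance)[flag_variance == "Non-low-variance"]*.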

Note that the function itself is unnamed and cannot be reused, hence *anonymous*. In general, anonymous functions are used one-time only in a particular place.

Another example: let’s say you want to produce a scatterplot and relabel the y-axis breaks as “Low” if they correspond to a number less than or equal to 3, and as “High” otherwise:

    ggplot() +
      geom_point(aes(hp, wt), data = mtcars) +
      theme_wsj() +
      # wt is numeric, so the y axis needs a continuous scale
      scale_y_continuous(breaks = 0:6,
                         labels = function(x) {
                           ifelse(x <= 3, "Low", "High")
                         })

Here, the anonymous function relabels the *breaks* along the y axis that correspond to numerical values.

You might be confused as to what *x* in the function declaration corresponds to.

In the *apply* example, it corresponds to *each column* of data since we specified the second argument (*MARGIN*) to be 2.

In this example, *x* corresponds to *breaks*, since the function given to *labels* receives the breaks as its input.
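One way to see what *x* is here: give the anonymous function a name (the hypothetical *relabel* below) and call it directly on the break values, which is roughly what the scale does with whatever you pass to *labels*.

```r
# Hypothetical name for the anonymous function used in the plot
relabel <- function(x) {
  ifelse(x <= 3, "Low", "High")
}

# The scale hands the breaks to the function as x
relabel(0:6)
# "Low" "Low" "Low" "Low" "High" "High" "High"
```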

**Being able to create a function on-the-fly is useful for customizing many base R functions and extending the features of libraries.**

Learn more here.