Diving into Data: Installing R & RStudio on your PC

Introduction

This is the first post in a series on getting started with data, primarily data analysis and visualization. The series is written for people who have little to no experience working with data and are looking for a place to start.

I personally prefer to use R for the bulk of my data work. You’ll meet people who tell you to start with Excel, and you’ll meet people who tell you to start with Python. You can do that! There are more than enough tutorials out there to get you started with whatever tool you’re interested in learning.

When you’re starting out with data, there’s no right or wrong answer as to whether you should use R, Python, or Excel. What matters is getting started with something, anything, and figuring out what you like and don’t like. You can always switch or pick up additional tools later; for now, focus on developing the foundations of working with data.

Installing R on your PC

  • Navigate to the “Comprehensive R Archive Network” (CRAN): https://cran.r-project.org/
  • Click on “Download R for Windows.”
  • Choose “base”, which refers to the base installation of R.
  • Click on “Download R 3.x.x for Windows.” The x’s in 3.x.x stand for the version of R that you’re downloading, and will change over time.
  • Navigate to the download (Ctrl + J in Google Chrome), click on it, and follow your computer’s instructions to install R.
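
Once the installer finishes, a quick way to confirm that R is working is to open it and ask which version is running. This is only a small sketch; the variable name installed_version is made up for illustration, and the version you see will depend on what you downloaded.

```r
# Print the version of R that is currently running.
# R.version.string is built into R, so no extra packages are needed.
print(R.version.string)

# The major and minor version numbers are also available separately.
# installed_version is a hypothetical name chosen for this example.
installed_version <- paste(R.version$major, R.version$minor, sep = ".")
print(installed_version)
```

If both lines print a version number, the installation worked.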

Why do I need RStudio?

OK, you’ve installed R. Why do you need RStudio? Technically you can stop right here and work with R directly, but there are a few reasons that RStudio is helpful.

RStudio is an IDE, or Integrated Development Environment: a user interface that generally makes working with a particular programming language a lot easier. RStudio, for example, puts a script editor, the console, your plots, and a list of your stored variables in a single window. There are IDEs for most programming languages, and as you get more comfortable with programming you may find something you like better than RStudio.

Installing RStudio on your PC

  • If you haven’t installed R yet, stop! Go back and install R.
  • Navigate to RStudio: https://www.rstudio.com/
  • Click on the option to “Download RStudio” — if you can’t find it, you can use this direct link: https://www.rstudio.com/products/rstudio/download/
  • You’ve got a lot of options to choose from here, but all you need is the free version. Scroll to the bottom and either click on the green “Download” link, or choose the RStudio 1.0.153 — Windows Vista/7/8/10 link from the Installers section.
  • Navigate to the download (Ctrl + J in Google Chrome), click on it, and follow your computer’s instructions to install RStudio.

Getting comfortable with RStudio

We’re not going to get into anything too fancy here, but you’ve done the work of getting R and RStudio onto your PC, so why not try a few things?

  • Set up your color scheme by going to Tools > Global Options > Appearance.
  • Set your code up to “soft-wrap.” Soft-wrapping displays a long line of code across several lines on screen without inserting an actual line break (which can interrupt your code). Go to Tools > Global Options > Code and select “Soft-wrap R source files.”
  • You can run code directly in the Console window by typing it in and hitting Enter. Try each of the following:
> "hello, world!"

Notice that you don’t have to type the “> ” yourself; it’s the prompt that the console displays. Use plain straight quotes around the text, since R does not recognize curly quotes.

> hello, world!

What happens when you omit the quotes around the text?
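
If you’d like to see the difference side by side, here is a rough sketch. It uses tryCatch(), which you don’t need to understand yet; it’s only here so the example keeps running after the error. The names greeting and result are made up for illustration.

```r
# Quoted text is a character string, so R simply hands it back.
greeting <- "hello, world!"
print(greeting)

# Without quotes, R tries to read the text as code and cannot parse it.
# parse() reads the text the way the console would, and tryCatch()
# captures the resulting error instead of stopping.
result <- tryCatch(
  parse(text = "hello, world!"),
  error = function(e) "R could not parse the unquoted text"
)
print(result)
```

In the console you would see an error message instead of the friendly text, but the cause is the same: without quotes, R treats what you typed as code rather than as text.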

> 5 + 2

The console functions as a calculator.
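
Here are a few more calculations you could try, one line at a time. The numbers are arbitrary; the operators are all part of base R.

```r
5 + 2         # addition: 7
5 - 2         # subtraction: 3
5 * 2         # multiplication: 10
5 / 2         # division: 2.5
5 ^ 2         # exponent: 25
5 %% 2        # remainder after division: 1
(5 + 2) * 3   # parentheses control the order of operations: 21
```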

> x <- 5
> x + 2

The “<-” is what’s called an assignment operator. It stores the information to the right of the “<-” in the variable on the left.
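
As a small sketch of how assignment behaves: storing a value prints nothing, and you only see the value when you ask for it. The variable y is an extra name added here for illustration.

```r
# Assignment stores a value but prints nothing.
x <- 5

# Typing the variable name (or calling print()) displays the stored value.
print(x)   # [1] 5

# A stored value can be used to build new variables.
y <- x + 2
print(y)   # [1] 7
```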

> x <- 10
> x + 2

R allows you to overwrite variables with new information.

> x <- "hello, world!"
> x

You can also store text as a variable!
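
One way to see what kind of information a variable currently holds is the built-in class() function. The sketch below reuses x from the examples above; number_class and text_class are hypothetical names chosen for this example.

```r
# Store a number, then ask R what kind of value x holds.
x <- 10
number_class <- class(x)
print(number_class)   # [1] "numeric"

# Overwrite x with text; the same variable now holds a different kind of value.
x <- "hello, world!"
text_class <- class(x)
print(text_class)     # [1] "character"
```

R doesn’t ask you to declare in advance what a variable will hold, which is why the same x can store a number one moment and text the next.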

Next Steps

In the next part of the series we’ll be pulling in a small data set and doing what’s called cleaning and exploratory data analysis (EDA), including a graph or two. It’s rare that you’ll be working with a perfectly clean data set, so we’ll focus on building your cleaning skills early and often.

Feedback

Your feedback will only make this series more effective for learning, so let me know what would be helpful to you!