Data Analysis with R Programming: Data Types in R Programming

Bilal Nuhu
4 min read · Dec 19, 2022

R is a programming language that is often used for statistical analysis and data manipulation. One of the important aspects of working with data in R is understanding the different data types that are available. In this article, we will discuss the different data types in R and provide examples of how to work with them.

Atomic vectors (vectors)

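Atomic vectors are the most basic data structure in R. All elements of an atomic vector must be of the same type, such as numeric, character, or logical. If you mix types, R silently converts every element to the most flexible type present.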

Here are some examples of atomic vectors in R:

# Numeric vector
x <- c(1, 2, 3, 4)

# Character vector
y <- c("apple", "banana", "cherry")

# Logical vector
z <- c(TRUE, FALSE, TRUE, FALSE)

You can access individual elements of an atomic vector using indexing. For example, to access the second element of the x vector, you would use x[2]. You can also access a range of elements using the : operator. For example, x[2:4] would return the second, third, and fourth elements of the x vector.
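
As a minimal sketch of the indexing described above, the snippet below reuses the x vector from the earlier example. The mixed vector and the typeof() and class() checks are additions for illustration only; they show what happens when elements of different types are combined.

```r
# Numeric vector from the example above
x <- c(1, 2, 3, 4)

# R indexes from 1, so this is the second element
second <- x[2]
print(second)        # [1] 2

# The : operator builds the index sequence 2, 3, 4
middle <- x[2:4]
print(middle)        # [1] 2 3 4

# typeof() reports the underlying storage type of the vector
x_type <- typeof(x)
print(x_type)        # [1] "double"

# Mixing types forces every element to a common type (here: character)
mixed <- c(1, "a", TRUE)
mixed_class <- class(mixed)
print(mixed_class)   # [1] "character"
```

Note that R counts from 1, not 0, which often surprises people coming from other languages.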

Lists

Lists are another common data type in R. They can contain elements of different types, including other lists. Lists are created using the list() function.

Here is an example of a list in R:

my_list <- list(1, "apple", c(2, 3, 4), TRUE)

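You can access elements of a list using indexing, but the brackets behave slightly differently than with atomic vectors. Double square brackets return the element itself, so my_list[[2]] returns the character value "apple". Single square brackets, as in my_list[2], return a new list that contains that element.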
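
Here is a short sketch of list indexing using the my_list example from above. The variable names fruit, sub, and val are made up for this illustration.

```r
my_list <- list(1, "apple", c(2, 3, 4), TRUE)

# Double brackets extract the element itself
fruit <- my_list[[2]]
print(class(fruit))   # [1] "character"

# Single brackets return a list that contains the element
sub <- my_list[2]
print(class(sub))     # [1] "list"

# Indexing can be chained: second value of the third element
val <- my_list[[3]][2]
print(val)            # [1] 3
```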

Factors

Factors are used to represent categorical data in R. They are created using the factor() function.

Here is an example of creating a factor in R:

x <- c("apple", "banana", "cherry", "apple", "banana")
y <- factor(x)

In this example, the x vector contains character values representing fruit names. The factor() function converts these values into a factor, which is a special type of vector that can only contain a fixed set of values, called levels.

You can access the levels of a factor using the levels() function. For example, levels(y) would return a character vector containing the levels "apple", "banana", and "cherry".
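
The following sketch shows how a factor can be inspected. It uses the same fruit names as above but with different variable names (fruits, fruit_factor, codes, counts), which are chosen here for illustration. The as.integer() and table() calls are additions that reveal how the factor stores its values internally and how often each level occurs.

```r
fruits <- c("apple", "banana", "cherry", "apple", "banana")
fruit_factor <- factor(fruits)

# Levels are the unique values, sorted alphabetically by default
print(levels(fruit_factor))   # [1] "apple"  "banana" "cherry"

# Internally, each value is stored as an integer code pointing at a level
codes <- as.integer(fruit_factor)
print(codes)                  # [1] 1 2 3 1 2

# table() counts how often each level occurs
counts <- table(fruit_factor)
print(counts)
```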

Data frames

Data frames are a common data type in R that is used to store tabular data. They are similar to lists, but each element must have the same length. Data frames are created using the data.frame() function.

Here is an example of creating a data frame in R:

name <- c("Alice", "Bob", "Charlie")
age <- c(25, 30, 35)
gender <- c("Female", "Male", "Male")

df <- data.frame(name, age, gender)

In this example, the df data frame has three columns: name, age, and gender. You can access individual columns of a data frame using the $ operator. For example, df$age would return a vector containing the age values for each row in the data frame.
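
As a rough sketch of working with the df data frame from above, the snippet below pulls out a column, checks the dimensions, and filters rows. The age threshold of 28 and the variable names ages, dims, and older are arbitrary choices for this example.

```r
name <- c("Alice", "Bob", "Charlie")
age <- c(25, 30, 35)
gender <- c("Female", "Male", "Male")
df <- data.frame(name, age, gender)

# $ extracts a single column as a vector
ages <- df$age
print(ages)    # [1] 25 30 35

# dim() returns the number of rows and columns
dims <- dim(df)
print(dims)    # [1] 3 3

# A logical condition before the comma selects matching rows
older <- df[df$age > 28, ]
print(older)

# str() gives a compact overview of the column types
str(df)
```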

Matrices

Matrices are another data type in R that is used to store tabular data. Unlike data frames, all elements of a matrix must have the same data type, and a matrix always has exactly two dimensions: rows and columns. Matrices are created using the matrix() function.

Here is an example of creating a matrix in R:

x <- c(1, 2, 3, 4, 5, 6)
y <- matrix(x, nrow = 2, ncol = 3)

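In this example, the y matrix has two rows and three columns, and it contains the values 1, 2, 3, 4, 5, and 6. By default, R fills a matrix column by column, so the first column holds 1 and 2, the second holds 3 and 4, and the third holds 5 and 6. You can access elements of a matrix using indexing with a row and a column position. For example, y[1,2] would return the element in the first row and second column of the matrix, which is 3.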
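
The fill order is easy to get wrong, so here is a small sketch comparing the default column-wise fill with the byrow = TRUE option of matrix(). The names by_col and by_row are made up for this example.

```r
x <- c(1, 2, 3, 4, 5, 6)

# Default: values are filled column by column
by_col <- matrix(x, nrow = 2, ncol = 3)
print(by_col[1, 2])   # [1] 3

# byrow = TRUE fills row by row instead
by_row <- matrix(x, nrow = 2, ncol = 3, byrow = TRUE)
print(by_row[1, 2])   # [1] 2

# Leaving the row index empty selects a whole column
second_col <- by_col[, 2]
print(second_col)     # [1] 3 4

print(dim(by_col))    # [1] 2 3
```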

Arrays

Arrays are similar to matrices, but they can have more than two dimensions. They are created using the array() function.

Here is an example of creating an array in R:

x <- 1:24
y <- array(x, dim = c(2, 3, 4))

In this example, the y array has three dimensions of size 2, 3, and 4. You can think of it as four layers, each a matrix with two rows and three columns. You can access elements of an array using indexing, just like with matrices, with one index per dimension. For example, y[1,2,3] would return the element in the first row and second column of the third layer, which is 15.
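
To make the three indices more concrete, here is a minimal sketch using the same 2 x 3 x 4 array. The names arr, element, and layer are chosen for this illustration.

```r
arr <- array(1:24, dim = c(2, 3, 4))

print(dim(arr))       # [1] 2 3 4

# Indices are given in the order: row, column, layer
element <- arr[1, 2, 3]
print(element)        # [1] 15

# Leaving the first two indices empty extracts a whole layer as a matrix
layer <- arr[, , 3]
print(layer)          # a 2 x 3 matrix holding the values 13 to 18
```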

Other data types

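R also has a few special values that are not data types in the strict sense, but that you will run into regularly when working with data. These include:

  • NULL: represents the absence of a value (an empty object of length zero)
  • NA: represents a missing value
  • NaN: represents "not a number", the result of an undefined numeric operation such as 0/0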
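
The sketch below shows how these three values behave and how to test for them with is.null(), is.na(), and is.nan(). The scores vector is made-up data for illustration.

```r
# NULL is an empty object with length 0
empty <- NULL
print(length(empty))     # [1] 0
print(is.null(empty))    # [1] TRUE

# NA marks a missing value inside a vector
scores <- c(10, NA, 30)
print(is.na(scores))     # [1] FALSE  TRUE FALSE

# Most calculations return NA unless missing values are removed
plain_mean <- mean(scores)
clean_mean <- mean(scores, na.rm = TRUE)
print(plain_mean)        # [1] NA
print(clean_mean)        # [1] 20

# NaN results from undefined numeric operations
undefined <- 0 / 0
print(is.nan(undefined)) # [1] TRUE
print(is.na(undefined))  # [1] TRUE  (is.na() also treats NaN as missing)
print(is.nan(NA))        # [1] FALSE
```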

Working with data types in R

R has several functions that can be used to work with data types. Some common ones include:

  • class(x): returns the class of the object x
  • mode(x): returns the mode (data type) of the object x
  • as.numeric(x): converts the object x to a numeric vector
  • as.character(x): converts the object x to a character vector
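
Here is a short sketch of these functions in action. The input values are made up for illustration. The factor example is an addition showing that class() and mode() can disagree, and suppressWarnings() is used only to keep the output quiet when a value cannot be converted.

```r
x <- c("1", "2", "3")
print(class(x))    # [1] "character"
print(mode(x))     # [1] "character"

# Convert text to numbers
nums <- as.numeric(x)
print(nums)        # [1] 1 2 3
print(class(nums)) # [1] "numeric"

# Convert numbers to text
chars <- as.character(c(1.5, 2, 3))
print(chars)       # [1] "1.5" "2"   "3"

# Values that cannot be converted become NA (R also emits a warning)
bad <- suppressWarnings(as.numeric(c("4", "five")))
print(bad)         # [1]  4 NA

# class() and mode() can differ: a factor is stored as integer codes
f <- factor(c("a", "b"))
print(class(f))    # [1] "factor"
print(mode(f))     # [1] "numeric"
```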

It is important to understand the different data types in R and how to work with them in order to effectively manipulate and analyze data. I hope this article has been helpful in introducing you to the different data types in R and providing some examples of how to work with them.
