Tutorial — How to Create a Dataframe From Vectors in R From Scratch?
Learn how to create a data frame in R and export it to a CSV format!
R is the first choice for many statisticians and data scientists when it comes to statistical computing and data mining. R is a programming language designed for statistical computing and visualization. It was created in the early 1990s by Ross Ihaka and Robert Gentleman at the University of Auckland in New Zealand and released as open-source software in 1995; the name “R” comes from the first letter of their first names. R is free and open-source software that runs across all major platforms.
R supports object-oriented programming and offers a wide range of operators and functions that allow users to collect, explore, analyze, and visualize data. Its main strengths are its statistical and graphical capabilities. When you learn R for data science, you’ll learn how to use the language to perform statistical analyses and develop stunning data visualizations. R’s built-in functions also make it easy for users to import, clean, and analyze data.
In this tutorial, we’ll create a dataframe from scratch using the R programming language. I will use the same dataset as in the previous article. You can read that article here. This is what our data looks like:
What is a Dataframe?
A dataframe is a two-dimensional table that holds alphanumeric values. In a dataframe, each column contains the values of one variable, and each row contains one value from each column, that is, one observation. A dataframe can store numeric data, character data, or factor data.
What is a Vector in R?
A vector is the most basic data structure in R. A vector is a sequence of data elements that all have the same datatype. In R, a vector is usually built with the c() function, whose name stands for combine (it is also commonly read as concatenate). Because even a single value in R is already a vector of length one, c() does not so much create vectors as combine existing ones into a longer vector. R vectors hold multiple values of the same datatype and are similar to arrays in C. There are several types of vectors:
- Numeric vector
- Integer vector
- Character vector
- Logical vector
- Factor vector
- Datetime vector
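As a rough illustration of the vector types listed above, the following sketch builds one small vector of each kind with c() and inspects it with class(). The variable names and values are made up for this example and are not part of the tutorial's dataset.

```r
# Illustrative values only; none of these come from the tutorial's dataset.
num_vec  <- c(1.5, 2.25, 3)                  # numeric (double)
int_vec  <- c(1L, 2L, 3L)                    # integer (the L suffix marks integers)
chr_vec  <- c("a", "b", "c")                 # character
lgl_vec  <- c(TRUE, FALSE, TRUE)             # logical
fct_vec  <- factor(c("low", "high", "low"))  # factor
dttm_vec <- as.POSIXct(
  c("2021-01-01 09:00:00", "2021-06-15 17:30:00"),
  tz = "UTC"
)                                            # datetime

# class() reports the type of each vector.
class(num_vec)   # "numeric"
class(int_vec)   # "integer"
class(chr_vec)   # "character"
class(lgl_vec)   # "logical"
class(fct_vec)   # "factor"
class(dttm_vec)  # "POSIXct" "POSIXt"

# Mixing types in c() coerces every element to one common type.
mixed <- c(1, "a", TRUE)
class(mixed)     # "character"
```

The last two lines show why a vector is said to hold a single datatype: when the inputs differ, c() converts them all to the most general type rather than keeping them separate.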
Every column in a dataframe must contain the same number of elements, and all elements within a column must share one datatype. A dataframe can be built from vectors: we first create several vectors containing the data and then pass them to the data.frame() function, which combines them into a dataframe.
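The length requirement can be checked with a small sketch. The vectors id, label, and short are hypothetical names used only here; the second call is wrapped in tryCatch() so that the error is captured as text instead of stopping the script.

```r
# Hypothetical vectors used only to illustrate the length rule.
id    <- c(1, 2, 3)
label <- c("x", "y", "z")
short <- c("x", "y")          # one element too few

# Equal lengths: data.frame() builds a table with 3 rows and 2 columns.
ok <- data.frame(id, label)
dim(ok)                       # 3 2

# Lengths 3 and 2: data.frame() stops with an error, captured here as text.
msg <- tryCatch(
  {
    data.frame(id, short)
    "no error"
  },
  error = function(e) conditionMessage(e)
)
print(msg)
```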
First, we’ll create multiple vectors named Rank, City, Population, Country, Statistical Concept, Area in km2, and Population Density in km2 as follows —
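A minimal sketch of this step is shown below. The values are placeholders and not the figures from the dataset used in this article, and the underscore-based object names (Statistical_Concept, Area_km2, Population_Density) are assumptions, since R object names cannot contain spaces.

```r
# Placeholder values only: they are NOT the figures from the tutorial's dataset.
# R object names cannot contain spaces, so underscores are used instead.
Rank                <- c(1, 2, 3)
City                <- c("City A", "City B", "City C")
Population          <- c(3000000, 2000000, 1000000)
Country             <- c("Country X", "Country Y", "Country Z")
Statistical_Concept <- c("Urban agglomeration", "Metropolitan area", "City proper")
Area_km2            <- c(1500, 800, 500)

# Density is derived from the two vectors above (people per square kilometre).
Population_Density  <- Population / Area_km2

Population_Density  # 2000 2500 2000
```

Deriving the density from the population and area vectors, rather than typing it in, keeps the three columns consistent with each other.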
Now, we’ll combine all the vectors and create a dataframe using the following command. In R versions before 4.0.0, data.frame() converts character strings to factors by default; since R 4.0.0, the default is stringsAsFactors = FALSE. To keep the text columns as plain character data regardless of the R version, we set the stringsAsFactors argument to FALSE.
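The call might look like the sketch below. It uses three placeholder vectors instead of the full set to stay short, and the dataframe name df is an assumption made for this example.

```r
# Placeholder vectors (hypothetical values, shortened to three columns).
Rank       <- c(1, 2, 3)
City       <- c("City A", "City B", "City C")
Population <- c(3000000, 2000000, 1000000)

# Combine the vectors column-wise; the vector names become the column names.
df <- data.frame(Rank, City, Population, stringsAsFactors = FALSE)

print(df)
class(df$City)   # "character", not "factor"
```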
You will get the output as follows —
Let's check the structure of the dataframe:
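Assuming a dataframe named df (a hypothetical name, rebuilt here with placeholder values so the snippet runs on its own), the structure can be inspected with str():

```r
# Placeholder dataframe; values are illustrative only.
df <- data.frame(
  Rank       = c(1, 2, 3),
  City       = c("City A", "City B", "City C"),
  Population = c(3000000, 2000000, 1000000),
  stringsAsFactors = FALSE
)

# str() prints the number of rows and columns, then one line per column
# with its type and its first few values.
str(df)

# The column types can also be collected programmatically.
col_types <- sapply(df, class)
print(col_types)
```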
You will get the output as follows. Here we can see that each variable is listed with its correct datatype.
Let's check the summary of the dataframe —
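A sketch of this step, again with a placeholder dataframe named df (an assumed name), could be:

```r
# Placeholder dataframe; values are illustrative only.
df <- data.frame(
  Rank       = c(1, 2, 3),
  City       = c("City A", "City B", "City C"),
  Population = c(3000000, 2000000, 1000000),
  stringsAsFactors = FALSE
)

# For numeric columns summary() reports the minimum, quartiles, median,
# mean, and maximum; for character columns it reports length, class, and mode.
summary(df)

# The same statistics for a single column, accessed by name.
pop_summary <- summary(df$Population)
pop_summary[["Mean"]]   # mean of the Population column
```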
You will get the output as follows —
Now, we’ll expand our dataframe by adding a new column named Continent with the following command —
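One common way to do this is to assign a vector to a new column name with the $ operator. The sketch below assumes a dataframe called df and uses placeholder continent labels.

```r
# Placeholder dataframe; values are illustrative only.
df <- data.frame(
  Rank       = c(1, 2, 3),
  City       = c("City A", "City B", "City C"),
  Population = c(3000000, 2000000, 1000000),
  stringsAsFactors = FALSE
)

# Assigning to a new name with $ appends a column;
# the vector needs one value per row.
df$Continent <- c("Continent P", "Continent Q", "Continent P")

names(df)   # "Rank" "City" "Population" "Continent"
ncol(df)    # 4
```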
You will get the output as follows —
Let's check the class of the newly created column —
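The check itself is a single class() call. In the sketch below, df and its values are placeholders, and the conversion to a factor at the end is an optional extra that is not part of the original walkthrough.

```r
# Placeholder dataframe with a Continent column; values are illustrative only.
df <- data.frame(
  City = c("City A", "City B", "City C"),
  stringsAsFactors = FALSE
)
df$Continent <- c("Continent P", "Continent Q", "Continent P")

# A column added from a character vector stays a character column.
cls_before <- class(df$Continent)
cls_before              # "character"

# Optional: convert to a factor if the column should be treated as categorical.
df$Continent <- factor(df$Continent)
class(df$Continent)     # "factor"
levels(df$Continent)    # "Continent P" "Continent Q"
```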
Let's export our dataframe to a CSV file:
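A minimal sketch of the export uses write.csv(). The dataframe df and its values are placeholders, and tempfile() is used only so the example does not write into the working directory; in practice you would pass your own file path.

```r
# Placeholder dataframe; values are illustrative only.
df <- data.frame(
  Rank       = c(1, 2, 3),
  City       = c("City A", "City B", "City C"),
  Population = c(3000000, 2000000, 1000000),
  stringsAsFactors = FALSE
)

# Temporary path for the sketch; replace with your own path in practice.
csv_path <- tempfile(fileext = ".csv")

# row.names = FALSE keeps R's row numbers out of the file.
write.csv(df, file = csv_path, row.names = FALSE)

# Read the file back to confirm that the export worked.
df_back <- read.csv(csv_path, stringsAsFactors = FALSE)
print(df_back)
```

Without row.names = FALSE, write.csv() adds an extra first column containing the row numbers, which is rarely wanted when the file is opened in other tools.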
Conclusion —
In this article, we learned how to create a dataframe from vectors in R. If you liked my article or have any suggestions for me, please let me know by commenting below.
Thank You!