How can I produce an effective scientific figure in R? ggplot2 barplot.
Good figures are the hallmark of great publications and presentations. A good figure can convey a lot of information that furthers an argument, regardless of discipline. Producing one is not trivial. Just as with other aspects of scientific inquiry, deliberate thought must go into how the data would best be presented, whether through a table, a histogram, or, as in this guide, a bar plot. Bar plots are good for demonstrating how a quantitative variable varies between groups or categories. I will illustrate this by producing a bar plot with ggplot2, a package readily available for R, that plots how often particular names occur, differentiated by sex.
1. Import the ggplot2 library so we can make use of its tools. If you do not have this library, you can follow this guide on installing it. After installation, run the code below to ensure you can access its tools.
library(ggplot2)
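If you are not sure whether the package is installed yet, a slightly more defensive version of this step might look like the sketch below. It assumes the package comes from CRAN through `install.packages()`, which is the usual route but is not spelled out in the original guide.

```r
# Install ggplot2 from CRAN only if it is not already available (assumed install route).
if (!requireNamespace("ggplot2", quietly = TRUE)) {
  install.packages("ggplot2")
}

# Attach the package so that ggplot(), aes() and geom_bar() can be called directly.
library(ggplot2)

# Print the installed version as a quick sanity check.
print(packageVersion("ggplot2"))
```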
2. The ggplot2 library produces figures from data frames, which are a type of data container in R. Given all the tools available in ggplot2, if you are considering plotting your data at all, I would recommend starting from a data frame. You can form a data frame from vectors or without them, and I show both below. The two methods produce exactly the same data frame. A key thing to keep in mind when making a data frame is that all of the columns must have the same length. So if you have 100 names in one column, you MUST have 100 entries in the other.
#Below is how to produce the data frame using vectors
names = c('Bob','Oscar','Helen','Cindy','Bob')
sexes = c('Male','Male','Female','Female','Male')
df = data.frame("Name"=names, "Sex"=sexes)
#Below is how to produce the data frame without using vectors
df = data.frame("Name"=c('Bob','Oscar','Helen','Cindy','Bob'),
                "Sex"=c('Male','Male','Female','Female','Male'))
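As a quick check on the two claims in this step, here is a minimal base R sketch. It confirms that both constructions give identical data frames and shows what happens when the column lengths differ. The names `df_vectors`, `df_inline`, and `result` are illustrative and do not come from the original.

```r
# Build the same data frame two ways and confirm that the results match.
names = c('Bob','Oscar','Helen','Cindy','Bob')
sexes = c('Male','Male','Female','Female','Male')
df_vectors = data.frame("Name"=names, "Sex"=sexes)
df_inline = data.frame("Name"=c('Bob','Oscar','Helen','Cindy','Bob'),
                       "Sex"=c('Male','Male','Female','Female','Male'))
print(identical(df_vectors, df_inline))  # TRUE

# Mismatched lengths: 5 names but only 4 sexes. data.frame() raises an error,
# which tryCatch() captures here as a message instead of stopping the script.
result = tryCatch(
  data.frame("Name"=names, "Sex"=sexes[1:4]),
  error = function(e) conditionMessage(e)
)
print(result)
```

One caveat: R recycles a shorter column when the longer length is an exact multiple of it (a single value, for instance, is repeated for every row), so not every mismatch produces an error. Checking `length()` on each vector before building the data frame is the safer habit.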
3. When working with small datasets, producing a data frame without vectors is a reasonable approach, but building it from vectors is more flexible and scales better to larger datasets. In addition, if you plan on transforming your data in several different ways, I find it easier to store the data in vectors than to update the data frame initialization every time. While you can edit a data frame after you have made it, I personally prefer to work on vectors because it involves fewer indices, and extra indices increase the risk of errors down the line if you later make drastic changes to your data, as I commonly do. If you are interested in learning how to edit a data frame, I have shown how to do this below, alongside the equivalent edit to a vector.
#Editing the data frame entry at a particular cell. The first bracket accesses the first column, with a double bracket denoting that you want the actual internal vector. This does not work with a single bracket. The second bracket accesses the second row. A double bracket is not needed for the second bracket as there is only one item to access versus a list.
> df[[1]][2] = 'Ron'
#Inspecting the vector before editing it
> names
[1] "Bob"   "Oscar" "Helen" "Cindy" "Bob"
> names[2]
[1] "Oscar"
#Editing the vector
> names[2] = 'Ron'
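One point worth seeing in code is that the data frame keeps its own copy of the data, so the two kinds of edit do not affect each other. The sketch below is illustrative only. It indexes the column by name with `df$Name`, which is an alternative to the positional form used above, and it passes `stringsAsFactors=FALSE` as an assumption so that it behaves the same on R versions before 4.0. The replacement name 'Dana' is made up for the example.

```r
names = c('Bob','Oscar','Helen','Cindy','Bob')
sexes = c('Male','Male','Female','Female','Male')
df = data.frame("Name"=names, "Sex"=sexes, stringsAsFactors=FALSE)

# Edit the data frame by column name; here this is equivalent to df[[1]][2].
df$Name[2] = 'Ron'
print(df[2, "Name"])   # "Ron"

# The original vector is untouched by the data frame edit.
print(names[2])        # "Oscar"

# Editing the vector does not change the data frame either...
names[4] = 'Dana'
print(df$Name[4])      # "Cindy"

# ...until the data frame is rebuilt from the vectors, which also discards 'Ron'.
df = data.frame("Name"=names, "Sex"=sexes, stringsAsFactors=FALSE)
print(df$Name)         # "Bob" "Oscar" "Helen" "Dana" "Bob"
```

If you work from vectors, rebuilding the data frame after each round of edits keeps the two in sync.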
4. Now we will create our plot. You can do this in two ways, which I will demonstrate below.
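library(magrittr) #provides the %>% pipe, which ggplot2 does not load on its own
q = df %>% ggplot(aes(x=Name,fill=Sex)) + geom_bar() #option 1: pipe the data frame into ggplot
q = ggplot(df,aes(x=Name,fill=Sex)) + geom_bar() #option 2: pass the data frame directly
q #display the plot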
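By default, geom_bar() counts the rows for each value on the x axis, so the bar heights are simply how often each name appears, and the fill splits those counts by sex. A minimal base R sketch of the same tally can help when checking that a plot shows what you expect. The variable `counts` is illustrative.

```r
df = data.frame("Name"=c('Bob','Oscar','Helen','Cindy','Bob'),
                "Sex"=c('Male','Male','Female','Female','Male'))

# Cross-tabulate names against sex: one row per name, one column per sex.
counts = table(df$Name, df$Sex)
print(counts)

# Bob appears twice and is recorded as Male both times.
print(counts["Bob", "Male"])   # 2

# Total rows per name, which is the full height of each bar.
print(rowSums(counts))
```

In this dataset each name belongs to only one sex, so every bar is a single color. A name recorded under both sexes would produce a stacked bar.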
5. Congratulations, you just made your first more complex plot using ggplot! I personally like ggplot because it is very flexible and lets me design a plot however I want. For the above figure, if I did not wish to fill the bars by sex, I could simply leave that parameter out, and all of the bars would be drawn in a single default color.
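A minimal sketch of that variant follows, assuming ggplot2 is installed. The name `q_plain` is illustrative, and `layer_data()` is used only to inspect the counts that geom_bar() computed.

```r
library(ggplot2)

df = data.frame("Name"=c('Bob','Oscar','Helen','Cindy','Bob'),
                "Sex"=c('Male','Male','Female','Female','Male'))

# Same plot as before, but without mapping fill to Sex.
q_plain = ggplot(df, aes(x=Name)) + geom_bar()
print(q_plain)

# Inspect the bar heights computed by geom_bar(), one row per name.
print(layer_data(q_plain)[, c("x", "count")])
```

Any aesthetic can be added or dropped in the same way, which makes it easy to try a few versions of a figure before settling on one.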
6. Thus I heartily recommend its use to anyone seeking to make more powerful figures, regardless of discipline. While R can be tricky to learn, there is copious documentation available online, so even beginners can produce some great-looking figures.
In this article, we discussed the importance of good figures and went through how to produce such a figure that can be readily adapted to our needs using the library ggplot2 in R.