The Data Visualization Process by Ben Fry: A Simplified Journey

Aug 5, 2024

In data visualization design, understanding both the data and the intended purpose of the visualization is key, because how data is presented directly influences interpretation and decision-making. Thus, as discussed earlier, the importance of design in communicating data effectively should not be underestimated. Sometimes the ideal visualization is found by changing the chart type, and sometimes by changing the color palette. In this context, understanding the process of building data visualizations is essential to ensure that the message conveyed is clear, accurate, and relevant to the target audience.

Here, however, a caveat is in order: processes for building data visualizations matter only insofar as they help you meet the informational needs of the people who will use the charts you build. It is also worth noting that, although I have already presented the importance of visualization functions, Ben Fry does not separate visualizations into explanatory and exploratory. Instead, he proposes a unified process of seven interconnected steps that form a cohesive path from raw data to meaningful visual insights.

The 7 Steps of Ben Fry’s Process

Let’s imagine that we are dealing with the records of a company’s commercial negotiations. This example is relevant because the sales process can be quite complex, involving several products, periods, customers, and transaction values. Analyzing this data to obtain valuable insights requires a structured and detailed process.

If your goal is to analyze sales, what is the first step you should take? Collect the data, right? The “Acquisition” step is when you gather all of the company’s commercial negotiation records. Whether you go through the receipts, record them in a spreadsheet (accounting or otherwise), or extract the data from the system in which the transactions are recorded, you need to obtain all the data somehow.
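To make acquisition concrete, here is a minimal Python sketch, assuming the transactions were exported from the sales system as CSV text. The column names (date, product, customer, value) and the sample rows are invented for illustration, and an in-memory string stands in for a real file or database.

```python
import csv
import io

# Hypothetical export from a sales system. A real project would open a file
# or query a database instead of using an in-memory string.
RAW_EXPORT = """date,product,customer,value
2024-04-03,Notebook,Acme,1200.50
2024-05-17,Monitor,Globex,830.00
2024-06-21,Notebook,Initech,1150.00
"""

def acquire(source):
    """Read every row from a CSV source as a dict of raw strings."""
    return list(csv.DictReader(source))

raw_records = acquire(io.StringIO(RAW_EXPORT))
print(len(raw_records), "records acquired")
```

At this point every field is still a plain string. Nothing has been interpreted yet, only collected.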

After collecting the data, we need to organize it, because the data is not always (or rather, rarely) ready for analysis or in the best structure to be transformed into visualizations. In this “Parsing” step, the records are classified by categories, such as dates, products, transaction values, customers, etc. This means giving meaning and structure to the data so that it can be understood and used effectively.
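A small Python sketch can illustrate what giving structure to the data might look like. It assumes the raw records arrive as dictionaries of strings; the field names and values are hypothetical.

```python
from datetime import date
from decimal import Decimal

# Hypothetical raw records: every field is still a plain string.
raw_records = [
    {"date": "2024-04-03", "product": " Notebook", "customer": "Acme", "value": "1200.50"},
    {"date": "2024-05-17", "product": "Monitor", "customer": "Globex ", "value": "830.00"},
]

def parse(record):
    """Convert string fields into typed values the later steps can rely on."""
    return {
        "date": date.fromisoformat(record["date"]),  # real calendar date
        "product": record["product"].strip(),        # tidy category label
        "customer": record["customer"].strip(),
        "value": Decimal(record["value"]),           # exact monetary amount
    }

records = [parse(r) for r in raw_records]
print(records[0]["date"].month, records[0]["value"])
```

Once dates are real dates and values are real numbers, later steps can compare periods and add up amounts without guessing at formats.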

Now that the data is organized, it is time to select what is relevant for analysis. In the “Filtering” step, we choose only the data that we want to analyze. Imagine that we want to focus on the transactions of a certain period, for example, the last quarter, and of a specific category of products. Thus, we remove the records that do not fit these criteria, leaving only the relevant information. This step is also commonly called purging.
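The filter described above can be sketched in a few lines of Python. The records, the category names, and the choice of the second quarter of 2024 as "the last quarter" are all assumptions made for the example.

```python
from datetime import date

# Hypothetical parsed records; categories and values are invented.
records = [
    {"date": date(2024, 2, 10), "category": "Hardware", "value": 400.0},
    {"date": date(2024, 4, 3), "category": "Hardware", "value": 1200.5},
    {"date": date(2024, 5, 17), "category": "Software", "value": 830.0},
    {"date": date(2024, 6, 21), "category": "Hardware", "value": 1150.0},
]

def filter_records(rows, start, end, category):
    """Keep rows dated within [start, end] that belong to the category."""
    return [
        r for r in rows
        if start <= r["date"] <= end and r["category"] == category
    ]

# Assumed analysis window: the second quarter of 2024.
selected = filter_records(records, date(2024, 4, 1), date(2024, 6, 30), "Hardware")
print(len(selected), "of", len(records), "records kept")
```

Note that the original list is left untouched, so a different period or category can be selected later without acquiring the data again.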

Next, we apply “Mining”. Here, we identify patterns or trends in the data, such as which products had the most sales or which customers purchased the most during the analyzed period. This stage uses mathematics and statistics to find patterns and contexts in the data that contribute to understanding the overall scenario and — why not? — provide the insights that business analysts so desire.
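As a rough illustration of mining, the Python sketch below totals sales by product and by customer and picks the leader of each. It uses simple sums only; real projects may need richer statistics. All records are hypothetical.

```python
from collections import defaultdict

# Hypothetical filtered records.
records = [
    {"product": "Notebook", "customer": "Acme", "value": 1200.0},
    {"product": "Monitor", "customer": "Globex", "value": 800.0},
    {"product": "Notebook", "customer": "Globex", "value": 1100.0},
    {"product": "Keyboard", "customer": "Acme", "value": 150.0},
]

def total_by(rows, key):
    """Sum the 'value' field for each distinct value of the given key."""
    totals = defaultdict(float)
    for r in rows:
        totals[r[key]] += r["value"]
    return dict(totals)

sales_by_product = total_by(records, "product")
sales_by_customer = total_by(records, "customer")

# The product that sold the most and the customer who purchased the most.
top_product = max(sales_by_product, key=sales_by_product.get)
top_customer = max(sales_by_customer, key=sales_by_customer.get)
print(top_product, top_customer)
```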

Since no one is convinced just by looking at numbers, from the next stage onwards the concern is no longer the data or the operations performed on it, but how the people who consume the data will perceive, analyze, and understand it. From this point on, design becomes more valuable than ever, as we need to decide how to display the data. In the “Representation” stage, we choose a basic visual form for our data. We can use a bar chart to show sales by category or a line chart to illustrate sales over time. The idea here is to understand how users go through their own analytical journey, so that we build charts that help them take action and make decisions.
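To show the idea of mapping values to a visual form, here is a deliberately simple Python sketch that draws a bar chart of sales by category as plain text. In practice you would likely use a charting tool; plain text is used here only to keep the example self-contained. The categories and totals are hypothetical.

```python
# Hypothetical sales totals by category.
sales_by_category = {"Hardware": 2400, "Software": 1200, "Services": 600}

def bar_chart(totals, width=40):
    """Render a horizontal text bar chart, scaling bars to the largest value."""
    largest = max(totals.values())
    label_width = max(len(name) for name in totals)
    lines = []
    for name, value in totals.items():
        # Bar length is proportional to the value it encodes.
        bar = "#" * round(width * value / largest)
        lines.append(f"{name.ljust(label_width)} | {bar}")
    return "\n".join(lines)

chart = bar_chart(sales_by_category)
print(chart)
```

The essential decision is the encoding: here, the length of each bar stands for the amount sold, which makes categories easy to compare at a glance.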

Next, we refine our visualization. The “Refine” step involves adding details and tweaks to the chart to make it clearer and more appealing. We can use a different color for each product or highlight one product over the others, and add explanatory captions and labels with exact sales figures. These choices should take into account how the visualizations work together, so that we can highlight the important parts and make the whole more understandable.
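Here is a hedged Python sketch of the kind of refinements described above, applied to a plain-text bar chart: bars are sorted, each one is labeled with its exact figure, a title is added, and one product is highlighted with a different fill character (a stand-in for color). The product names and totals are hypothetical.

```python
# Hypothetical sales totals by product.
sales_by_product = {"Monitor": 800, "Notebook": 2300, "Keyboard": 150}

def refined_chart(totals, highlight, width=30):
    """Sort bars by value, label exact figures, and mark the highlighted item."""
    largest = max(totals.values())
    label_width = max(len(name) for name in totals)
    lines = ["Sales by product (hypothetical figures)"]  # explanatory title
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    for name, value in ranked:
        # A distinct fill character plays the role of a highlight color.
        fill = "#" if name == highlight else "."
        bar = fill * max(1, round(width * value / largest))
        lines.append(f"{name.ljust(label_width)} | {bar} {value:,}")
    return "\n".join(lines)

chart = refined_chart(sales_by_product, highlight="Notebook")
print(chart)
```

None of these tweaks changes the data. They only direct the reader's attention to what matters most.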

Finally, we add “Interaction.” Imagine that you can click on a bar in the chart to see more details about the sales of a specific product, or select a date range to see sales only for that period. Interaction allows you to explore the data in different ways and discover more information.
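Behind a click or a date-range selector there is usually a function that recomputes what is shown. The Python sketch below is one possible shape for such a function, not a real user interface; the function name on_select, its parameters, and the records are all hypothetical.

```python
from datetime import date

# Hypothetical parsed records.
records = [
    {"date": date(2024, 4, 3), "product": "Notebook", "value": 1200.0},
    {"date": date(2024, 5, 17), "product": "Monitor", "value": 800.0},
    {"date": date(2024, 6, 21), "product": "Notebook", "value": 1100.0},
]

def on_select(rows, product=None, start=None, end=None):
    """Return the rows matching the user's current selection.

    Every argument is optional; None means "no restriction".
    """
    selected = []
    for r in rows:
        if product is not None and r["product"] != product:
            continue
        if start is not None and r["date"] < start:
            continue
        if end is not None and r["date"] > end:
            continue
        selected.append(r)
    return selected

# Simulates clicking the "Notebook" bar to see its details.
clicked = on_select(records, product="Notebook")
# Simulates choosing a date range in a selector.
ranged = on_select(records, start=date(2024, 5, 1), end=date(2024, 6, 30))
print(len(clicked), len(ranged))
```

A charting or dashboard tool would call something like this each time the user changes the selection and then redraw the chart with the result.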

These steps are not isolated. They connect and influence each other. For example, the way we represent the data may make us rethink which data we filter. Interaction can lead to further refinements. This iterative process ensures that the final visualization is clear and useful, allowing for adjustments as new questions arise or as the audience interacts with the data.

Additional Golden Tips

Every project has unique requirements. Just as every business area is different and has different information needs, every data set needs a customized approach. There is no point in using the same visualization for all data because each data set tells a different story. Therefore, adapt your visualizations to the specific needs of the project and the audience. For this reason, Ben Fry also emphasizes the importance of avoiding off-the-shelf solutions: no single solution fits all situations. Take a line chart as an example. Perhaps one area wants to see projected performance over the coming periods, while another prefers to compare one category with another. A ready-made solution may serve neither need well.

Avoid information overload. In data visualizations, less detail can mean more clarity and objectivity — which is crucial for the main information and insights to be properly communicated. Visualizations with too much detail can become confusing. Therefore, use only the data necessary to convey your message effectively. For example, if your business negotiations dashboard includes too many irrelevant details and unnecessary charts, it can become cluttered and difficult to interpret, preventing you from quickly identifying important insights.

Know your audience. When creating a visualization, it’s important to think about who will be looking at it. If you’re showing your business negotiations to an executive, you need to make it clear and focused on the most important points for decision-making. The visualization should be accessible and relevant to the person who will be using it. For example, an executive is looking for strategic information, while an analyst is looking to solve problems or find insights. Failure to consider the target audience can result in a visualization that doesn’t meet their specific needs, negatively impacting the outcome.

Conclusion

In summary, Ben Fry’s process guides us from gathering data to creating a polished, interactive visualization. His process is not the only one, though. Over the next few weeks, I’ll introduce other processes, culminating with tips and suggestions on how you can create your own. The goal will always be the same: to equip you with the tools to create effective data visualizations that communicate complex information in an accessible and impactful way. Until then, tell me: what did you think of this process? Was it useful? Can you see steps in it that are relevant to the way you build charts today?

Written by Antonio Neto