Visualizing Knowledge at Forge.AI
By: M Giles Phillips
Forge sources unstructured data and transforms it into structured event streams. These event streams are enriched with knowledge during the transformation process and delivered in real-time to describe the happenings of the world in a machine-ready way. Our structured event stream enables downstream AI systems to reason over our data, while enabling human analysts to develop and validate new hypotheses about the impact of specific events, or series of related events, in relevant domains.
In order to deliver this value, we must help our customers understand the scope and structure of our data and how to leverage it. Interactive data visualizations have emerged as an important aspect of our user experience because they inform, educate, compel exploration, and ultimately empower our customers to configure their data streams to meet their particular needs.
Challenge
The most immediate challenge when designing visualizations of Forge.AI data is the sheer scale of the data: it is both high volume and high velocity, which makes it easy to overwhelm users. Our structured event data also has significant richness, dimensionality, and variety, so it is not easily depicted by standard visualizations. Truthful and accurate representation is of the utmost importance, so any interactive visual experience we build must clearly establish data context. Through our designs, we must spark exploration and discovery while allowing for informed configuration of event streams. Together, these goals mean that our interactive user experiences must not be overwhelming, conceptually complex, or difficult to understand.
Approach
Rather than focus on dashboards and configuration screens, our approach has been to design an intuitive sequence of interactive visualizations that lets customers progressively discover the richness of our data and then specify data of interest, all in context. At particular points in the customer’s journey, focused visualizations are crafted to enable customers to accomplish specific goals. These points correspond to three customer journey stages: Discover, Configure, and Leverage.
What follows is a set of visualizations that we’ve been actively exploring to support the stages of this customer journey.
Stage 1: Discover
Forge data contains several different types of knowledge. Our data includes knowledge about all sorts of entities and concepts such as people, products, organizations, and locations. It also contains knowledge about how these different entities are related and specific types of events that have occurred between entities; for example, Person A is CEO of Organization B or Person A has left her role at Organization B. Beyond that, our data includes a number of calculated statistical measures reflecting, for example, the momentum or veracity of a particular story that describes an event, and various other metadata about the unstructured data that has originally been sourced. Taken together, these facets illustrate the different aspects of the Forge.AI event streams that we must visualize for our customers.
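To make these facets concrete, here is a minimal sketch of how one such event might be modeled. It is an illustration only: the class names, field names, and sample values are assumptions made for this example and do not describe the actual Forge.AI schema.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Entity:
    name: str
    kind: str  # hypothetical kinds: "person", "organization", "product", "location"


@dataclass(frozen=True)
class Relationship:
    subject: Entity
    predicate: str  # hypothetical label, e.g. "was_ceo_of"
    obj: Entity


@dataclass
class Event:
    event_type: str
    participants: list[Entity]
    relationships: list[Relationship] = field(default_factory=list)
    # Calculated statistical measures, e.g. momentum or veracity of the story.
    measures: dict[str, float] = field(default_factory=dict)
    # Metadata about the unstructured source the event was derived from.
    source_metadata: dict[str, str] = field(default_factory=dict)


# "Person A has left her role at Organization B", with made-up measure values.
person = Entity("Person A", "person")
org = Entity("Organization B", "organization")
departure = Event(
    event_type="role_departure",
    participants=[person, org],
    relationships=[Relationship(person, "was_ceo_of", org)],
    measures={"momentum": 0.4, "veracity": 0.9},
)
```

Keeping entities, relationships, measures, and source metadata as separate fields mirrors the separate facets described above, which is also what lets each one be visualized on its own.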
Treemaps are a versatile representation for hierarchical classification data, particularly when implemented in an interactive experience that enables users to expand into any particular node of interest. When paired with a breadcrumb of parent containers, such a visualization can afford easy forward/backward navigation of a complex hierarchical data set. Treemaps can also visualize several dimensions of data at each level, represented by size, color, and contextual labels. Taken together, these attributes make for a dense visualization of data that is highly explorable while reinforcing data context. However, this expressive capacity comes at a cost: treemaps are not immediately intuitive to many users and their specific details must be learned through experience.
Because treemaps have this upfront cognitive cost, we’ve explored ways to make them more intuitive. One option is to introduce the treemap in the context of a more traditional search or browse experience and then expand the treemap view once the user has expressed interest in it, as shown in Figure 3:
Here, the treemap is partially redundant with the list below, but it serves as a quick, scannable visual map of the section being explored. The list, in turn, helps explain to the user the meaning of at least one, and possibly more, of the visual elements in the treemap: in this case, the correlation between box size and entity count.
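The link between box size and entity count, together with breadcrumb-style drill-down, can be sketched in a few lines. This is a simplified illustration, not our implementation: the nested-dictionary taxonomy, the function names, and the counts are all hypothetical, and the sketch computes only each box's share of area, not its position on screen.

```python
def total(node):
    """Entity count for a node: a leaf is an int, a branch is a dict of children."""
    if isinstance(node, dict):
        return sum(total(child) for child in node.values())
    return node


def drill_down(tree, breadcrumb):
    """Return {child_name: fraction_of_area} for the node reached via breadcrumb."""
    node = tree
    for name in breadcrumb:
        node = node[name]
    if not isinstance(node, dict):
        return {}  # a leaf has no children to expand into
    whole = total(node)
    if whole == 0:
        return {name: 0.0 for name in node}
    return {name: total(child) / whole for name, child in node.items()}


# Hypothetical classification hierarchy with made-up entity counts.
taxonomy = {
    "Organizations": {"Public": 600, "Private": 200},
    "People": 150,
    "Products": 50,
}

top_level = drill_down(taxonomy, [])                     # root view
organizations = drill_down(taxonomy, ["Organizations"])  # after expanding one box
```

The breadcrumb is just the list of node names from the root, so navigating backward amounts to dropping the last element and calling `drill_down` again.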
We have explored other contextual visualizations as well. In the concept below, we are visualizing temporal information about a list of events. The time series visualization is interactive: you can click into a unit of time to filter the list below. This provides the customer with an immediate and intuitive sense of volumes over time while also allowing quick and focused investigation of a particular moment in time.
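The two halves of this interaction, volume per unit of time and a filtered list for the clicked unit, can be sketched as follows. The event dictionaries, function names, and dates are invented for illustration, and a day is assumed as the unit of time.

```python
from collections import Counter
from datetime import datetime


def bucket_by_day(events):
    """Count events per calendar day (the bars of the time series)."""
    return Counter(e["timestamp"].date() for e in events)


def filter_to_day(events, day):
    """Keep only the events on the clicked day (the filtered list)."""
    return [e for e in events if e["timestamp"].date() == day]


# Made-up sample events.
events = [
    {"title": "Product launch", "timestamp": datetime(2018, 5, 1, 9, 30)},
    {"title": "CEO departure", "timestamp": datetime(2018, 5, 1, 16, 0)},
    {"title": "Acquisition", "timestamp": datetime(2018, 5, 2, 11, 15)},
]

volumes = bucket_by_day(events)
clicked = filter_to_day(events, datetime(2018, 5, 1).date())
```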
The visualizations above have been designed to enable customers to explore our data, discovering not only specific insights but also general areas of interest. While the underlying Forge processing creates a real-time stream of information that is constantly updating, these discovery experiences will be fueled by a subset of recently processed historical data, indexed for an interactive system like the one depicted.
Stage 2: Configure
Our customers have diverse needs and require complete control over what types of data they’d like to stream. For example, one customer could include or exclude events that reference a particular Organization, while another customer may be interested in product announcements. Some customers may have just a few criteria; others may have many. What’s more, our customers require the ability to expand upon or refine these choices on an ongoing basis, as their own needs and interests evolve. We think of this ongoing user need as event stream configuration, and it’s crucial that our customers are able to create accurate and complete configuration settings at any stage in their journey.
Our customers need a configuration experience that lets them assert their criteria, inspect and verify the data that their configuration provides, and then easily modify their settings. The experience needs to be flexible, efficient, and easy to use.
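As a rough sketch of this assert, inspect, and modify loop, a configuration can be modeled as a set of include and exclude criteria that is previewed against historical events. The `StreamConfig` class, its fields, and the sample events are hypothetical and are not the Forge.AI configuration format.

```python
from dataclasses import dataclass, field


@dataclass
class StreamConfig:
    include_entities: set = field(default_factory=set)  # empty means "any entity"
    exclude_entities: set = field(default_factory=set)
    event_types: set = field(default_factory=set)       # empty means "any type"

    def matches(self, event):
        entities = set(event["entities"])
        if entities & self.exclude_entities:
            return False
        if self.include_entities and not (entities & self.include_entities):
            return False
        if self.event_types and event["type"] not in self.event_types:
            return False
        return True


def preview(config, historical_events):
    """Show which historical events the current configuration would deliver."""
    return [e for e in historical_events if config.matches(e)]


# Made-up historical events used only for the preview.
history = [
    {"type": "product_announcement", "entities": ["Organization B"]},
    {"type": "role_departure", "entities": ["Person A", "Organization B"]},
    {"type": "product_announcement", "entities": ["Organization C"]},
]

config = StreamConfig(event_types={"product_announcement"})
first_pass = preview(config, history)

# Refine the configuration after inspecting the preview.
config.exclude_entities.add("Organization C")
second_pass = preview(config, history)
```

Because the preview is recomputed from the configuration each time, a customer can tighten or loosen criteria and immediately see the effect before anything is streamed.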
In Figure 5 you can see one design pattern we’re exploring to enable customers to configure a set of filters to define what data we should stream to them on a real-time basis. In the design, the user has specified a concept of interest on the left-hand side, and is provided an immediate (historical) event data preview in the middle panel. In the right-hand panel we present a series of contextual cards to provide more detail about a selected event. These cards incorporate graphlets of relational knowledge and statistical data to help the user discover, understand and explore relationships while in the act of configuration.
Our design goal is that the user will be able to explore the data, evaluate the knowledge therein, and confidently make an informed configuration decision about this area of interest. With the addition of data in context, the search interaction enables the user to verify positive matches for a concept of interest, and also to shift her scope or focus based upon the data that is presented.
Stage 3: Leverage
Once our customers have configured exactly what data they’d like to receive from Forge.AI, they can begin to leverage this data to augment their business processes. Forge supports myriad business use cases, some of which involve interactive systems or visualizations that incorporate our data to support business or analytical decision-making.
The principal unit of knowledge represented in our data is an event. We create a structured representation of every event that we know about in order to enable our customers to make reasonable hypotheses or assertions using the data. This event data is designed to be consumed, either by downstream computing systems or via monitoring applications. Figure 6 is an exploration of the latter, composed of the various structural elements of our event data, using a combination of histograms, line graphs, and cards.
Figure 7 shows another, more complex, example: a visualization that builds upon our rich knowledge about entity relationships to show risk as a result of events that occur within a complex supply chain. This interface extends the classic Miller columns pattern with statistical and relational information. Such an interface could initially be implemented to flag or register signals where there is abnormal event volume, but could eventually be extended with analytics that would predict the timing and extent of impact caused by certain event types upon suppliers and consumers.
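One simple way to register a signal for abnormal event volume is a z-score check against recent history. The sketch below assumes that technique purely for illustration; this post does not specify a detection method, and the threshold, supplier names, and counts are invented.

```python
from statistics import mean, stdev


def abnormal_volume(history, current, threshold=3.0):
    """Flag `current` if it sits more than `threshold` sample standard
    deviations above the mean of the historical daily counts."""
    if len(history) < 2:
        return False  # not enough history to estimate spread
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return current > mu
    return (current - mu) / sigma > threshold


# Made-up daily event counts per supplier.
supplier_history = {
    "Supplier X": [4, 5, 6, 5, 4, 6, 5],
    "Supplier Y": [10, 12, 11, 9, 10, 11, 12],
}
today = {"Supplier X": 19, "Supplier Y": 12}

flagged = [
    name
    for name, counts in supplier_history.items()
    if abnormal_volume(counts, today[name])
]
```

In a column-based interface, a flag like this could be shown beside each supplier so that unusual activity is visible before the user drills in.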
Next Steps
We’re early in our journey of visualizing Forge.AI knowledge in a way that is inspiring and enabling for our customers. A major area for future work is the continued expansion and improvement of our filtering experiences. Filtering is a crucial configuration step that requires contextual knowledge enrichment, and our current UI prototypes will require significant additional iteration and testing. We’ll also be evaluating different visualizations of our knowledge about the relationships between various entities. Finally, we’ll be heavily focused on exploring new and different representations of the structured event data itself, as well as the ontologies that we use to classify events, in order to enable various use cases for our customers.
The concepts and prototypes that have been highlighted here will continue to evolve as we learn more from our users. Some concepts will grow into core aspects of our user experience, while others will be invalidated and shelved. Here on the blog, we’ll come back to the topic of data visualization from time to time as we iterate on our product experience, showcasing both our successes and our lessons learned.
Conclusion
Forge.AI data is consumed directly through an API or agent and is leveraged by all sorts of downstream systems; these systems don’t necessarily visualize our data directly. In fact, we have designed our data to be optimal for consumption into downstream data modeling processes. Even so, our ability to deliver interactive visualizations of our data is crucial. Our visualizations educate customers about the nature of our data, reveal powerful uses of it through discovery and exploration, and enable the efficient streaming of data to customers.
As we’ve learned more about our customers’ goals, needs, frustrations, and impediments, we’ve begun to identify a set of visual experiences that support specific user needs at particular moments in time. We’ve found that contextual displays of data and the progressive revelation of different aspects of the data have been the most promising approaches in our early concepts and prototypes.