A Post on P6
A declarative language for machine learning in visual analytics.
In their 2020 IEEE paper, Jianping Kelvin Li and Kwan-Liu Ma present P6, a declarative language for building high-performance visual analytics systems. In computer science, a declarative language is one in which (ideally) a program specifies what is to be done rather than how to do it. In their paper, Li and Ma show that “P6 can empower more developers to create visual analytics applications that combine machine learning and visualization methods for data analysis and problem solving.”
P6 is designed to be simple and intuitive across a variety of analytics applications. The language supports many common machine learning and visualization methods, and its design is motivated by three major goals:
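To make the "what, not how" distinction concrete, here is a minimal Python sketch. It is not P6 syntax: the `records` data, the `spec` dictionary, and the `run_spec` interpreter are all hypothetical names invented for this illustration. The imperative function spells out every step, while the declarative version only states the desired result and leaves the steps to a generic interpreter.

```python
from statistics import mean

# Hypothetical records, invented for this illustration.
records = [
    {"country": "A", "cases": 10},
    {"country": "A", "cases": 30},
    {"country": "B", "cases": 5},
]

# Imperative style: spell out how to group and average, step by step.
def average_cases_imperative(rows):
    totals, counts = {}, {}
    for row in rows:
        key = row["country"]
        totals[key] = totals.get(key, 0) + row["cases"]
        counts[key] = counts.get(key, 0) + 1
    return {key: totals[key] / counts[key] for key in totals}

# Declarative style: state what is wanted and let an interpreter decide how.
# This spec format is an assumption for the sketch, not the P6 grammar.
spec = {"group_by": "country", "aggregate": {"field": "cases", "op": "mean"}}

OPS = {"mean": mean, "sum": sum, "max": max}

def run_spec(rows, spec):
    """Generic interpreter: groups rows, then applies the requested operation."""
    field = spec["aggregate"]["field"]
    groups = {}
    for row in rows:
        groups.setdefault(row[spec["group_by"]], []).append(row[field])
    op = OPS[spec["aggregate"]["op"]]
    return {key: op(values) for key, values in groups.items()}
```

Changing the analysis in the declarative version means editing the specification (for example, switching `"mean"` to `"max"`) rather than rewriting the loop.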
- Interactive machine learning and visualization: Machine learning methods are commonly used in visual analytics for automated analysis whereas interactive visualizations support human-guided visual analysis. P6 is designed to support both of these components.
- Interactive and scalable systems: Visual analytics demands high-performance systems for processing and visualizing large datasets. Thus, P6 leverages parallel processors and distributed computing to facilitate effective analysis processes.
- Declarative visual analytics: Declarative grammars are useful for creating interactive visualization applications because they defer implementation and focus on specification.
P6's design is based on a visual analytics generation framework. The language renders data by performing operations in a visualization pipeline model. It employs distributed computing to perform scalable data processing and machine learning.
The authors demonstrate the use of P6 in an application involving global COVID-19 cases. Using data provided by Johns Hopkins University, the authors apply change point detection to find time points where the spread rate of COVID-19 changes sharply. Change point detection identifies large shifts in time series trends that can be easily spotted by the human eye but are harder to pinpoint using traditional statistical approaches. The declarative specification for using change point detection to analyze the COVID-19 dataset is shown in the figure below.
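For readers new to the idea, change point detection can be sketched in a few lines of Python. This is a simple sliding-window mean-shift detector chosen for illustration, not the algorithm used by the authors or by P6. The function names, the `window` and `threshold` parameters, and the sample series are all assumptions made for this sketch.

```python
def mean_shift_scores(series, window):
    """Score each index by how much the mean changes across it."""
    scores = {}
    for i in range(window, len(series) - window + 1):
        before = sum(series[i - window:i]) / window
        after = sum(series[i:i + window]) / window
        scores[i] = abs(after - before)
    return scores

def detect_change_points(series, window=3, threshold=5.0):
    """Return indices whose score exceeds the threshold and is a local peak."""
    scores = mean_shift_scores(series, window)
    points = []
    for i, score in scores.items():
        left = scores.get(i - 1, 0.0)
        right = scores.get(i + 1, 0.0)
        # Keep only the peak so one shift is not reported several times.
        if score > threshold and score >= left and score > right:
            points.append(i)
    return points

# Hypothetical daily new-case counts with one abrupt jump at index 5.
daily_cases = [1, 1, 1, 1, 1, 10, 10, 10, 10, 10]
change_points = detect_change_points(daily_cases)
```

A larger `window` smooths out noise at the cost of reacting more slowly, and `threshold` controls how large a shift must be before it counts as a change point.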
Machine learning algorithms often have high computational complexity, which can easily yield poor performance on large datasets. By using progressive visual analytics, P6 provides an effective approach for interactive analysis of big data: it delivers incremental results to maintain an acceptable level of responsiveness. Additionally, progressive visual analytics systems allow users to interact with the results and steer the process. Another advantage of the P6 declarative language is that programmers do not need to set up a server or remote services to perform significant analysis. However, P6 does not prevent them from doing so: a server or remote services can still be used to support an application and perform the necessary processing.
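The idea of delivering incremental results can be illustrated with a small Python generator. This is only a sketch of the general progressive pattern, not how P6 implements it. The `progressive_mean` function and its `chunk_size` parameter are hypothetical names chosen for this example.

```python
def progressive_mean(values, chunk_size=2):
    """Yield (items_seen, running_mean) after each chunk is processed."""
    total, count = 0.0, 0
    for start in range(0, len(values), chunk_size):
        chunk = values[start:start + chunk_size]
        total += sum(chunk)
        count += len(chunk)
        # A partial answer is available before the whole dataset is read.
        yield count, total / count

# Each step refines the estimate as more data arrives.
updates = list(progressive_mean([2, 4, 6, 8, 10], chunk_size=2))
```

Because the results arrive one chunk at a time, a caller can redraw a chart after each update or stop iterating early once the estimate looks stable, which is the kind of steering that progressive systems make possible.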
References
- Li, J. K., & Ma, K.-L. (2020). P6: A declarative language for integrating machine learning in visual analytics. IEEE Transactions on Visualization and Computer Graphics. https://pubmed.ncbi.nlm.nih.gov/33125330/