DATA STORIES | BOOK RATING APP | KNIME ANALYTICS PLATFORM

How to Build a Book Rating Data App with KNIME

Using the Refresh Button Widget node to add interactivity to your dashboards

Ivan Prigarin
Low Code for Data Science


Photo by Alfons Morales on Unsplash.

Note 1: The content of this blog post applies to KNIME Analytics Platform version 4.4.

Note 2: This blog post is accompanied by a collection of workflows that correspond to each section. Each workflow sequentially extends the previous one, with the final workflow representing the complete application.

Preamble: Recommendation Engines and Data Collection

Data collection is an important part of any data science project. If you want to develop a recommendation engine, you first need to get a hold of some data. If KNIME Analytics Platform is your preferred tool for data science, now you can create a data collection application without even leaving it, code-free.

Eventually, we would like to create a recommendation system for books, which would use data collected with an application similar to the one discussed in this article to suggest new books to users based on their previously read books.

Assuming that an initial book database is available, a book rating application needs the following features:

  • Search. You should be able to search the book directory using title, author, or ISBN as keywords.
  • Rate & Review. After searching for a book, you should be able to assign a rating to selected books from the search results.
  • Correct. If you realize you made an error, the app must offer you the chance to change/remove existing ratings.

You can preview the data app that we will build in Figure 1. In accordance with the steps outlined above, in the animation we first search for a book and give it a rating of 5. Then we transition to the screen showing our rated books, where we first change the rating from 5 to 4, and then remove the book altogether.

Fig. 1. The three steps of a book rating system as implemented in the final application.

In this article, we describe all the steps required to implement such a rating application, code-free, using KNIME Analytics Platform. The full set of workflows implemented for this solution can be found in the “Book Rating Data App” space on the KNIME Hub.

The key to such an application is the Refresh Button Widget node. Its recent introduction has made it possible to build increasingly complex single-page applications, the data apps, in KNIME Analytics Platform. Briefly put, the Refresh Button Widget displays a button on the page that triggers on-demand re-execution of all the nodes downstream from it. For instance, a new selection in the view of the Scatter Plot node can be propagated down to a Tile View node within the same page instance.

Initial setup

Before moving on with the implementation of the application, let’s go through the foundational elements upon which the project will be built.

The dataset

For this project, we will be using the Goodreads dataset available on Kaggle. It contains over 11,000 books with features like Title, Author, ISBN, among others.

In our case, we will focus only on the English-language titles: we can use the Row Filter node with a simple regular expression as the matching criterion to filter out books in other languages. The Column Filter node will let us keep only the columns that we are interested in. Finally, we will save the processed data locally into a KNIME table to allow for easy access down the road. The workflow “1. Data Preprocessing” is available on the KNIME Hub.
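For readers who think in code, the two filtering steps can be sketched in plain Python. This is only an illustration of the idea, not what the KNIME nodes do internally. The column names (`language_code`, `title`, `authors`, `isbn`), the sample rows, and the assumption that English codes look like "eng" or "en-US" are all hypothetical.

```python
import re

# Hypothetical sample rows; the ISBN values are placeholders, not real ISBNs.
books = [
    {"title": "Dune", "authors": "Frank Herbert", "isbn": "isbn-0001",
     "language_code": "eng", "num_pages": "604"},
    {"title": "Rayuela", "authors": "Julio Cortázar", "isbn": "isbn-0002",
     "language_code": "spa", "num_pages": "600"},
    {"title": "Emma", "authors": "Jane Austen", "isbn": "isbn-0003",
     "language_code": "en-GB", "num_pages": "474"},
]

# Assumed convention: English is coded as "eng" or as "en-" plus a region.
ENGLISH = re.compile(r"^en(g|-[A-Z]{2})$")

# Columns to keep, playing the role of the Column Filter node.
KEEP = ("title", "authors", "isbn")


def preprocess(rows):
    """Keep English-language rows, then keep only the columns of interest."""
    return [
        {col: row[col] for col in KEEP}
        for row in rows
        if ENGLISH.match(row["language_code"])
    ]


english_books = preprocess(books)
```

Here `english_books` holds the two English-language rows with three columns each, ready to be stored for the next step.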

Fig. 2. Dataset preprocessing.

Data storage

Our data app will need to be able to efficiently retrieve, update, and perform some trivial queries on the data. An adequate solution is to use an SQLite database, which is portable, lightweight, and easy to set up. Note that any other database would work as well.

We will need three tables within the database to reflect the functional structure of the data app: “books”, “users”, and “ratings” (Figure 3). Using the preprocessed dataset from the previous section, we can populate the “books” table right away, while the other two tables will be gradually filled throughout multiple workflow executions.

The workflow “2. Database Setup” that populates the database is also available on the KNIME Hub.

Fig. 3. The tables and the workflow for setting up our SQLite database.

At the end of this sequence, we will have an SQLite file containing the book directory, as well as the additional tables needed to implement the functionality of our data app.
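To make the three-table layout concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The column names, the key choices, and the 1 to 5 rating range are assumptions made for illustration. The actual schema is the one shown in Figure 3 and in the “2. Database Setup” workflow.

```python
import sqlite3

# Assumed schema: column names and constraints are illustrative only.
SCHEMA = """
CREATE TABLE IF NOT EXISTS books (
    book_id INTEGER PRIMARY KEY,
    title   TEXT NOT NULL,
    author  TEXT NOT NULL,
    isbn    TEXT UNIQUE
);
CREATE TABLE IF NOT EXISTS users (
    user_id  INTEGER PRIMARY KEY,
    username TEXT NOT NULL UNIQUE
);
CREATE TABLE IF NOT EXISTS ratings (
    user_id INTEGER NOT NULL REFERENCES users(user_id),
    book_id INTEGER NOT NULL REFERENCES books(book_id),
    rating  INTEGER NOT NULL CHECK (rating BETWEEN 1 AND 5),
    PRIMARY KEY (user_id, book_id)
);
"""


def create_database(path=":memory:"):
    """Create the three tables and return the open connection."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


conn = create_database()

# Only "books" is populated up front; "users" and "ratings" fill up later.
conn.executemany(
    "INSERT INTO books (title, author, isbn) VALUES (?, ?, ?)",
    [("Dune", "Frank Herbert", "isbn-0001"),
     ("Emma", "Jane Austen", "isbn-0002")],
)
conn.commit()
```

The composite primary key on `ratings` is one possible way to guarantee that a user has at most one rating per book, which keeps later updates simple.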

Implementation

The data app relies on being able to navigate from one screen to another, with dynamic content that depends on the actions performed on other screens (e.g., rating a book while in search mode will make it appear in the list of rated books, which is located on another screen). With the semantics defined, and the data-related foundation set, we can start implementing the wireframe of the application in the KNIME Analytics Platform.

The process can be separated into four stages:

  1. Building the navigation panel,
  2. Building the search engine,
  3. Implementing the rating functionality for the search results,
  4. Building a view for rated books, with an option to change/remove them.

Let’s have a look at each step.

The navigation panel

The navigation panel can be populated using the Single Selection Widget node, which, upon being triggered by pressing the attached Refresh Button (“Switch”), will yield a flow variable specifying the screen to be displayed.

We can then use the CASE Switch Variable node for control flow, displaying page titles with the Text Output Widget nodes for the appropriate selection. Because of the limit on the number of outputs of the CASE Switch nodes in version 4.4 (the number is adjustable as of KNIME Analytics Platform 4.5), we separate the flow into two conceptual branches: one for the rated books view, and one for the three search views, which can then be separated into their own data streams accordingly.

Fig. 4. The workflow implementing the navigation panel.

Lastly, in order to display the widgets’ items, we need to encapsulate the workflow into a component. To make sure that everything is correctly placed in the composite view of the component, we can arrange the widgets within the Composite View Layout tab in the “Node Usage and Layout” dialogue.

Thanks to the CASE Switch nodes, only the views that are connected to the active output port will be visible, thus enabling the interactivity seen in Figure 5.
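The two-level branching described above can be pictured as a small routing function. This is a conceptual sketch in Python, not KNIME code, and the screen labels are hypothetical stand-ins for the options of the Single Selection Widget.

```python
# Hypothetical screen labels offered by the navigation panel.
SCREENS = ("Search by title", "Search by author", "Search by ISBN", "Rated books")

SEARCH_PREFIX = "Search by "


def active_branch(selection):
    """Return (top-level branch, search keyword type) for a selected screen.

    The first switch separates the rated books view from the search views;
    the second switch picks one of the three search views.
    """
    if selection not in SCREENS:
        raise ValueError(f"unknown screen: {selection!r}")
    if selection == "Rated books":
        return ("rated", None)
    return ("search", selection[len(SEARCH_PREFIX):].lower())


route_isbn = active_branch("Search by ISBN")
route_rated = active_branch("Rated books")
```

Only the widgets on the returned branch would be rendered, which is the effect the active output port has in the component.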

Fig. 5. The navigation panel in action.

This serves as the wireframe for the rest of the data app — each branch will hold the corresponding encapsulated component representing each feature set.

The complete workflow “3. Navigation Panel” for this section can be found on the KNIME Hub.

The search engine

At the core of our data app is the ability to search the book directory using three types of keywords: title of the book, its author, or its ISBN. Just like when defining reusable functions in a programming language, KNIME Analytics Platform allows you to create reusable, configurable sections of your workflow — components. In our case, we would like to build a universal component (let’s name it “Search”) that provides a search view and, depending on the search keyword, appropriately configures the encapsulated nodes with the help of flow variables. On the outside, this is an organic extension of the project segment we built earlier (compare Figures 4 and 6).

Fig. 6. The workflow implementing the search functionality. The reusable “Search” component replaces the placeholder Text Output Widget nodes we saw in Figure 4.

Just as most search engines do, we would like ours to provide suggestions as the user types. Inside the “Search” component, the Autocomplete Text Widget allows us to do precisely that, by indexing the provided table column and turning the data rows into possible values to be suggested.

The “Search” component accesses the SQLite database through the search query string. As a bonus, we don’t need to worry about our database queries returning empty-handed, since we can configure the Autocomplete Text Widget to only accept strings that it suggests.
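As a rough illustration of how one lookup can serve all three keyword types, the sketch below picks the column from a fixed mapping and passes the search string as a query parameter. The table layout, the mapping, and the function name are assumptions; inside KNIME the same effect is achieved with flow variables and database nodes.

```python
import sqlite3

# Hypothetical mapping from keyword type to a column of the "books" table.
# Using a fixed mapping avoids building SQL from arbitrary user input.
SEARCH_COLUMNS = {"title": "title", "author": "author", "isbn": "isbn"}


def search_books(conn, keyword_type, query):
    """Return books whose chosen column equals the query string."""
    column = SEARCH_COLUMNS[keyword_type]  # KeyError for unknown types
    sql = f"SELECT title, author, isbn FROM books WHERE {column} = ? ORDER BY title"
    return conn.execute(sql, (query,)).fetchall()


# Small in-memory table with placeholder ISBN values for demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE books (title TEXT, author TEXT, isbn TEXT)")
conn.executemany(
    "INSERT INTO books VALUES (?, ?, ?)",
    [("Dune", "Frank Herbert", "isbn-0001"),
     ("Emma", "Jane Austen", "isbn-0002")],
)

by_author = search_books(conn, "author", "Jane Austen")
by_isbn = search_books(conn, "isbn", "isbn-0001")
```

An exact match is sufficient in this sketch because the widget is configured to accept only strings it has suggested, so the query string always exists in the indexed column.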

Once again, we make use of the CASE Switch nodes for control flow. Note that, besides the expected usage where you enclose a portion of your workflow inside a pair of Start and End CASE Switch nodes, they bear a layer of additional functionality. For instance, you don’t have to complement a Start node with an End node, leaving the branches to roam freely. You can also combine the types of Switch nodes, e.g. Variable with Data, as seen in Figures 6 and 7.

Fig. 7. The inside of the “Search” component, which automatically adapts the nodes within for each search keyword type: title, author, or ISBN.

You can see the search engine in action in Figure 8 below.

Fig. 8. Demonstration of the search engine that dynamically provides suggestions based on user input.

The number of rows indexed by the Autocomplete Text Widget node is adjustable in the configuration window of the node. In our case, we have 10,537 books in the database, so we raised the “Maximum number of rows” setting from the default 2,500 to 11,000.
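The effect of the row cap is easy to see in a toy model of autocompletion. The sketch below is not how the Autocomplete Text Widget is implemented; prefix matching, the function names, and the tiny cap used in the demonstration are assumptions chosen to show why values beyond the cap can never be suggested or accepted.

```python
def build_index(values, max_rows=2500):
    """Index only the first max_rows values, mimicking a row cap."""
    return list(values)[:max_rows]


def suggest(index, typed, limit=5):
    """Return up to `limit` indexed values that start with the typed text."""
    needle = typed.strip().lower()
    if not needle:
        return []
    return [v for v in index if v.lower().startswith(needle)][:limit]


def accept(index, text):
    """Accept only strings that are present in the index."""
    return text in index


titles = ["Dune", "Dubliners", "Emma"]

# With a cap of 2, the third title is never indexed.
small = build_index(titles, max_rows=2)
# With a cap that covers every row, all titles can be suggested.
full = build_index(titles, max_rows=3)

suggestions_small = suggest(small, "em")
suggestions_full = suggest(full, "em")
```

With the cap below the number of rows, "Emma" is silently missing from the suggestions, which is the situation the larger setting avoids.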

The complete workflow “4. Search Engine” for this section can be found on the KNIME Hub.

Rating the search results

With the search functionality implemented, we can focus on the next phase of the project — the rating system. While there are a multitude of options available in the KNIME Analytics Platform when it comes to displaying and selecting data, we will utilize the Labeling View node. Built for Active Learning, this node allows users to dynamically apply labels to items in a given dataset. This exactly replicates the functionality that we would like to implement. All we need to do is to correctly define the labels to represent the rating options. By using flow variables, we can configure various node parameters, including the list of possible label values (see Figure 9). You can learn more about flow variables in the “Using Flow Variables” part of the KNIME documentation.

Fig. 9. We can specify the possible label values using a string-collection flow variable.

With the rating options mapped to labels, the search results, which are fed into the Labeling View node, will be displayed in a tile-like fashion with the labels presented as interactive buttons. By attaching a Refresh Button to the Labeling View node, we can propagate the titles that were rated downstream, to then perform a database update.

A small addition to the application is a login screen that is presented once the application is launched. With a String Widget node, we can obtain the username of the current user in order to match their ratings inside the database. While this simple solution is enough for this particular example, note that proper authentication via, for instance, the Credentials Widget node would be the preferred way of doing this in production.
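The database update that follows a rating can be sketched as below, assuming a simplified version of the “users” and “ratings” tables. The function names, the column names, and the use of `INSERT OR REPLACE` are illustrative choices, not a description of the nodes inside the “Rate” component.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (user_id INTEGER PRIMARY KEY, username TEXT NOT NULL UNIQUE);
CREATE TABLE ratings (
    user_id INTEGER NOT NULL,
    book_id INTEGER NOT NULL,
    rating  INTEGER NOT NULL,
    PRIMARY KEY (user_id, book_id)
);
""")


def get_or_create_user(conn, username):
    """Return the id for a username, creating the user on first login."""
    conn.execute("INSERT OR IGNORE INTO users (username) VALUES (?)", (username,))
    row = conn.execute(
        "SELECT user_id FROM users WHERE username = ?", (username,)
    ).fetchone()
    return row[0]


def save_ratings(conn, username, labeled):
    """Store labels as ratings; `labeled` maps book_id to a label such as "5"."""
    user_id = get_or_create_user(conn, username)
    conn.executemany(
        # The composite primary key makes a repeated rating overwrite the old one.
        "INSERT OR REPLACE INTO ratings (user_id, book_id, rating) VALUES (?, ?, ?)",
        [(user_id, book_id, int(label)) for book_id, label in labeled.items()],
    )
    conn.commit()


save_ratings(conn, "alice", {1: "5", 2: "3"})
save_ratings(conn, "alice", {1: "4"})  # rating the same book again replaces it
```

Keying the ratings on the pair of user and book means that a second rating for the same book updates the existing row instead of adding a duplicate.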

As before, we utilize the now-familiar combination of the Rule Engine and CASE Switch nodes for control flow. You can see the new “Rate” component we implemented in this section in Figure 10, as well as an animation demonstrating the rating process in Figure 11.

The corresponding workflow “5. Rating Search Results” is also available on the KNIME Hub.

Fig. 10. The implementation of the rating functionality inside the “Rate” component.
Fig. 11. Performing the search-and-rate routine using the “Rate” component.

Viewing rated books and changing/removing ratings

The final part of the project involves implementing the ability to view the books the user has rated, and change or remove their ratings. With this part implemented, we will have completed the feature outline we set at the beginning of this blog post, resulting in an extensive example of building a single-page application with dynamic elements in KNIME Analytics Platform using the Refresh Button Widget node.

The particular way we are going to implement the Rated Books view entails a distinguishing technical detail, which will result in an additional layer of responsiveness at no cost to performance. When a user of an interactive application is presented with a list of items and an option to edit that list, they would expect to see the changes reflected in the list immediately after performing some action on it. In our case, we would query the rated books from the database and display them in a Table View node, then use a Refresh Button to propagate a selection from the View downstream to perform a database update (change or remove the rating for a particular book). While the database would now be up-to-date, the Table View node would still be displaying its previous state. Let’s have a look at how we can bridge this gap between the Table View and the database.

The flow of data in this particular section of the workflow is as follows:

  1. fetch a user’s rated books from the database,
  2. display them in a Table View,
  3. capture the selection from the Table View and update the database with the new rating, or remove that entry altogether.

To close the loop, we just need one more step — redraw the Table View using the updated data. And a loop is indeed what we can use, namely, the Counting Loop (Figure 12).

Fig. 12. The “Rated books” component enclosed inside a Counting Loop.

By setting the number of iterations to 2, we can perform two passes through the “Rated books” component every time the user presses the Refresh Button to update or remove a rating:

  • During the first pass, the user selections from the Table View and from the Single Selection Widget are propagated downstream and used to update the database.
  • The second pass reads the new state of the database, and re-renders the Table View to display it.

This way, any change the user performs will be instantly reflected in the Table View. Inside, we use a series of CASE Switch nodes to control which action to perform based on a number of flow variables.
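The write-then-read pattern behind the two iterations can be sketched in Python as follows. The table layout, the action format, and the function names are assumptions made for illustration; in the workflow the two passes are the iterations of the Counting Loop around the “Rated books” component.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE ratings (
    username TEXT NOT NULL,
    book     TEXT NOT NULL,
    rating   INTEGER NOT NULL,
    PRIMARY KEY (username, book)
)
""")
conn.executemany(
    "INSERT INTO ratings VALUES (?, ?, ?)",
    [("alice", "Dune", 5), ("alice", "Emma", 3)],
)


def fetch_rated(conn, username):
    """Read the current list of rated books for the table view."""
    return conn.execute(
        "SELECT book, rating FROM ratings WHERE username = ? ORDER BY book",
        (username,),
    ).fetchall()


def apply_action(conn, username, action):
    """Change or remove one rating, depending on the action type."""
    if action["type"] == "change":
        conn.execute(
            "UPDATE ratings SET rating = ? WHERE username = ? AND book = ?",
            (action["rating"], username, action["book"]),
        )
    elif action["type"] == "remove":
        conn.execute(
            "DELETE FROM ratings WHERE username = ? AND book = ?",
            (username, action["book"]),
        )
    conn.commit()


def refresh(conn, username, action):
    """Two passes per button press: write first, then re-read for display."""
    view = fetch_rated(conn, username)
    for iteration in range(2):  # plays the role of a loop with 2 iterations
        if iteration == 0:
            apply_action(conn, username, action)  # pass 1: update the database
        else:
            view = fetch_rated(conn, username)    # pass 2: re-render the view
    return view


after_change = refresh(conn, "alice", {"type": "change", "book": "Dune", "rating": 4})
after_remove = refresh(conn, "alice", {"type": "remove", "book": "Emma"})
```

Without the second pass, `view` would still hold the rows read before the update, which is exactly the stale table described above.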

Figure 13 demonstrates the final part of the project in action.

The corresponding workflow “6. Viewing Rated Books” is also available on the KNIME Hub.

Fig. 13. The Table View instantly reflects the changes committed to the underlying database.

Final remarks

With this example, we hope to have demonstrated the capabilities of KNIME Analytics Platform for building extensively interactive data apps for data collection, or for any other purpose. The high-level concepts, as well as certain implementation details, can be applied to a vast range of applications to increase their functionality, responsiveness, and versatility.

Moreover, these data apps, while perfectly capable when run locally using the Analytics Platform, can be extended to be run in the web browser when deployed to a KNIME Server, with a swathe of additional benefits, such as increased performance, shareability, and monitoring.

As already mentioned, all the workflows in this article are available in the corresponding KNIME Hub space “Book Rating Data App”.

What website will you replicate with your data app?
