Visualization with Machine Learning

Harshal Bangar
Globant
7 min read · Dec 23, 2022

Business Intelligence (BI) and Machine Learning (ML) are among the leading technologies for business-driven advanced analytics. BI visualization tools can handle complex calculations, but they are limited compared to Python-driven ML. This is where visualization with ML comes into the picture: it lets you incorporate more advanced analytics and ML algorithms into your already powerful visualizations. In this blog, we discuss the integration of ML with data visualization.

What is Data Visualization?

Data visualization is the practice of translating information into a visual context, such as a map or graph, so that data of any kind is easier for the human brain to understand and draw insights from.

The main goal of data visualization is to make it easier to identify patterns, trends, and outliers in large data sets.

What can BI do for businesses?

BI technologies use advanced statistics and predictive analytics to help businesses extract conclusions from data analysis, discover patterns and forecast future events in business operations. BI reporting is not a linear practice; rather, it is a continuous, multifaceted cycle of data access, exploration, and information sharing.

What can ML do for businesses?

ML can manage the heavy data lifting necessary for businesses to get to the core of their performance. For example, ML algorithms can identify the factors contributing to and detracting from your brand health by analyzing your data from every angle. ML is unique because it can quickly identify relationships that may not be immediately apparent or intuitive to humans.

Why do we need to integrate ML into BI visualization tools?

  • BI software provides a powerful outlet for cleaning, curating, and visualizing data. Coupled with ML technology, it can help business users uncover a layer of insight often overlooked by even the most experienced analysts.
  • ML can quickly perform tasks that are too tedious for humans to do. Image and speech recognition are the most vivid examples: given sufficient training, ML achieves quite good results on them.

Integrating ML algorithms into Business Intelligence reports brings the following advantages:

  • Automation of tasks: ML can handle routine tasks, freeing up analysts and consultants to be more productive with their time.
  • Data Quality: ML systems can operate with limited to no human intervention and can therefore decide on their own to auto-correct mistakes and deal with data issues.
  • Self-Service: Business intelligence tool sets will reach new levels of self-service, reducing the need for technical knowledge and offering simpler ways of interacting with data. Systems will be able to learn based on a user’s preferences and previous activities, offering a tailored service for each user.

Integrating Machine Learning in BI Tools

Integrating Python in BI tools is an excellent feature, as you can show what your Python code is doing and how it connects to data in the form of visualizations. With the growth of cross-functional teams, this is a breakthrough for BI, data analyst, and data scientist roles.

Power BI and Tableau are among the leading BI visualization tools for business-driven advanced analytics.

Here we focus on Tableau's Python integration, covering a business use case and how to install it on your system.

TabPy (Tableau Python integration): TabPy allows you to incorporate more advanced analytics, such as time series and ML algorithms, into your already powerful visualizations.

Tableau + Python integration

So what exactly is TabPy? It is an analytics extension implementation that expands Tableau's capabilities by allowing users to execute Python scripts and saved functions via Tableau's table calculations.

When do you use TabPy? Whenever you want to define calculated fields in Python and leverage the large number of ML libraries straight from your visualization platform. This Python integration makes Tableau considerably more powerful for advanced analytics.
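To make the calculated-field idea concrete, here is a minimal sketch of what the Python body of a script calculation can look like. Tableau sends each field argument to TabPy as a list (`_arg1`, `_arg2`, and so on) and expects a list of the same length, or a single value, in return. The function name `profit_ratio` and the Profit and Sales fields are hypothetical, chosen only for illustration.

```python
# Hypothetical Tableau calculated field (shown as a comment; the field
# names Profit and Sales are assumptions, not taken from the article):
#
#   SCRIPT_REAL("
#       return [p / s if s else 0.0 for p, s in zip(_arg1, _arg2)]
#   ", SUM([Profit]), SUM([Sales]))
#
# The function below mirrors that script body so it can be tried
# outside Tableau with plain Python lists.

def profit_ratio(_arg1, _arg2):
    """Return profit / sales for each row, using 0.0 where sales is zero."""
    return [p / s if s else 0.0 for p, s in zip(_arg1, _arg2)]


# Each list position corresponds to one row (mark) sent by Tableau.
ratios = profit_ratio([20.0, 5.0, 0.0], [100.0, 0.0, 50.0])
print(ratios)
```

Keeping the logic in an ordinary function like this makes it easy to test before pasting the body into a calculated field.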

Benefits of TabPy

In this setup, TabPy runs in the popular Anaconda environment, which comes with many Python packages such as Pandas, NumPy, and Sklearn preinstalled and ready to use. You can also install any other Python library that your scripts need.

TabPy Use Cases

TabPy makes it possible to use Python scripts in Tableau calculated fields. When you pair Python's ML capabilities with the power of Tableau, you can rapidly develop advanced-analytics applications that can aid in various business tasks.
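Besides inline scripts, TabPy can also serve saved functions. The sketch below shows one way this might look: a small standardization function plus a helper that publishes it through the TabPy client. The function name `zscores`, the endpoint name, and the Sales field are assumptions made for this example, and `deploy_to_tabpy` only works with the `tabpy` package installed and a server running at the given URL.

```python
def zscores(values):
    """Standardize a list of numbers; every row is 0.0 if there is no spread."""
    # Imports live inside the function so it stays self-contained when deployed.
    from statistics import mean, pstdev

    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def deploy_to_tabpy(url="http://localhost:9004/"):
    """Publish zscores to a running TabPy server (needs the tabpy package)."""
    # Imported lazily so this sketch still loads where TabPy is not installed.
    from tabpy.tabpy_tools.client import Client

    client = Client(url)
    client.deploy("zscores", zscores, "Standardizes a numeric column", override=True)


# Hypothetical Tableau calculated field that calls the deployed function:
#
#   SCRIPT_REAL("return tabpy.query('zscores', _arg1)['response']", SUM([Sales]))

print(zscores([1.0, 2.0, 3.0]))
```

Deploying a function once keeps the calculated field short and lets several workbooks share the same logic.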

Sentiment Analysis using TabPy

Businesses today are heavily dependent on text data. The majority of this data is unstructured text coming from sources like emails, chats, social media, surveys, articles, and documents. Analyzing this text is important because it can reveal the sentiment of customers.
Instead of using traditional labeling, we can use natural language processing (NLP) from Python. The Natural Language Toolkit (NLTK), an NLP library, helps computers process unstructured text and retrieve meaningful pieces of information such as sentiment and opinion.

Proposed Solution

Valence Aware Dictionary and sEntiment Reasoner (VADER) is an open-source package available within NLTK. It is quite successful when dealing with social media texts. It relies on a sentiment lexicon: a list of lexical features (e.g., words) that are generally labeled according to their semantic orientation as either positive or negative. The lexicon approach means that the algorithm works from a pre-built dictionary containing a comprehensive list of sentiment features.
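To illustrate the lexicon idea only, here is a toy scorer that looks up each word in a small dictionary and sums the values. This is not VADER's algorithm: the words and weights in `TOY_LEXICON` are made up for the example, and VADER's real lexicon and rules are far richer.

```python
# Toy lexicon: word -> valence. These entries and weights are invented
# for illustration and are NOT taken from VADER's lexicon.
TOY_LEXICON = {
    "good": 2.0,
    "great": 3.0,
    "love": 3.0,
    "bad": -2.0,
    "terrible": -3.0,
    "hate": -3.0,
}


def lexicon_score(text, lexicon=TOY_LEXICON):
    """Sum the valence of every known word; unknown words contribute 0."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    return sum(lexicon.get(w, 0.0) for w in words)


for sentence in ["I love this great phone!", "Terrible battery, bad screen."]:
    print(sentence, "->", lexicon_score(sentence))
```

A positive total suggests positive sentiment and a negative total suggests negative sentiment, which is the basic intuition behind lexicon-based methods.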

VADER in Tableau Calculated Field

We can create an external service calculation, which Tableau calls a script function. We don't have to train anything to use this library. We pass in a list of sentences, here the Comment Text field, and apply sentiment analysis to it using the polarity_scores() method of the SentimentIntensityAnalyzer class. The method returns a dictionary of scores; its compound value is a float from -1 to 1 that expresses the sentiment strength of the input text.

Calculated field for Sentiment Intensity Analyzer
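Since the calculated field itself appears only as an image, here is a rough sketch of what such a script might contain. The Tableau expression in the comment is an assumption about how the field could be written, and the helper names `build_vader_analyzer` and `compound_scores` are hypothetical. Running the analyzer requires the `nltk` package and a one-time download of the VADER lexicon.

```python
# Possible Tableau calculated field (an assumption, shown as a comment):
#
#   SCRIPT_REAL("
#   from nltk.sentiment.vader import SentimentIntensityAnalyzer
#   sid = SentimentIntensityAnalyzer()
#   return [sid.polarity_scores(t)['compound'] for t in _arg1]
#   ", ATTR([Comment Text]))


def build_vader_analyzer():
    """Create NLTK's VADER analyzer (needs nltk and the vader_lexicon data)."""
    # Imported lazily so the rest of this sketch loads without nltk installed.
    import nltk
    from nltk.sentiment.vader import SentimentIntensityAnalyzer

    nltk.download("vader_lexicon", quiet=True)
    return SentimentIntensityAnalyzer()


def compound_scores(comments, analyzer):
    """Return one compound score in [-1, 1] per comment."""
    return [analyzer.polarity_scores(text)["compound"] for text in comments]


# Example usage, once nltk is installed:
#   analyzer = build_vader_analyzer()
#   print(compound_scores(["Great product!", "Stopped working in a week."], analyzer))
```

Passing the analyzer in as an argument keeps the scoring function easy to test and avoids rebuilding the analyzer for every call.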

Sentiment analysis on Amazon product reviews using TabPy

With TabPy, we can apply VADER sentiment analysis to Amazon product review data. We compute polarity scores on the Comment Text field to gauge customer sentiment for each piece of feedback. Using filters, we can isolate the negative reviews and read their content to understand the reasons behind them. ML supplies the NLP algorithm for sentiment analysis, and we can color-code its scores in the Tableau visualization for a better understanding.

Sentiment Analysis on Tableau
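The filtering and color-coding described above need some rule for turning a compound score into a category. A minimal sketch follows, assuming the commonly used VADER cutoffs of 0.05 and -0.05; the threshold, the function names, and the sample reviews are illustrative assumptions, not values taken from this dashboard.

```python
def sentiment_label(compound, threshold=0.05):
    """Map a compound score to 'positive', 'negative', or 'neutral'."""
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


def negative_reviews(reviews_with_scores, threshold=0.05):
    """Keep only the review texts whose score falls in the negative band."""
    return [
        text
        for text, score in reviews_with_scores
        if sentiment_label(score, threshold) == "negative"
    ]


# Made-up (text, compound score) pairs for illustration.
sample = [
    ("Works perfectly", 0.64),
    ("Broke after two days", -0.42),
    ("It is a charger", 0.0),
]
print([sentiment_label(score) for _, score in sample])
print(negative_reviews(sample))
```

In Tableau, the same banding can drive a color legend or a filter so that negative reviews stand out.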

Installing Anaconda on Windows

We are using Anaconda because it bundles a package manager, an environment manager, and a Python distribution containing many open-source packages. This is an advantage: on a data science project you will need many different packages (NumPy, Sklearn, Pandas), and an Anaconda installation comes with them preinstalled.

Go to the Anaconda website and choose either the Python 3 graphical installer (A) or the Python 2 graphical installer (B). If you aren't sure which Python version you want to install, choose Python 3. Do not choose both.

  • Memory: minimum of 16 GB RAM.
  • Storage: Recommended minimum of 100 GB.

After installing Anaconda on your local machine, you can run Jupyter and Spyder for Python coding. From the Start menu, search for Anaconda Prompt and open it. Execute the following commands to create a virtual environment and then activate it.

Anaconda command prompt for tabpy installation

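Run the commands below in the Anaconda Prompt. The first creates the virtual environment, and you will have to confirm the process for it. The second activates the environment, and the remaining two upgrade pip and install TabPy from the activated environment.

conda create --name virtualenv
conda activate virtualenv
python -m pip install --upgrade pip
pip install tabpy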
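After installation, a quick sanity check from Python can confirm that the packages are visible in the active environment. This is only a sketch: the helper name `missing_packages` is hypothetical, and the list of packages is an assumption based on the libraries mentioned in this post.

```python
from importlib.util import find_spec


def missing_packages(names):
    """Return the top-level package names that cannot be found for import."""
    return [name for name in names if find_spec(name) is None]


# Packages this post relies on; adjust the list to your own project.
wanted = ["tabpy", "numpy", "pandas", "sklearn", "nltk"]
print("Missing packages:", missing_packages(wanted))
```

An empty list means every package was found; anything listed can be installed with pip or conda in the same environment.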

The final step is getting your Tableau Desktop to communicate with the local process running TabPy. Open Tableau Desktop and navigate to Help > Settings & Performance > Manage External Service Connection.

Connecting Tableau with an extension connection

When you start the server by running tabpy in the activated environment, its confirmation message in the terminal window reports “web service listening on port 9004”. This tells you what to enter in the connection dialog: the server “localhost” and the port “9004”. If you click “Test Connection” and get the message “Successfully connected to the external service”, then congratulations! You are ready to use Python in Tableau.

TabPy localhost setup
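If the connection test fails, it can help to confirm from Python that something is actually listening on the TabPy port. The sketch below uses only the standard library; the helper name `is_listening` is hypothetical, and the default host and port simply mirror the values mentioned above.

```python
import socket


def is_listening(host="localhost", port=9004, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution errors.
        return False


print("TabPy port reachable:", is_listening())
```

This only checks that the port accepts connections. It does not verify that the process behind it is TabPy, so Tableau's own “Test Connection” button remains the real check.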

Conclusion

By integrating BI with ML, we open the door to many possibilities for automating and improving an existing data analytics setup. We can also use it to integrate deep learning models into an analytics dashboard, perform complex statistical tasks, and implement continuous integration and development.
