Data Science Series | Getting Started With Orange Tool

Karan Patel

Published in Geek Culture, Sep 11, 2021

Learn how to load and visualize data using the Orange tool.

What Is Orange Tool?

Orange is an open-source toolkit for data visualization, machine learning, and data mining. It comes with a visual programming front-end for exploratory data analysis and interactive data visualization, and it can also be used as a Python library. Orange is a component-based software suite that excels at machine learning and data mining, and at visualization in particular. Its components are called widgets, and they cover everything from visualization to pre-processing, evaluation, and predictive modeling.

Using Orange Tool, we can carry out the following tasks:

Display a data table and pick features.
Explore and analyze the data.
Compare and contrast learning algorithms and predictors.
Visualize data items.

Installation

Orange is available as an application in Anaconda Navigator once you install the Anaconda distribution on your machine. We can also download and install Orange explicitly by running the following command:

pip install orange3

If you’re using the Python distribution provided by Anaconda, run the following commands instead:

conda config --add channels conda-forge
conda install orange3
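To confirm that the installation worked, a small check like the one below can help. It is a minimal sketch that uses only the Python standard library. The function name `orange_status` is made up for illustration; `Orange` is the importable package and `Orange3` is the name of the distribution installed by the commands above.

```python
import importlib.util
from importlib import metadata


def orange_status():
    """Return the installed Orange version string, or None if Orange is missing."""
    # The importable package is called "Orange"; pip and conda call it "orange3".
    if importlib.util.find_spec("Orange") is None:
        return None
    try:
        return metadata.version("Orange3")
    except metadata.PackageNotFoundError:
        # The package is importable but its metadata could not be found.
        return "unknown"


if __name__ == "__main__":
    version = orange_status()
    if version is None:
        print("Orange is not installed in this environment.")
    else:
        print(f"Orange {version} is installed.")
```

If the check reports that Orange is missing, the most likely cause is that the command ran in a different Python environment than the one used for installation.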

For further information regarding downloads, see the Orange download page.

How To Use Orange Workflows

Orange workflows are made up of components, known as widgets, that read, analyze, and visualize data. Widgets are the building blocks of a data analysis workflow and are assembled in Orange’s visual programming environment. They are grouped into categories based on their purpose. A typical workflow combines widgets for data input and filtering, visualization, and predictive data mining.

Now, let’s build a workflow for the well-known Iris dataset. You can either use one of Orange’s built-in datasets or import one of your own.

Step-1: From the widget library, select the File widget.

Widget options in the left pane of the Orange window

After double-clicking the File widget, we need to select the dataset. Here we have used the Iris dataset.

Step-2: After importing the dataset with the File widget, drag the Data Info widget onto the canvas and link the two widgets by dragging a line from File to Data Info. This line creates a channel through which the data flows from one widget to the other.

The Data Info widget is used to get information about the data that has been loaded. It shows the dataset’s name, size (row and column counts), features, targets, description, and other data characteristics.
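Because Orange can also be used as a Python library, roughly the same facts can be collected in a script. The sketch below is an illustration, not the widget's own implementation: `summarize` and `load_iris` are hypothetical helper names, and the code assumes the `orange3` package is installed and that the bundled dataset is available under the name "iris".

```python
def summarize(table):
    """Collect roughly the same facts the Data Info widget displays."""
    domain = table.domain
    target = domain.class_var  # None when there is no single target variable
    return {
        "name": table.name,
        "rows": len(table),
        "features": [var.name for var in domain.attributes],
        "target": target.name if target is not None else None,
        "metas": [var.name for var in domain.metas],
    }


def load_iris():
    """Load the Iris dataset that ships with Orange."""
    # Imported here so that summarize() has no hard dependency on Orange.
    from Orange.data import Table

    return Table("iris")
```

Calling `summarize(load_iris())` should report 150 rows, four features, and a target named `iris`.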

Similarly, we connect the File widget to the Scatter Plot, Data Table, and Distributions widgets. In this manner, we can design a simple workflow in Orange.

We can also use the Select Columns and Select Rows widgets to filter out unnecessary columns and rows.

Select Columns

As shown here, we can move features into the available variables category or into the Target Variable category as required.
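In a script, the same kind of column selection can be expressed by building a new domain and transforming the table into it. This is a minimal sketch assuming `orange3` is installed; `select_columns` is a hypothetical helper name, and the feature and target names in the usage note are those of the bundled Iris dataset.

```python
def select_columns(table, feature_names, target_name=None):
    """Return a new table restricted to the named features and target."""
    from Orange.data import Domain  # requires the orange3 package

    old_domain = table.domain
    # Look up the existing variables by name so their types are preserved.
    features = [old_domain[name] for name in feature_names]
    target = old_domain[target_name] if target_name is not None else None
    return table.transform(Domain(features, target))
```

For example, `select_columns(Table("iris"), ["petal length", "petal width"], "iris")` keeps only the two petal measurements as features and `iris` as the target.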

Scatter Plot Widget

It is used to draw scatter plots between the desired features. We can change the features shown on the X-axis and Y-axis, find informative projections directly, and show colored class regions.

Data Table Widget

Use the Data Table widget to view your data in tabular format.

Distributions Widget

To get a graphical representation of the dataset’s values, use the Distributions widget. It makes the distribution of each feature easy to view. The image below shows the distribution of petal width split by the iris class. Similarly, you can combine different features and target variables to visualize your dataset.
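The numbers behind such a plot are binned counts of one feature, computed separately for each class. The sketch below shows that idea in plain Python and is not how the widget itself is implemented. The helper names are hypothetical, the bin width of 0.5 is an arbitrary choice, and `iris_petal_width_histogram` assumes `orange3` is installed.

```python
from collections import Counter, defaultdict


def split_histogram(values, labels, bin_width):
    """Count values per bin, separately for each label (like "Split by")."""
    counts = defaultdict(Counter)
    for value, label in zip(values, labels):
        # Identify each bin by its lower edge, e.g. 0.0, 0.5, 1.0, ...
        bin_start = (value // bin_width) * bin_width
        counts[label][round(bin_start, 6)] += 1
    return {label: dict(sorted(bins.items())) for label, bins in counts.items()}


def iris_petal_width_histogram(bin_width=0.5):
    """Binned petal width counts per iris class, using Orange's bundled data."""
    from Orange.data import Table  # requires the orange3 package

    data = Table("iris")
    column = data.domain.index("petal width")
    class_var = data.domain.class_var
    values = [float(v) for v in data.X[:, column]]
    # data.Y stores class indices; map them back to the class names.
    labels = [class_var.values[int(y)] for y in data.Y]
    return split_histogram(values, labels, bin_width)
```

Bin widths that are exactly representable in floating point, such as 0.5 or 0.25, avoid values landing in a neighboring bin because of rounding.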

How to load your own data or external data from an API in Orange?

To load tabular data from a URL or from the local machine, we can use the File widget itself. In the image below, we can see that it offers two options, File and URL. To load data from the local machine, choose File and click the folder icon to its right to browse for the dataset.

To import a dataset that is available online, enter its URL in the URL field. To import a dataset that is already downloaded on your machine, click the browse option next to the Reload option, select the appropriate dataset, and load it into your workflow. Alternatively, we can use the CSV File Import widget to import CSV files.
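The same two options exist when Orange is used from Python. The following is a minimal sketch assuming `orange3` is installed; `load_table` is a hypothetical wrapper, while `Table.from_file` and `Table.from_url` are the library's loaders for local files and URLs.

```python
def load_table(source):
    """Load tabular data from a local path or an http(s) URL."""
    from Orange.data import Table  # requires the orange3 package

    if source.startswith(("http://", "https://")):
        # Mirrors the URL field of the File widget.
        return Table.from_url(source)
    # Mirrors browsing for a file on the local machine.
    return Table.from_file(source)
```

A local CSV file with a header row of column names can be loaded this way, and the resulting table can then be inspected through its `domain` attribute.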