Getting Started With Orange Tool

Varun
4 min read · Sep 27, 2021

What Is Orange Tool?

Orange is an open-source toolkit for data visualization, machine learning, and data mining. It features a visual programming front end for exploratory data analysis and interactive data visualization, and it can also be used as a Python library. Orange is component-based: its components, called widgets, cover everything from visualization and pre-processing to evaluation and predictive modeling.

The Orange tool has two main building blocks for machine learning and data science work:

  1. Canvas: Provides the UI for data analysis; widgets are placed and connected here.
  2. Widget: A component that performs one task, such as loading data, visualizing it, or training a machine learning model.

Using the Orange Tool

First, we open Orange and create a simple workflow, as shown in the image below, with the File, Data Info, Data Table, Scatter Plot, and Distributions widgets. Once the widgets are on the canvas, we connect each of them to the File widget.

Data Information

To get information about the data loaded in the File widget, connect the File widget to a Data Info widget. Data Info shows the name, description, row count, column count, features, and target values of the data-set loaded in the File widget.

Info for the selected data-set.
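For readers who prefer code, here is a minimal Python sketch of the kind of summary Data Info reports. It is not Orange's own implementation: it uses only the standard library, and `SAMPLE_CSV`, `describe`, and the choice of `iris` as the target column name are assumptions made for illustration. The six rows are a small hand-copied sample of the Iris data-set, not the full file.

```python
import csv
import io

# A few rows copied from the Iris data-set, standing in for whatever the
# File widget has loaded. The last column is assumed to be the target.
SAMPLE_CSV = """sepal length,sepal width,petal length,petal width,iris
5.1,3.5,1.4,0.2,Iris-setosa
4.9,3.0,1.4,0.2,Iris-setosa
7.0,3.2,4.7,1.4,Iris-versicolor
6.4,3.2,4.5,1.5,Iris-versicolor
6.3,3.3,6.0,2.5,Iris-virginica
5.8,2.7,5.1,1.9,Iris-virginica
"""


def describe(csv_text, target):
    """Return row count, column count, feature names and target values."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    columns = list(rows[0].keys())
    return {
        "rows": len(rows),
        "columns": len(columns),
        "features": [c for c in columns if c != target],
        "target": target,
        "target_values": sorted({row[target] for row in rows}),
    }


info = describe(SAMPLE_CSV, target="iris")
for key, value in info.items():
    print(f"{key}: {value}")
```

The function only needs to know which column is the target; every other column is treated as a feature, which mirrors the split Data Info shows.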

Next, select the Data Table widget from the left panel and drag it onto the canvas, then draw a connection from the File widget to the Data Table widget. Open the Data Table to see your data-set in tabular form. In the image below, the highlighted column is the target variable.

The Data Table gives us a first view of the data-set.

Data Distribution

Use the Distributions widget to get a graphical representation of the values in the data-set. Here are the distributions I got for a few features of the data-set.

Distribution around petal width.

You can observe that for a feature like petal width, the data is not clearly separated by target category, but when we switch the selection to sepal width, the data separates into three distinct categories.

Distribution around sepal-width.
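A distribution plot of this kind is essentially a histogram split by class. The sketch below shows that idea in plain Python; it is a rough illustration rather than what the Distributions widget does internally. The names `ROWS` and `binned_counts`, the bin width of 0.5, and the nine-row sample of Iris values are all assumptions chosen for the example, so the counts say nothing about the full data-set.

```python
import math
from collections import Counter, defaultdict

# (sepal width, petal width, class) for nine Iris rows; a small sample only.
ROWS = [
    (3.5, 0.2, "Iris-setosa"),
    (3.0, 0.2, "Iris-setosa"),
    (3.2, 0.2, "Iris-setosa"),
    (3.2, 1.4, "Iris-versicolor"),
    (3.2, 1.5, "Iris-versicolor"),
    (3.1, 1.5, "Iris-versicolor"),
    (3.3, 2.5, "Iris-virginica"),
    (2.7, 1.9, "Iris-virginica"),
    (3.0, 2.1, "Iris-virginica"),
]


def binned_counts(values_with_class, bin_width):
    """Count rows per bin and class, like a stacked histogram."""
    counts = defaultdict(Counter)
    for value, label in values_with_class:
        # Each bin is identified by its lower edge.
        bin_start = math.floor(value / bin_width) * bin_width
        counts[bin_start][label] += 1
    return dict(counts)


petal_width = binned_counts([(pw, label) for _, pw, label in ROWS], 0.5)
sepal_width = binned_counts([(sw, label) for sw, _, label in ROWS], 0.5)

for name, table in (("petal width", petal_width), ("sepal width", sepal_width)):
    print(name)
    for bin_start in sorted(table):
        print(f"  from {bin_start:.1f}: {dict(table[bin_start])}")
```

Bins in which a single class dominates correspond to regions of the plot where one color stands alone; bins shared by several classes correspond to overlapping regions.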

We can also use the Scatter Plot widget to plot different pairs of features. In the image below, the Scatter Plot shows the feature pair petal length and petal width.

Scatter Plot projections.

We can also click Find Informative Projections, which automatically suggests the best projections for the data.

Suggested best projections.
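To get a feel for what "ranking projections" can mean, here is a small Python sketch that scores every pair of features by how well the classes separate in that 2-D view. This is not Orange's own scoring, which this article does not cover; it is a simple stand-in that uses leave-one-out nearest-neighbor accuracy. `FEATURES`, `ROWS`, `loo_accuracy`, and `rank_pairs` are hypothetical names, and the nine rows are only a small sample of the Iris data-set.

```python
from itertools import combinations

FEATURES = ["sepal length", "sepal width", "petal length", "petal width"]

# Nine Iris rows (feature values, class); a small sample only.
ROWS = [
    ((5.1, 3.5, 1.4, 0.2), "Iris-setosa"),
    ((4.9, 3.0, 1.4, 0.2), "Iris-setosa"),
    ((4.7, 3.2, 1.3, 0.2), "Iris-setosa"),
    ((7.0, 3.2, 4.7, 1.4), "Iris-versicolor"),
    ((6.4, 3.2, 4.5, 1.5), "Iris-versicolor"),
    ((6.9, 3.1, 4.9, 1.5), "Iris-versicolor"),
    ((6.3, 3.3, 6.0, 2.5), "Iris-virginica"),
    ((5.8, 2.7, 5.1, 1.9), "Iris-virginica"),
    ((7.1, 3.0, 5.9, 2.1), "Iris-virginica"),
]


def loo_accuracy(points, labels):
    """Fraction of points whose nearest other point has the same label."""
    hits = 0
    for i, (x, y) in enumerate(points):
        nearest = min(
            (j for j in range(len(points)) if j != i),
            key=lambda j: (points[j][0] - x) ** 2 + (points[j][1] - y) ** 2,
        )
        hits += labels[nearest] == labels[i]
    return hits / len(points)


def rank_pairs(rows, features):
    """Score every feature pair and return them best first."""
    labels = [label for _, label in rows]
    scored = []
    for a, b in combinations(range(len(features)), 2):
        points = [(values[a], values[b]) for values, _ in rows]
        scored.append((loo_accuracy(points, labels), features[a], features[b]))
    return sorted(scored, key=lambda item: item[0], reverse=True)


ranking = rank_pairs(ROWS, FEATURES)
for score, first, second in ranking:
    print(f"{score:.2f}  {first} vs {second}")
```

With four features there are six possible pairs, so checking them by hand is feasible here; an automatic ranking becomes more useful as the number of features grows.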

Here we have used the Iris data-set provided with Orange, but you can also load your own data into Orange.

To load your data, open the File widget. From there you can either select one of the data-sets provided with Orange or browse to a data-set file on your local machine.

If you want to load external data, select the URL option in the File widget and paste the link to the external data-set.
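The same two choices, a local file or a URL, can be sketched in plain Python as follows. This is only an illustration of the idea using the standard library, not how the File widget is implemented, and it assumes the data is a comma-separated file with a header row. `load_rows` is a hypothetical helper name.

```python
import csv
import io
import urllib.request


def load_rows(source):
    """Read CSV rows from a local path or an http(s) URL."""
    if source.startswith(("http://", "https://")):
        # Remote data: download the text first.
        with urllib.request.urlopen(source) as response:
            text = response.read().decode("utf-8")
    else:
        # Local data: read the file from disk.
        with open(source, newline="", encoding="utf-8") as handle:
            text = handle.read()
    # Each row becomes a dict keyed by the header names.
    return list(csv.DictReader(io.StringIO(text)))
```

Calling `load_rows` with a path to a CSV file on your machine, or with a link to one, returns a list of dictionaries, one per row, that you can then inspect or summarize.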
