Data Science: Getting Started With Orange Tool

Yash Alpeshbhai Patel
4 min read · Sep 8, 2021

Orange Data mining Tool (Credit)

Orange Tool

Orange is an open-source toolkit for data visualisation, machine learning, and data mining. It comes with a visual programming front-end for exploratory data analysis and interactive data visualisation, and it can also be used as a Python library. Orange is a component-based software suite: its components, called widgets, cover everything from visualisation and pre-processing to evaluation and predictive modelling.

You can do the following with Orange:

  1. Display a data table and pick features.
  2. Analyse the data.
  3. Compare learning algorithms and predictors.
  4. Visualise data items.

How to use workflows in Orange?

Orange workflows are made up of components (widgets) that read, analyse, and visualise data. Widgets are the building blocks of a data analysis workflow and are assembled in Orange’s visual programming environment. They are grouped into categories based on their purpose. A typical workflow may mix widgets for data input and filtering, visualisation, and predictive data mining.
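To make the idea of data flowing between widgets more concrete, here is a minimal plain-Python sketch. It is only a toy model and not Orange's API: the function names (file_widget, select_rows_widget, data_table_widget, run_workflow) are made up for illustration, and the three rows are a tiny hand-written sample in the shape of the iris data.

```python
# Toy model of a workflow: each "widget" is a function that receives data
# on its input and passes a result on to the next widget.
# All names here are illustrative assumptions, not part of Orange.

def file_widget():
    """Data input: return a small table as a list of row dictionaries."""
    return [
        {"petal length": 1.4, "petal width": 0.2, "iris": "Iris-setosa"},
        {"petal length": 4.7, "petal width": 1.4, "iris": "Iris-versicolor"},
        {"petal length": 6.0, "petal width": 2.5, "iris": "Iris-virginica"},
    ]

def select_rows_widget(rows, min_petal_length):
    """Filtering: keep rows whose petal length is at least the threshold."""
    return [row for row in rows if row["petal length"] >= min_petal_length]

def data_table_widget(rows):
    """Display: format each row as one line of text."""
    return [" | ".join(str(value) for value in row.values()) for row in rows]

def run_workflow(source, *steps):
    """Send the source's output through each step, like links on a canvas."""
    data = source()
    for step in steps:
        data = step(data)
    return data

lines = run_workflow(
    file_widget,
    lambda rows: select_rows_widget(rows, 4.0),
    data_table_widget,
)
for line in lines:
    print(line)
```

Each function only knows about its own input and output, which is roughly why widgets can be rearranged freely on the canvas.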

Let’s begin by developing a simple workflow for any dataset. You can either use one of Orange’s built-in datasets or import one of your own.

Step-1: From the widget library, select the File widget.

Widgets Library

I used the built-in iris dataset in this case. Loading the data is the first step of this simple workflow.

File Widget

Step-2: After loading the dataset with the File widget, drag the Data Info widget onto the canvas, then drag a line from the File widget to Data Info. This link creates a channel that passes the data from one widget to the other.

Flow between two widgets

The Data Info widget shows information about the data that has been loaded: the dataset’s name, size (row and column counts), features, targets, and description.

Data Info
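As a rough idea of what such a summary involves, the sketch below computes a few similar facts in plain Python. It is not how Orange implements Data Info. The describe and is_number helpers are hypothetical, and the table is a three-row hand-written sample with the same five columns as iris.

```python
# Plain-Python sketch of a Data Info style summary (not Orange's API).
# The header and rows are a tiny hand-written sample in the shape of iris.
header = ["sepal length", "sepal width", "petal length", "petal width", "iris"]
rows = [
    ["5.1", "3.5", "1.4", "0.2", "Iris-setosa"],
    ["7.0", "3.2", "4.7", "1.4", "Iris-versicolor"],
    ["6.3", "3.3", "6.0", "2.5", "Iris-virginica"],
]

def is_number(text):
    """Return True if the text can be parsed as a float."""
    try:
        float(text)
        return True
    except ValueError:
        return False

def describe(header, rows, target):
    """Summarise table size, feature types, and the target column."""
    features = [name for name in header if name != target]
    # A feature counts as numeric only if every value in its column parses.
    numeric = [
        name for name in features
        if all(is_number(row[header.index(name)]) for row in rows)
    ]
    target_index = header.index(target)
    return {
        "rows": len(rows),
        "columns": len(header),
        "numeric features": len(numeric),
        "categorical features": len(features) - len(numeric),
        "target": target,
        "target values": sorted({row[target_index] for row in rows}),
    }

info = describe(header, rows, target="iris")
for key, value in info.items():
    print(f"{key}: {value}")
```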

In the same way, we then create links for File-Data Table, File-Distributions, and File-Scatter Plot. This is how a simple workflow is built in Orange.

Workflow

We can also use the Select Columns and Select Rows widgets to filter out unnecessary columns and rows.

Column filter

Select Column

Select Columns

Here sepal length and sepal width are ignored, so the dataset now contains two features and one target variable instead of the five columns shown in the Data Info figure above.

After Column filter, Data Info
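The same column filter can be sketched in a few lines of plain Python. This only illustrates the idea and is not Orange's API: select_columns is a hypothetical helper, and the two rows are a hand-written sample with the five iris columns.

```python
# Plain-Python sketch of ignoring columns (not Orange's API).
rows = [
    {"sepal length": 5.1, "sepal width": 3.5,
     "petal length": 1.4, "petal width": 0.2, "iris": "Iris-setosa"},
    {"sepal length": 7.0, "sepal width": 3.2,
     "petal length": 4.7, "petal width": 1.4, "iris": "Iris-versicolor"},
]

# Columns to leave out, as in the Select Columns step above.
ignored = {"sepal length", "sepal width"}

def select_columns(rows, ignored):
    """Return new rows that keep every column not listed in `ignored`."""
    return [
        {name: value for name, value in row.items() if name not in ignored}
        for row in rows
    ]

filtered = select_columns(rows, ignored)
print("before:", len(rows[0]), "columns")
print("after: ", len(filtered[0]), "columns:", list(filtered[0]))
```

The original rows are left untouched, so the unfiltered data can still feed other steps.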

Scatter Plot Widget

The Scatter Plot widget plots one feature against another. We can change which features are shown on the X-axis and Y-axis, ask the widget to find informative projections directly, and also show colour regions in the plot.

Scatter Plot

Data Table widget

Use the Data Table widget to view your data in tabular format.

Data Table

Distributions Widget

To get a graphical representation of the dataset’s values, use the Distributions widget. It makes it easy to view the distribution of each feature in the dataset. The image below shows the distribution of petal width split by iris. Similarly, you can use different combinations of features and target variables to visualise your datasets.
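To show what "distribution split by a target" means, here is a small plain-Python sketch that bins petal width values and counts them per class. It is an illustration under assumptions, not Orange's implementation: the nine values are a hand-picked sample, and the bin width of 0.5 is an arbitrary choice.

```python
import math
from collections import Counter, defaultdict

# (petal width, class) pairs: a small hand-picked sample, not the full data.
samples = [
    (0.2, "Iris-setosa"), (0.2, "Iris-setosa"), (0.4, "Iris-setosa"),
    (1.4, "Iris-versicolor"), (1.5, "Iris-versicolor"), (1.3, "Iris-versicolor"),
    (2.5, "Iris-virginica"), (1.9, "Iris-virginica"), (2.1, "Iris-virginica"),
]

bin_width = 0.5  # assumed bin size for this sketch

# histogram[class][bin index] = number of values falling in that bin
histogram = defaultdict(Counter)
for width, label in samples:
    bin_index = math.floor(width / bin_width)
    histogram[label][bin_index] += 1

# Print one text bar per (class, bin) pair.
for label, counts in histogram.items():
    print(label)
    for bin_index in sorted(counts):
        low = bin_index * bin_width
        high = low + bin_width
        print(f"  [{low:.1f}, {high:.1f}): {'#' * counts[bin_index]}")
```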

How to load your own data or external data from an API in Orange?

To load tabular data from a URL or from the local machine, we can use the File widget itself. As the image below shows, it has two options, File and URL. To load data from the local machine, choose File and click the folder icon to its right to browse for the file. To load data from a URL, choose URL.

Another option is to use the CSV File Import widget to import a CSV file.

File widget (URL or Local)
CSV import
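For readers who want to see the "local file or URL" choice in code, the sketch below reads a CSV from either source using only the Python standard library. It is not what the File widget does internally. load_table is a hypothetical helper, and the demo writes its own small temporary CSV so it runs without a network connection.

```python
import csv
import io
import os
import tempfile
import urllib.request

def load_table(source):
    """Return (header, rows) from a CSV at a local path or an http(s) URL."""
    if source.startswith(("http://", "https://")):
        # Remote source: download the text first.
        with urllib.request.urlopen(source) as response:
            text = response.read().decode("utf-8")
    else:
        # Local source: read the file from disk.
        with open(source, encoding="utf-8", newline="") as handle:
            text = handle.read()
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    return header, list(reader)

# Demo with a temporary local file (sample content made up for this sketch).
with tempfile.NamedTemporaryFile(
    "w", suffix=".csv", delete=False, newline="", encoding="utf-8"
) as tmp:
    tmp.write("petal length,petal width,iris\n")
    tmp.write("1.4,0.2,Iris-setosa\n")
    tmp.write("4.7,1.4,Iris-versicolor\n")
    path = tmp.name

header, rows = load_table(path)
os.remove(path)  # clean up the temporary file
print(header)
print(len(rows), "rows loaded")
```

Passing an http or https address instead of a path takes the URL branch, much like switching from File to URL in the widget.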

So that’s it! This is the basic understanding of the Orange tool.
