All you need to know to get started with the Orange tool

Kamlesh Solanki
Published in Analytics Vidhya
Sep 20, 2021

Orange is open-source, component-based software written in Python for machine learning and data mining, with a strong focus on visualization. Its components are called widgets, and they range from visualization to pre-processing, evaluation, and predictive modeling.

You do not need to know programming to get started with Orange. To do machine learning with it, all you need is an understanding of some machine learning basics.

By the end of this article, you will see how useful Orange is and how quickly we can do machine learning with it.

Without wasting any time, let’s get started.

Installation

Orange is open-source software and is available for all major platforms: Windows, Linux, and macOS.

You can download the installer for your platform from the official website.

You can also install it through conda or pip.

For conda installation

conda config --add channels conda-forge
conda install orange3
conda install -c defaults pyqt=5 qt

For pip installation

pip install orange3
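After installing, it can be handy to confirm from Python that the package is visible in the current environment. The following is a minimal sketch that uses only the standard library (Python 3.8 or newer). The helper name `installed_version` is my own, and it assumes the package metadata is registered under the name `orange3`, as in the install commands above.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version of a package, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# "orange3" is the package name used in the install commands above.
version = installed_version("orange3")
if version is None:
    print("Orange is not installed in this environment")
else:
    print("Orange version:", version)
```

If this reports that Orange is missing, check that you are running Python from the same environment you installed into.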

You can now start the Orange GUI in two ways:

1. Run the following command in a console:

orange-canvas

2. Run it as a Python module:

python3 -m Orange.canvas

Now we’re ready to get started with Orange.

Overview

When you first start Orange, a blank canvas is shown.

Orange Blank Canvas

The white region is our playground, where we’ll perform all our tasks.

Orange Widgets

Everything we do in Orange is done through its widgets.

There is a widget for everything, from loading data to saving data and making predictions. We’ll see some of them in this article.

How to add widgets to the canvas?

There are two ways to add widgets to the canvas.

1. From the Widgets panel on the far left
Widgets

Simply double-click any widget and it will appear on the canvas.

2. Right-click on the canvas and a list of widgets will be shown.

Every widget is well documented; you can find the details in the official Orange documentation.

Loading Dataset

You can load a comma-separated (.csv) file, an Excel spreadsheet, or a Google Sheets document in Orange. Use the File widget to load the file.

File Widget

You can browse to the file location and load the file into Orange. To load a CSV file, use the CSV File Import widget.

For now, let’s load the built-in iris dataset.

Loading iris dataset

Workflow

Once the data is loaded, you can connect this widget to the Data Table widget to see the data in tabular format.

There are two ways to connect widgets.

  1. You can first add the widget and then link the two.
  2. You can drag a link out from the widget. A search box will then appear, from which you can pick any widget that can be connected to the current one.

I prefer the second option.

Connection between widgets

Now double-click the Data Table widget and you’ll see the data in tabular form.

Data table

Now we can look at the data distribution using the Distributions widget. Just connect the Distributions widget to the File widget and you’ll see the magic.

Adding distribution widget

After adding the Distributions widget, just double-click it and you’ll see the distribution of the features.

Distribution

The image shows the distribution of sepal length. You can select any other feature from the list on the top-left side to see its distribution.

You can also plot the data with a scatter plot to see how it is distributed.

Just connect the Scatter Plot widget to the File widget.

Adding scatter plot

Now open the Scatter Plot widget and look at the distribution.

Scatter plot

If you want to inspect the samples in a certain region, select that region with the mouse and connect a Data Table widget to the Scatter Plot widget. The table will show all the selected samples.

Select some points

Connect the Data Table widget.

Connect data table with scatter plot

Now open the Data Table widget.

Selected points

Here the Scatter Plot widget sends the selected data points to the Data Table widget, which displays them.

So in Orange, every widget communicates with its connected widgets in real time and passes data along to them.
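To make the idea of widgets passing data along their links more concrete, here is a toy model in plain Python. It is purely conceptual: the `Widget` class, its methods, and the sample points are invented for illustration and are not how Orange implements its widgets.

```python
class Widget:
    """A toy stand-in for a canvas widget (not Orange's real class)."""

    def __init__(self, name):
        self.name = name
        self.received = None
        self.downstream = []

    def connect(self, other):
        """Draw a link from this widget's output to another widget's input."""
        self.downstream.append(other)

    def send(self, data):
        """Push data to every connected widget right away."""
        for widget in self.downstream:
            widget.receive(data)

    def receive(self, data):
        """Store whatever the upstream widget sent."""
        self.received = data


scatter_plot = Widget("Scatter Plot")
data_table = Widget("Data Table")
scatter_plot.connect(data_table)

# Selecting points in the plot sends only those points downstream.
selected_points = [(5.1, 3.5), (4.9, 3.0)]
scatter_plot.send(selected_points)
print(data_table.name, "received:", data_table.received)
```

Changing the selection and calling `send` again would immediately replace what the table holds, which mirrors what you see on the canvas.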

Orange also offers other plots, such as box plots, bar plots, and many more.

Building Model

Now let’s do some machine learning with Orange, i.e. build a classifier.

I am connecting the Tree widget to the File widget to build a tree classifier, and then connecting the Tree Viewer widget to see the classification tree.

Adding Tree classifier

Now let’s look at the classification tree. Double-click the Tree Viewer widget and the classification tree will be visible.

Classification tree

Model Evaluation

Now that we’ve seen what the classification tree looks like for the iris dataset, we want to evaluate our model. To do that, connect the File widget to the Test and Score widget, and connect the Tree widget to Test and Score as well, to see the various metrics.

Adding Tree Widget to test and score

Now let us see the score.

score

Note that here I’ve selected cross-validation with 10 folds. You can experiment with other numbers of folds, or validate on a separate test set. Here you can see various metrics such as AUC, classification accuracy, F1, precision, and recall.
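If these metric names are new to you, the sketch below shows how classification accuracy, precision, recall, and F1 are computed from a list of true labels and a list of predicted labels. The labels are made up for illustration, the function names are my own, and precision, recall, and F1 are computed here for one chosen class only.

```python
def accuracy(actual, predicted):
    """Fraction of instances whose predicted label matches the true label."""
    correct = sum(a == p for a, p in zip(actual, predicted))
    return correct / len(actual)


def precision_recall_f1(actual, predicted, target):
    """Precision, recall and F1 when `target` is treated as the positive class."""
    pairs = list(zip(actual, predicted))
    tp = sum(a == target and p == target for a, p in pairs)
    fp = sum(a != target and p == target for a, p in pairs)
    fn = sum(a == target and p != target for a, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Made-up labels for six flowers, purely for illustration.
actual = ["setosa", "setosa", "versicolor", "versicolor", "virginica", "virginica"]
predicted = ["setosa", "setosa", "versicolor", "virginica", "virginica", "virginica"]

print("Accuracy:", accuracy(actual, predicted))
precision, recall, f1 = precision_recall_f1(actual, predicted, "virginica")
print("Precision:", precision, "Recall:", recall, "F1:", f1)
```

In this toy data one versicolor flower is mistaken for virginica, so precision for virginica drops while its recall stays perfect.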

We can also compare other models with it by simply adding another classifier. Let me show you by adding a simple logistic regression.

Now let’s see the comparison of both models.

From the results, logistic regression performs better than the tree classifier.

Making Predictions

As data scientists, we must know how to make predictions on new data, so let’s see how to do that.

Let’s make a simple test set with 5 instances.

Test set
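One way to create such a file is with a few lines of Python. This is a sketch under stated assumptions: the file name `iris_test.csv` and the measurements are made up for illustration, and the column names are assumed to match the feature names of Orange’s built-in iris dataset so that the trained model can find its inputs.

```python
import csv
from pathlib import Path

# Column names assumed to match the features of the iris dataset.
header = ["sepal length", "sepal width", "petal length", "petal width"]

# Five made-up flowers; the values are illustrative, not real measurements.
rows = [
    [5.0, 3.4, 1.5, 0.2],
    [6.1, 2.8, 4.7, 1.2],
    [6.9, 3.1, 5.4, 2.1],
    [5.7, 2.8, 4.1, 1.3],
    [4.8, 3.0, 1.4, 0.3],
]

path = Path("iris_test.csv")  # hypothetical file name
with path.open("w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(header)
    writer.writerows(rows)

print(f"Wrote {len(rows)} instances to {path}")
```

The file has no class column, since the class is what we want the model to predict.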

Now let’s load this file through the File widget.

Next, connect the Predictions widget to the Logistic Regression widget, and also connect it to the File widget that holds the test set.

Prediction

Now open the Predictions widget.

Predictions from logistic regression

You can also see the predicted probabilities, and in this case they are indeed high for the predicted classes.

So the predictions are correct.

You can also connect multiple models to the Predictions widget.

Save Predictions

You can also save the predictions with the Save Data widget. Just connect the Predictions widget to the Save Data widget and select the file format you want, e.g. .csv or Excel.

Predictions

The predictions file also shows the probabilities of each prediction.

This is the entire cycle of a machine learning project in Orange.

This is just a simple introduction to getting started with Orange; there is a lot more we can do with it.

You can watch the official YouTube tutorials to learn more about Orange.

I hope this article helps you get started with the awesome Orange tool and adds a skill to your data science toolkit.

Thanks for reading this article, and keep learning. 😃
