Data Science Series Lab-3: Getting Started with Orange Tool

Rushi Patel
3 min read · Sep 11, 2021


Orange Tool (credit)

What is Orange Tool?

Orange is an open-source toolset for data visualization, machine learning, and data mining. It offers a visual programming interface for exploratory data analysis and interactive data visualization, and it can also be used as a Python library. Orange is a component-based software suite that specializes in machine learning and data mining, with a focus on visualization. Its components, known as widgets, cover everything from visualization and pre-processing to evaluation and predictive modelling.

Orange supports interactive data exploration for rapid qualitative analysis with clean visualizations. The graphical user interface lets you focus on exploratory data analysis instead of coding, while sensible defaults make fast prototyping of a data analysis workflow easy. Place widgets on the canvas, connect them, load your datasets, and harvest the insight!

You can do a lot with this tool: display a dataset in tabular form or in graphical form (e.g. scatter plot, distribution, bar chart, and so on) and apply different operations to it with simple drag and drop. Furthermore, you can not only analyse your data and filter out unnecessary rows or columns, but also compare and contrast learning algorithms and predictors.

How to Use Workflows in Orange?

Orange workflows are made up of components that read, analyse, and visualize data. These components, the widgets, are the building blocks of a data analysis workflow and are assembled in Orange's visual programming environment. Widgets communicate by sending data over a communication channel: the output of one widget is used as the input for another. Connecting widgets in this way is what creates a workflow.
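To make the idea of widgets passing data along channels more concrete, here is a minimal plain-Python sketch. It does not use Orange itself: the `Widget` class, its `connect` and `receive` methods, and the "Count" step are hypothetical stand-ins, and the three rows are just illustrative values typed in by hand.

```python
from typing import Any, Callable, List


class Widget:
    """Hypothetical stand-in for an Orange widget: takes data in, sends a result out."""

    def __init__(self, name: str, process: Callable[[Any], Any]) -> None:
        self.name = name
        self.process = process
        self.outputs: List["Widget"] = []  # widgets connected downstream
        self.result: Any = None

    def connect(self, other: "Widget") -> "Widget":
        """Create a channel to `other`; returns `other` so connections can be chained."""
        self.outputs.append(other)
        return other

    def receive(self, data: Any) -> None:
        """Process incoming data, keep the result, and forward it along every channel."""
        self.result = self.process(data)
        for widget in self.outputs:
            widget.receive(self.result)


# Illustrative rows only (sepal length, sepal width, label), not the full dataset.
ROWS = [(5.1, 3.5, "setosa"), (7.0, 3.2, "versicolor"), (6.3, 3.3, "virginica")]

file_widget = Widget("File", lambda _: ROWS)  # source: "loads" the rows
select_rows = Widget("Select Rows", lambda rows: [r for r in rows if r[0] > 6.0])
count = Widget("Count", len)  # hypothetical final step

# File -> Select Rows -> Count
file_widget.connect(select_rows).connect(count)
file_widget.receive(None)  # trigger the workflow from its source

print(count.result)
```

Each `connect` call plays the role of the line you drag between two widgets on the canvas: once the source widget produces data, every connected widget downstream receives it in turn.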

So, let’s create a workflow for learning purposes. You can either use one of the built-in datasets that ship with Orange, or import a dataset from your local machine or from a URL.

1. Start Orange. In the left panel, the File widget sits under the Data section. Drag ‘File’ onto the canvas on the right. Double-clicking it opens the File widget window.
Fig-1: All widgets

With the File widget you can pick a built-in dataset or import one externally. For this demo I have used the Iris dataset.

Fig-2: File widget Functionality

2. Creating Flow

After importing the dataset with the File widget, we can create a flow between the File widget and the Data Info widget by dragging a line between them, which creates a channel between the two.

The Data Info widget provides information about the data that has been loaded: the dataset’s name, size, features, targets, description, row count, column count, and other data characteristics.

Fig-3: Workflow between File and Data-Info
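As a rough idea of the kind of summary Data Info reports, the sketch below computes a row count, column count, feature list, and target values for a small table using only Python's standard library. This is not how Orange implements the widget: the `summarize` function, the column names, and the four hand-typed rows are assumptions made for illustration.

```python
import csv
import io

# Hand-typed sample for illustration; column names here are assumed, not Orange's own.
SAMPLE_CSV = """sepal_length,sepal_width,petal_length,petal_width,species
5.1,3.5,1.4,0.2,setosa
4.9,3.0,1.4,0.2,setosa
7.0,3.2,4.7,1.4,versicolor
6.3,3.3,6.0,2.5,virginica
"""


def summarize(csv_text: str, target: str) -> dict:
    """Return row count, column count, feature names, and distinct target values."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    columns = list(rows[0].keys()) if rows else []
    features = [c for c in columns if c != target]  # everything except the target
    return {
        "rows": len(rows),
        "columns": len(columns),
        "features": features,
        "target": target,
        "target_values": sorted({row[target] for row in rows}),
    }


info = summarize(SAMPLE_CSV, target="species")
for key, value in info.items():
    print(f"{key}: {value}")
```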

3. Graphical Representation

Similarly, we can create flows between any other widgets we need. Next, I’m using the Scatter Plot widget to show a graphical representation of the Iris dataset. As shown below, it can also draw a regression line, which you enable by selecting it in the Scatter Plot panel.

Fig-4: Scatter plot
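For readers curious about what a regression line through a scatter plot represents, here is a small sketch that fits a straight line with ordinary least squares in plain Python. Whether Orange's scatter plot uses exactly this method is an assumption on my part; `fit_line` is a hypothetical helper, and the points are a few illustrative petal measurements typed in by hand.

```python
from typing import Sequence, Tuple


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two points and equally long inputs")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Spread of x around its mean, and how x and y vary together.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Illustrative (petal length, petal width) pairs, not the full dataset.
petal_length = [1.4, 1.3, 4.7, 4.5, 6.0, 5.1]
petal_width = [0.2, 0.2, 1.4, 1.5, 2.5, 1.9]

slope, intercept = fit_line(petal_length, petal_width)
print(f"petal_width ~ {slope:.3f} * petal_length + {intercept:.3f}")
```

A positive slope here means that, in these sample points, wider petals tend to go with longer petals, which is the trend the line in the scatter plot makes visible.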

In the same way, we can use other widgets for other tasks as required, such as dataset summaries, tabular views of the data, distributions, and many more. Practical use of and information regarding such widgets will be covered in the upcoming posts of this blog series. So this was a basic overview of the Orange tool.

See you in the next blog.

Keep reading ;))
