Data Mining Fundamentals

Harshit Jain
Published in Analytics Vidhya
4 min read, Jul 8, 2020

Data Visualization
We are supposed to be doing data mining, so why start with data visualization? Because this series begins from the very basics: we need to know what mining is and why data visualization matters for it.

Data mining is a process of discovering patterns in large data sets involving methods at the intersection of machine learning, statistics, and database systems.
— Wikipedia

Data Visualization is a process of converting numerical data into graphical images like meaningful 3D pictures which will be used to analyze complex data easily.
educba.com

Data Visualization tells us how to create visualizations that effectively communicate the meaning behind data to an observer through visual perception.
We will learn how a computer displays information using computer graphics, and how the human perceives that information visually. We will also study the forms of data, including quantitative and non-quantitative data, and how they are properly mapped to the elements of a visualization to be perceived well by the observer. We will briefly overview some design elements for effective visualization, though we will not focus on the visual design needed to make attractive and artistic visualizations.

There are two modes of visualization:
1. Interactive Visualization — Used for discovery and renders based on user input
2. Presentation Visualization — Used for communication, does not support input

Vector Graphics vs Raster Graphics
Vector graphics describe shapes (lines, curves, polygons), while raster graphics display those shapes as a grid of pixels.

Try SVG (Scalable Vector Graphics) here: https://www.w3schools.com/graphics/svg_intro.asp
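To make "describing a shape" concrete, here is a minimal Python sketch that builds an SVG document for a single circle. The function name `circle_svg`, the canvas size and the fill color are assumptions chosen for illustration; only the `<svg>` and `<circle>` elements come from the SVG format itself.

```python
def circle_svg(cx, cy, r, size=100):
    """Return an SVG document string that describes one circle as a shape."""
    # The circle is stored as a center and a radius, not as pixels,
    # so it can be redrawn sharply at any output size.
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" width="{size}" height="{size}">'
        f'<circle cx="{cx}" cy="{cy}" r="{r}" fill="steelblue"/>'
        f'</svg>'
    )

svg_text = circle_svg(50, 50, 40)
print(svg_text)
```

Saving the printed text to a file with an `.svg` extension and opening it in a browser should show the circle.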

vector (on left) and raster (on right)

Rasterisation (or rasterization) is the task of taking an image described in a vector graphics format (shapes) and converting it into a raster image (a series of pixels, dots or lines, which, when displayed together, create the image which was represented via shapes).
— Wikipedia

Aliasing: in computer graphics, the process by which smooth curves and other lines become jagged because the resolution of the graphics device or file is not high enough to represent a smooth curve.
— Wikipedia

Rasterization and aliasing
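A rough sketch of both ideas in Python: a circle (the vector shape) is converted into a small grid of pixel values. With one sample per pixel every pixel is fully on or fully off, which gives the jagged, aliased edge. Averaging several sub-pixel samples (supersampling, one common anti-aliasing approach picked here as an illustration) softens the edge. The function name and the grid size are assumptions for this example.

```python
def rasterize_circle(cx, cy, r, width, height, samples=1):
    """Convert a circle into a grid of pixel coverage values in [0, 1].

    samples=1 tests only each pixel's center (hard, jagged edge).
    samples>1 averages a samples x samples grid of sub-pixel tests.
    """
    grid = []
    for y in range(height):
        row = []
        for x in range(width):
            inside = 0
            for sy in range(samples):
                for sx in range(samples):
                    # Position of this sub-sample inside pixel (x, y)
                    px = x + (sx + 0.5) / samples
                    py = y + (sy + 0.5) / samples
                    if (px - cx) ** 2 + (py - cy) ** 2 <= r ** 2:
                        inside += 1
            row.append(inside / samples ** 2)
        grid.append(row)
    return grid

aliased = rasterize_circle(4, 4, 3, 8, 8, samples=1)
smoothed = rasterize_circle(4, 4, 3, 8, 8, samples=4)

for row in aliased:
    print("".join("#" if v > 0 else "." for v in row))
```

In `aliased` every value is 0.0 or 1.0, while `smoothed` contains in-between values along the edge of the circle.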

Photorealism is a genre of art that encompasses painting, drawing and other graphic media, in which an artist studies a photograph and then attempts to reproduce the image as realistically as possible in another medium.
— Wikipedia

That is the general definition. In our context, what matters is that photorealistic effects are added to a 2D image to give our perceptual system visual cues about the 3D spatial configuration of the shapes displayed on a 2D screen or surface.

Occlusion: One of the strongest cues. An opaque object that covers part of another object appears to be in front of it.

occlusion example
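One simple way to produce occlusion when drawing is to paint opaque shapes from farthest to nearest, so nearer shapes overwrite farther ones (often called the painter's algorithm). The sketch below is an illustration under that assumption; the text-based canvas and the rectangle format are made up for this example.

```python
def paint(width, height, rects):
    """Draw opaque rectangles back to front on a character canvas.

    Each rect is (label, x0, y0, x1, y1, depth); larger depth means farther away.
    """
    canvas = [["." for _ in range(width)] for _ in range(height)]
    # Sort farthest first so that nearer rectangles are painted last
    for label, x0, y0, x1, y1, _depth in sorted(rects, key=lambda r: r[5], reverse=True):
        for y in range(y0, y1):
            for x in range(x0, x1):
                canvas[y][x] = label
    return ["".join(row) for row in canvas]

rows = paint(6, 4, [("A", 0, 0, 4, 3, 2.0), ("B", 2, 1, 6, 4, 1.0)])
for row in rows:
    print(row)
```

Rectangle B hides part of A, which is the cue that B sits in front.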

Illumination helps us perceive the orientation of a surface.
It is of two types:
Diffuse illumination: the surface is brightest when it faces a light source.
Specular illumination: the surface is brightest when it reflects a light source toward the viewer.
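The two types can be sketched with two standard shading terms. The article does not name a lighting model, so the Lambertian diffuse term and the Phong specular term used below are assumptions chosen for illustration, as is the `shininess` parameter.

```python
import math

def normalize(v):
    """Scale a 3D vector to unit length."""
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def diffuse(normal, to_light):
    """Lambertian term: largest when the surface normal points at the light."""
    return max(0.0, dot(normalize(normal), normalize(to_light)))

def specular(normal, to_light, to_viewer, shininess=32):
    """Phong term: largest when the mirrored light direction points at the viewer."""
    n, l, v = normalize(normal), normalize(to_light), normalize(to_viewer)
    n_dot_l = dot(n, l)
    if n_dot_l <= 0:
        return 0.0  # the light is behind the surface
    # Mirror the light direction about the surface normal
    reflected = tuple(2 * n_dot_l * nc - lc for nc, lc in zip(n, l))
    return max(0.0, dot(reflected, v)) ** shininess

facing = diffuse((0, 0, 1), (0, 0, 1))
grazing = diffuse((0, 0, 1), (1, 0, 0))
highlight = specular((0, 0, 1), (1, 0, 1), (-1, 0, 1))
off_highlight = specular((0, 0, 1), (1, 0, 1), (1, 0, 1))
```

The diffuse term depends only on the surface and the light, while the specular term also depends on where the viewer is.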

Shadowing: Shadows indicate light occlusion and cue the perceptual system to the positions of objects relative to each other.

shadowing example

Perspective: Size constancy cues depth. Objects of the same size look smaller the farther away they are.
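Under a simple pinhole camera model (an assumption, since the article does not specify a projection), the size of an object on screen is its real size divided by its distance, scaled by the focal length. A minimal sketch with a hypothetical helper name:

```python
def projected_size(real_size, distance, focal_length=1.0):
    """On-screen size of an object under a pinhole projection."""
    return focal_length * real_size / distance

near = projected_size(2.0, 4.0)   # object 4 units away
far = projected_size(2.0, 8.0)    # same object, twice as far
print(near, far)
```

Doubling the distance halves the projected size, which is the depth cue described above.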

Motion parallax: As the viewer moves, farther objects appear to move more slowly than nearer ones, which makes the view look more realistic.
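The effect can be checked numerically with the same kind of pinhole projection (again an assumption for illustration): move the camera sideways and compare how far a near point and a far point shift on screen. The helper names are hypothetical.

```python
def project_x(point_x, depth, camera_x, focal_length=1.0):
    """Horizontal screen position of a point seen from a camera at camera_x."""
    return focal_length * (point_x - camera_x) / depth

def parallax_shift(depth, camera_move, focal_length=1.0):
    """How far a point straight ahead shifts on screen when the camera moves sideways."""
    before = project_x(0.0, depth, 0.0, focal_length)
    after = project_x(0.0, depth, camera_move, focal_length)
    return after - before

near_shift = parallax_shift(depth=2.0, camera_move=1.0)
far_shift = parallax_shift(depth=10.0, camera_move=1.0)
print(near_shift, far_shift)
```

Both points shift in the direction opposite to the camera motion, but the near point shifts by a larger amount.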

Stereopsis: The scene is rendered from two different viewpoints (one for each eye). The two views should be parallel, not rotated toward each other.
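With two parallel viewpoints, the depth cue comes from disparity: the same point lands at slightly different horizontal positions in the left and right images, and the difference shrinks with distance. The sketch below assumes a pinhole projection and an eye separation (`baseline`) of 0.065, roughly a human interpupillary distance in meters; both are illustrative assumptions.

```python
def disparity(depth, baseline=0.065, focal_length=1.0):
    """Left-minus-right screen position of a point straight ahead, for parallel views."""
    # Left eye sits at -baseline/2, right eye at +baseline/2, both looking forward
    left = focal_length * (0.0 + baseline / 2) / depth
    right = focal_length * (0.0 - baseline / 2) / depth
    return left - right

def depth_from_disparity(d, baseline=0.065, focal_length=1.0):
    """Invert the relation above to recover depth from a measured disparity."""
    return focal_length * baseline / d

close = disparity(1.0)
distant = disparity(5.0)
recovered = depth_from_disparity(disparity(2.0))
```

Keeping the two views parallel is what makes the relation between disparity and depth this simple.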

stereopsis example

Non-photorealism
When we use 3D graphics for data visualization, we often want the result to look like an illustration rather than a photograph. For this we use non-photorealistic rendering.

Example of non-photorealistic rendering

It’s not a photograph but an illustration of the heart from Gray’s Anatomy, which focuses on conveying more detail through perception instead of showing a realistic effect.

Image on the left is with photorealistic rendering, and on right is with non-photorealistic rendering

The image on the left clearly focuses on photorealistic rendering, i.e. the proper physics of lighting and so on, while the image on the right focuses more on the psychology of perception.

I’ll keep adding more to this story; it’s just the beginning of our journey.
For image processing, you can refer to Fundamentals of Image processing, and for the Data Analysis Series, refer to the weekly stories here.
