Published in Analytics Vidhya

# An Insight to Data Mining Algorithms

One of the most instructive lessons is that simple ideas often work very well, and I strongly recommend the adoption of a simplicity-first methodology when analyzing practical datasets.

There are many different kinds of simple structure that datasets can exhibit.

In one dataset, there might be a single attribute that does all the work and the others may be irrelevant or redundant.

# Inferring rudimentary rules

In any event, it is always a good plan to try the simplest things first.

The idea behind this method, called 1R (for "1-rule"), is this: we make rules that test a single attribute and branch accordingly.

Each branch corresponds to a different value of the attribute.

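The best classification to give each branch is the obvious one: use the class that occurs most often in the training data. We then count the errors that each attribute's rules make on the training data and keep the attribute with the fewest.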
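
The procedure described above can be sketched in a few lines of Python. This is only a minimal illustration, assuming the data is a list of dictionaries with categorical values; the function name `one_r` and the tiny dataset are made up for the example and are not taken from any library.

```python
from collections import Counter, defaultdict

def one_r(rows, attributes, target):
    """Return (best_attribute, rule, errors) for a list of dict rows."""
    best = None
    for attr in attributes:
        # Count how often each class occurs for each value of this attribute.
        counts = defaultdict(Counter)
        for row in rows:
            counts[row[attr]][row[target]] += 1
        # Each branch predicts the class seen most often for its value.
        rule = {value: c.most_common(1)[0][0] for value, c in counts.items()}
        # Errors are the training rows that the majority class gets wrong.
        errors = sum(
            sum(c.values()) - c.most_common(1)[0][1] for c in counts.values()
        )
        # Keep the attribute whose rules make the fewest errors.
        if best is None or errors < best[2]:
            best = (attr, rule, errors)
    return best

# Hypothetical toy data, only for illustration.
rows = [
    {"outlook": "sunny", "windy": "false", "play": "no"},
    {"outlook": "sunny", "windy": "true", "play": "no"},
    {"outlook": "overcast", "windy": "false", "play": "yes"},
    {"outlook": "rainy", "windy": "false", "play": "yes"},
    {"outlook": "rainy", "windy": "true", "play": "no"},
    {"outlook": "overcast", "windy": "true", "play": "yes"},
]
best_attr, best_rule, best_errors = one_r(rows, ["outlook", "windy"], "play")
print(best_attr, best_rule, best_errors)
```

On this toy data the rules built on `outlook` make one training error and those built on `windy` make two, so `outlook` is selected.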

# Missing values and numeric attributes

Although a very rudimentary learning method, 1R does accommodate both missing values and numeric attributes.

It deals with these in simple but effective ways.

A missing value is treated as just another attribute value.

For example, if the weather data had contained missing values for the outlook attribute, a rule set formed on outlook would specify four class values: one each for sunny, overcast, and rainy, and a fourth for missing.
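
As a rough sketch of this treatment, the snippet below maps absent values to a placeholder before counting classes. The `MISSING` label, the `branch_classes` helper, and the rows are assumptions made for the example, not part of any real dataset or library.

```python
from collections import Counter, defaultdict

MISSING = "missing"  # hypothetical label used for absent values

def branch_classes(rows, attr, target):
    """Map each attribute value (including 'missing') to its majority class."""
    counts = defaultdict(Counter)
    for row in rows:
        value = row.get(attr)
        if value is None:
            # A missing value becomes one more branch of the rule.
            value = MISSING
        counts[value][row[target]] += 1
    return {value: c.most_common(1)[0][0] for value, c in counts.items()}

# Hypothetical rows: two of them have no value for "outlook".
rows = [
    {"outlook": "sunny", "play": "no"},
    {"outlook": "overcast", "play": "yes"},
    {"outlook": "rainy", "play": "yes"},
    {"outlook": None, "play": "no"},
    {"play": "no"},
]
rule = branch_classes(rows, "outlook", "play")
print(rule)
```

The resulting rule has four branches: the three real outlook values plus one for `missing`.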

# Statistical modeling

The 1R method uses a single attribute as the basis for its decisions and chooses the one that works best.

Another simple technique is to use all attributes and allow them to make contributions to the decision that are equally important and independent of one another.
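
This idea is commonly known as Naive Bayes. The sketch below is a minimal version for categorical attributes, assuming plain frequency counts with no smoothing, so a value never seen with a class drives that class's score to zero. The `train` and `predict` names and the data are illustrative only.

```python
from collections import Counter, defaultdict

def train(rows, attributes, target):
    """Collect class counts and per-class value counts for each attribute."""
    class_counts = Counter(row[target] for row in rows)
    # value_counts[(attr, cls)][value] = rows of class cls having that value
    value_counts = defaultdict(Counter)
    for row in rows:
        for attr in attributes:
            value_counts[(attr, row[target])][row[attr]] += 1
    return class_counts, value_counts

def predict(model, attributes, instance):
    """Return (predicted_class, unnormalized score per class)."""
    class_counts, value_counts = model
    total = sum(class_counts.values())
    scores = {}
    for cls, n in class_counts.items():
        score = n / total  # prior: fraction of training rows in this class
        for attr in attributes:
            # Every attribute contributes one independent factor.
            score *= value_counts[(attr, cls)][instance[attr]] / n
        scores[cls] = score
    return max(scores, key=scores.get), scores

# Hypothetical toy data, only for illustration.
rows = [
    {"outlook": "sunny", "windy": "false", "play": "no"},
    {"outlook": "sunny", "windy": "true", "play": "no"},
    {"outlook": "overcast", "windy": "false", "play": "yes"},
    {"outlook": "rainy", "windy": "false", "play": "yes"},
    {"outlook": "rainy", "windy": "true", "play": "no"},
    {"outlook": "overcast", "windy": "true", "play": "yes"},
]
attributes = ["outlook", "windy"]
model = train(rows, attributes, "play")
label, scores = predict(model, attributes, {"outlook": "sunny", "windy": "true"})
print(label, scores)
```

Multiplying the per-attribute fractions is what encodes the independence assumption: no attribute's contribution depends on the value of another.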

# Constructing decision trees

Decision tree algorithms are based on a divide-and-conquer approach to the classification problem.

They work from the top down, seeking at each stage an attribute to split on that best separates the classes, and then recursively process the subproblems that result from the split.
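
The recursion can be sketched as follows. The text does not say how "best separates the classes" is measured, so this example assumes an entropy-based criterion (information gain); the `build_tree` function, the tuple representation of nodes, and the data are all choices made for the illustration.

```python
import math
from collections import Counter

def entropy(rows, target):
    """Entropy (in bits) of the class distribution in rows."""
    counts = Counter(row[target] for row in rows)
    total = len(rows)
    return -sum(n / total * math.log2(n / total) for n in counts.values())

def build_tree(rows, attributes, target):
    """Return a class label (leaf) or a tuple (attribute, {value: subtree})."""
    classes = Counter(row[target] for row in rows)
    # Stop when the node is pure or there is nothing left to split on.
    if len(classes) == 1 or not attributes:
        return classes.most_common(1)[0][0]

    def remainder(attr):
        # Weighted entropy of the subsets produced by splitting on attr.
        groups = Counter(row[attr] for row in rows)
        return sum(
            n / len(rows) * entropy([r for r in rows if r[attr] == v], target)
            for v, n in groups.items()
        )

    # Lowest remaining entropy is the same as highest information gain.
    best = min(attributes, key=remainder)
    rest = [a for a in attributes if a != best]
    branches = {}
    for value in {row[best] for row in rows}:
        subset = [r for r in rows if r[best] == value]
        # Recurse on the subproblem created by this branch.
        branches[value] = build_tree(subset, rest, target)
    return (best, branches)

# Hypothetical toy data, only for illustration.
rows = [
    {"outlook": "sunny", "windy": "false", "play": "no"},
    {"outlook": "sunny", "windy": "true", "play": "no"},
    {"outlook": "overcast", "windy": "false", "play": "yes"},
    {"outlook": "rainy", "windy": "false", "play": "yes"},
    {"outlook": "rainy", "windy": "true", "play": "no"},
    {"outlook": "overcast", "windy": "true", "play": "yes"},
]
tree = build_tree(rows, ["outlook", "windy"], "play")
print(tree)
```

On this toy data the root splits on `outlook`, and only the `rainy` branch needs a second split on `windy`.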

# Conclusion

This strategy generates a decision tree, which can, if necessary, be converted into a set of classification rules, although the conversion is not trivial if it is to produce effective rules.
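
To see why the conversion is not trivial, it helps to look at the naive version first: read off one rule per leaf by following the path from the root. The sketch below does only that, assuming a tree stored as nested `(attribute, branches)` tuples; the representation and the `tree_to_rules` name are hypothetical. It makes no attempt to simplify or prune the rules, which is where the real work lies.

```python
def tree_to_rules(tree, conditions=()):
    """Return a list of (conditions, class) pairs, one per leaf."""
    # A leaf is a plain class label; an internal node is (attribute, branches).
    if not isinstance(tree, tuple):
        return [(list(conditions), tree)]
    attr, branches = tree
    rules = []
    for value, subtree in branches.items():
        # Extend the path with the test taken on this branch.
        rules.extend(tree_to_rules(subtree, conditions + ((attr, value),)))
    return rules

# Hypothetical tree, only for illustration.
tree = ("outlook", {
    "sunny": "no",
    "overcast": "yes",
    "rainy": ("windy", {"false": "yes", "true": "no"}),
})
rules = tree_to_rules(tree)
for conds, label in rules:
    tests = " and ".join(f"{a} = {v}" for a, v in conds)
    print(f"if {tests} then play = {label}")
```

Each rule repeats every test on its path, so deep trees yield long rules that overlap heavily with one another.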

If you like my work and would like to support me, subscribe to my YouTube channel, where I share content like this in video form.


## El Akioui Zouhaire


I am a software engineer and entrepreneur. My focus is on developing technical skills, learning marketing, and taking care of my health.