Machine Learning Classifications: An Overview

Machine Learning Classifications: An Overview

A Comprehensive Guide to Different Types of Machine Learning

ML is divided into four parts. They are :

  • Supervised Learning

  • Unsupervised Learning

  • Semi-Supervised Learning

  • Reinforcement Learning

Supervised Learning is subdivided internally.

  • Regression

  • Classification

Unsupervised is also subdivided internally.

  • Clustering

  • Dimensionality Reduction

  • Anomaly Detection

  • Association Rule Learning

Supervised Learning

If data has both input and output and training a model is a way to find out the relation between them and predict the unseen data, it is called Supervised Learning.

Example

IQCGPAPLACEMENT (Y/N)
807.5N
908Y
1117.9Y
706N

we have the above data of 5000 rows. we can easily say that "IQ" and "CGPA" are inputs and "PLACEMENT" is the output in the above table. In the above data, we have both input as well as output. The work of any ML algorithm is to draw a mathematical relationship between inputs and output. Now the ML algorithm can predict unseen data {100,9} with the help of the drawn relation. This is called Supervised Machine Learning.

Supervised ML has two parts Regression and Classification.

Before we go any further we need to know the two data types. The first is Numerical (ex: 1, 5 ). The second is Categorical. Categorical data is a type of data, that is used to group information of similar characteristics.

Regression

If you are applying a Supervised ML Algorithm to a dataset. If that dataset has a numerical output then that Supervised ML Algorithm applied on the dataset is called Regression.

Example:

IQCGPAPACKAGE(LPA)
807.58
90810
1117.915
7067.5

Here in the above dataset, the inputs are IQ and CGPA, and the output is PACKAGE. So, the Supervised ML Algorithm applied to the above dataset is called Regression.

Classification

On the contrary, if the output is Categorical then the Supervised ML Algorithm used is called Classification.

Example

IQCGPAPLACEMENT (Y/N)
807.5N
908Y
1117.9Y
706N

Here in the above dataset, the inputs are IQ and CGPA, and the output is PLACEMENT. So, the Supervised ML Algorithm applied to the above dataset is called Classification.

Unsupervised Learning

Differing from supervised, the dataset for unsupervised has only inputs and no output. The ML algorithm used for this dataset cannot predict tasks. Instead, it can do the following:

  1. Clustering

  2. Dimensionality Reduction

  3. Anomaly Detection

  4. Association Rule Learning

Clustering

Grouping data points with similar characteristics into separate groups is called Clustering. Let's understand it with an example.

Example:

Clustering basics and a demonstration in clustering infrastructure pathways  – Water Programming: A Collaborative Research Blog

In the figure above, if we look at the plot on the left, we can see that three groups can be formed. The clustering algorithm identifies the number of groups and creates a clustering region for each group.

Dimensionality Reduction

Dimensionality Reduction is a powerful technique. When you apply a Supervised ML Algorithm to a dataset, it may have thousands of input columns. Having so many input columns can slow down the ML algorithm because it has to process a large amount of data. After a certain point, adding more input columns doesn't improve the prediction. Dimensionality Reduction decreases the number of input columns. It is also useful when your data has many dimensions that can't be visualized by plotting. Dimensionality Reduction can reduce these to 3 dimensions, making them easier to visualize.

9. Dimensionality Reduction — Single-cell best practices

Anomaly Detection

It is a technique used to identify certain pieces of data that differ from the majority of data and don't fit the normal behavioral pattern. It is also called Outlier Detection.

Anomaly detection for cyber security via machine learning

Association Rule Learning

It is a detective in the data world. It finds surprising connections between things people buy, websites they visit, or the articles they read.

What is Association Rule Learning? Machine Learning Interview Questions

Semi-Supervised Learning

Now, from the name itself, you must have understood it is partially supervised and partially unsupervised. Creating labels for a dataset is called Labeling. It is a costly task because it requires manual work. The core idea behind semi-supervised learning is to label a small amount of data and then let the system automatically label the rest.

A Gentle Introduction to Semi Supervised Learning | by Gayatri Sharma |  Medium

Reinforcement Learning

Think of it as training a dog with rewards. An AI learns by trying actions and getting "treats" for good choices.

Conclusion

In conclusion, machine learning is divided into various methodologies tailored to different types of data and objectives. Supervised learning, with its regression and classification techniques, is ideal for datasets with clear input-output relationships. Unsupervised learning excels in discovering hidden patterns through clustering, dimensionality reduction, anomaly detection, and association rule learning. Semi-supervised learning strikes a balance by leveraging both labeled and unlabeled data, while reinforcement learning mimics the process of learning through rewards and penalties. Understanding these classifications and their applications is crucial for effectively leveraging machine learning to solve complex problems.

Did you find this article valuable?

Support Charan Tej by becoming a sponsor. Any amount is appreciated!