Machine Learning: The First Chapter

The term machine learning emerged in the 1950s, when Arthur Samuel, an AI pioneer, developed a self-learning checkers program. He observed that the more the system played, the better it got. A new era of machine learning has since begun, driven by advances in statistics and computer science, better neural networks, and the availability of datasets. Whether you realize it or not, machine learning is everywhere in the 21st century: an intelligent home assistant like Google Home, a fitness tracker like Fitbit, and a smart programming assistant like GitHub Copilot are just a few examples.

Arthur Samuel defined machine learning as a “field of study that gives computers the ability to learn without being explicitly programmed”. The goal of machine learning is to mimic how humans learn by using data and algorithms, gradually improving accuracy over many iterations. In general, machine learning aims to understand the structure of data and fit that data into intuitive, usable models.

Types of Machine Learning

Machine learning models and algorithms differ in how the system receives its training data and how it gets feedback on what it has learned. They can be broadly classified into the following categories:

Supervised Learning

Supervised learning uses labeled datasets to train algorithms to classify data or predict outcomes accurately. As input data is fed into the model, the model's weights are adjusted. The algorithm “learns” by checking its actual output against the “taught” output (the label), then modifying the model to reduce the errors it finds.

For example, in order to detect spam automatically, you would feed a machine learning algorithm examples of emails that are spam and others that should not be considered spam. The use of historical data for predicting future events is a common application of supervised learning. Supervised learning can be further divided into:

  • Regression: A regression algorithm maps input variables to a continuous function and predicts a continuous output. The output of a regression problem is usually a real value, such as “price”, “height” or “temperature”.
  • Classification: A classification algorithm maps input variables into discrete categories. It builds a model from the training set and its class labels, then uses that model to predict the class label of new data.
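
To make the difference concrete, here is a minimal sketch in plain Python. Everything in it is an assumption made for illustration: the toy data is invented, the regression uses a simple least-squares line, and the classifier is a 1-nearest-neighbour rule with two hypothetical email features (number of links, number of exclamation marks). Real spam filters are far more involved.

```python
# Toy, made-up data for illustration only.

# --- Regression: predict a continuous value (price) from one input (size) ---
sizes = [50.0, 60.0, 80.0, 100.0]       # floor area in square metres
prices = [150.0, 180.0, 240.0, 300.0]   # known labels: price in thousands


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(sizes, prices)


def predict_price(size):
    """Return a continuous prediction for an unseen input."""
    return slope * size + intercept


# --- Classification: predict a discrete label ("spam" / "not spam") ---
# Hypothetical features per email: (number of links, number of exclamation marks)
labeled_emails = [
    ((8, 5), "spam"),
    ((7, 6), "spam"),
    ((0, 1), "not spam"),
    ((1, 0), "not spam"),
]


def classify(features, examples):
    """1-nearest-neighbour: copy the label of the closest labeled example."""
    def squared_distance(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))

    nearest = min(examples, key=lambda ex: squared_distance(ex[0], features))
    return nearest[1]


print(predict_price(70.0))                 # a number on a continuous scale
print(classify((6, 4), labeled_emails))    # one of the known class labels
```

Both functions learn from labeled examples. The only difference is the kind of output: a number for regression, a category for classification.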

[Figure: Classification vs. Regression]

Unsupervised Learning

The goal of unsupervised learning algorithms is to uncover relationships and insights in unlabeled data. Here, models are fed data without any guidance about the desired outcomes, so they have to infer structure from the data itself. No human intervention is needed to find hidden patterns or data groupings.

Without being told the “right” answer, unsupervised learning methods can help organize data in potentially meaningful ways and find insights. Unsupervised learning can also be further classified into:

  • Clustering: This is the process of grouping objects based on the characteristics they share. It relies on measuring how similar objects or entities are to each other. Grouping customers by their purchasing behavior and grouping movies by genre are two examples.
  • Association: The goal of association is to find rules that describe relationships between items in large datasets, for example when Amazon recommends a mouse after you buy a laptop.
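
Both ideas can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the customer points and shopping baskets are invented, the clustering is a bare-bones k-means that naively uses the first k points as starting centroids, and the association part computes just the "confidence" of a single rule rather than mining rules automatically.

```python
# Toy, made-up data for illustration only.

# --- Clustering: group customers by (visits per month, average spend) ---
customers = [(1.0, 2.0), (9.0, 8.0), (1.5, 1.8), (8.0, 9.0), (1.2, 2.2), (9.5, 8.5)]


def squared_distance(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b))


def mean_point(points):
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def kmeans(points, k, iterations=10):
    """Bare-bones k-means; the first k points serve as initial centroids."""
    centroids = list(points[:k])
    clusters = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in range(k)]
        for point in points:
            nearest = min(range(k), key=lambda i: squared_distance(point, centroids[i]))
            clusters[nearest].append(point)
        # Update step: move each centroid to the mean of its cluster.
        centroids = [
            mean_point(cluster) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters


centroids, clusters = kmeans(customers, k=2)

# --- Association: how often does a mouse follow a laptop? ---
baskets = [
    {"laptop", "mouse"},
    {"laptop", "mouse", "bag"},
    {"laptop"},
    {"mouse"},
    {"bag"},
]


def confidence(antecedent, consequent, transactions):
    """Share of baskets containing `antecedent` that also contain `consequent`."""
    with_antecedent = [t for t in transactions if antecedent in t]
    with_both = [t for t in with_antecedent if consequent in t]
    return len(with_both) / len(with_antecedent)


print(clusters)
print(confidence("laptop", "mouse", baskets))
```

No labels are given anywhere: the groups and the rule strength come purely from the structure of the data.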

Reinforcement Learning

Reinforcement learning is a behavioral machine learning model that is similar to supervised learning, but instead of being trained on sample data, the algorithm learns by trial and error. Sequences of successful outcomes are reinforced to develop the best policy for a given problem.

The algorithm, or agent (which is basically a software program), learns by interacting with its environment, where correct responses are rewarded and incorrect responses are penalized. Using these rewards and penalties, the agent learns without human intervention, with the goal of maximizing rewards and minimizing penalties.
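
The reward-and-penalty loop can be sketched with tabular Q-learning, one common reinforcement learning method (chosen here for illustration, not something this article prescribes). The environment is a made-up five-cell corridor: the agent starts in the leftmost cell, gets a reward of +1 for reaching the rightmost cell, and a small penalty of -0.1 for every other step. All names and parameter values below are assumptions.

```python
import random

N_STATES = 5            # corridor cells 0..4; the agent starts in cell 0
GOAL = N_STATES - 1     # reaching the last cell ends the episode
ACTIONS = (-1, +1)      # index 0 = step left, index 1 = step right


def step(state, action):
    """Environment: apply an action, return (next_state, reward)."""
    next_state = min(max(state + action, 0), GOAL)
    reward = 1.0 if next_state == GOAL else -0.1   # reward vs. penalty
    return next_state, reward


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Learn a table of action values q[state][action_index] by trial and error."""
    rng = random.Random(seed)   # fixed seed so runs are repeatable
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        while state != GOAL:
            if rng.random() < epsilon:
                a = rng.randrange(len(ACTIONS))                           # explore
            else:
                a = max(range(len(ACTIONS)), key=lambda i: q[state][i])   # exploit
            next_state, reward = step(state, ACTIONS[a])
            # No future value is added once the goal (terminal state) is reached.
            future = 0.0 if next_state == GOAL else max(q[next_state])
            # Nudge the estimate toward reward + discounted future value.
            q[state][a] += alpha * (reward + gamma * future - q[state][a])
            state = next_state
    return q


q_table = train()
policy = ["right" if row[1] > row[0] else "left" for row in q_table[:GOAL]]
print(policy)
```

Nobody tells the agent that "right" is correct. After enough episodes the penalties make wandering unattractive and the reward makes the path to the goal attractive, so the learned policy should favor moving right in every cell.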

[Figure: Agent]

ML Essentials

UC Berkeley breaks the learning system of a machine learning algorithm into three main parts:

  1. A Decision Process: In general, machine learning algorithms are used to make a prediction or classification. Based on some input data, which can be labeled or unlabeled, your algorithm produces an estimate about a pattern in the data.
  2. An Error Function: An error function evaluates the prediction of the model. If there are known examples, an error function can compare them with the prediction to assess the accuracy of the model.
  3. A Model Optimization Process: If the model can fit the data points in the training set better, its weights are adjusted to reduce the discrepancy between the known example and the model estimate. The algorithm repeats this “evaluate and optimize” process, updating weights autonomously until a threshold of accuracy has been met.
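
The three parts above can be sketched as three small functions. This is a minimal example under stated assumptions: the data is invented (generated from y = 2x + 1), the model is a straight line, the error function is mean squared error, and the optimizer is plain gradient descent with a hand-picked learning rate and tolerance.

```python
# Toy, made-up training data generated from y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]


def predict(w, b, x):
    """1. Decision process: turn an input into an estimate."""
    return w * x + b


def mean_squared_error(w, b):
    """2. Error function: compare estimates with the known examples."""
    return sum((predict(w, b, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def fit(learning_rate=0.05, tolerance=1e-6, max_steps=10_000):
    """3. Optimization: adjust the weights until the error is small enough."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(max_steps):
        if mean_squared_error(w, b) < tolerance:
            break   # threshold of accuracy has been met
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (predict(w, b, x) - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (predict(w, b, x) - y) for x, y in zip(xs, ys)) / n
        # Move each weight a small step in the direction that lowers the error.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


w, b = fit()
print(w, b, mean_squared_error(w, b))
```

The loop in `fit` is the "evaluate and optimize" cycle: measure the error, adjust the weights, and repeat until the error drops below the tolerance. The learned `w` and `b` should end up close to the 2 and 1 used to generate the data.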

Conclusion

This blog was meant to give you a bird’s eye view of ML and introduce you to a few of its ideas and methods. I’ll be posting detailed articles about each of these methods soon. Happy learning!🤩
