What is Machine Learning?

Overview and Basic Terms

Feb 13, 2025

What Is Machine Learning?

Machine learning (ML) enables computers to learn from data rather than being explicitly programmed.
It has become mainstream with applications such as spam filters, voice recognition, recommendation systems, and self-driving cars.
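Tom Mitchell (1997) gave the standard definition: a program learns from experience E with respect to a task T and a performance measure P if its performance on T, as measured by P, improves with E. For a spam filter, T is flagging spam, E is the set of labeled emails, and P is the share of emails classified correctly.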

Why Use Machine Learning?

Traditional rule-based programming is limited because it requires hardcoded rules that are difficult to maintain.
ML is useful when:
The problem is too complex for traditional programming.
Rules change frequently (e.g., spam detection adapting to new tricks).
Hidden patterns in data can be discovered (data mining).

Types of Machine Learning Systems

1. Supervised Learning

Uses labeled data (input-output pairs).
Examples:
Classification (e.g., spam detection).
Regression (e.g., predicting house prices).
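To make "learning from input-output pairs" concrete, here is a minimal sketch of a classifier trained on labeled messages. The four-message dataset, the function names `train` and `predict`, and the word-count scoring rule are all illustrative assumptions; a real spam filter would use far more data and a proper statistical model.

```python
from collections import Counter

# Hypothetical labeled training set: (message, label) pairs.
TRAINING_DATA = [
    ("win free money now", "spam"),
    ("free prize claim now", "spam"),
    ("meeting at noon tomorrow", "ham"),
    ("lunch tomorrow with the team", "ham"),
]


def train(examples):
    """Count how often each word appears under each label."""
    counts = {"spam": Counter(), "ham": Counter()}
    for text, label in examples:
        counts[label].update(text.lower().split())
    return counts


def predict(counts, text):
    """Label a message by which class has seen its words more often."""
    words = text.lower().split()
    spam_score = sum(counts["spam"][w] for w in words)
    ham_score = sum(counts["ham"][w] for w in words)
    return "spam" if spam_score > ham_score else "ham"


model = train(TRAINING_DATA)
print(predict(model, "claim your free money"))
print(predict(model, "team meeting tomorrow"))
```

Nothing in `predict` mentions a specific spam word; everything it knows comes from the labeled examples, so retraining on new examples changes its behavior without touching the code.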

2. Unsupervised Learning

Uses unlabeled data; the model finds patterns without predefined categories.
Examples:
Clustering (e.g., customer segmentation).
Anomaly detection (e.g., fraud detection).
Dimensionality reduction (e.g., PCA for visualization).
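As a rough illustration of clustering, the sketch below runs plain k-means on one-dimensional data. The spend figures, the function name `kmeans_1d`, and the choice to start from the smallest and largest value are assumptions made for the example; no labels are supplied anywhere.

```python
def kmeans_1d(values, centroids, iterations=10):
    """Plain k-means on 1-D data, starting from the given centroids."""
    centroids = list(centroids)
    for _ in range(iterations):
        # Assignment step: attach each value to its nearest centroid.
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters


# Hypothetical monthly spend per customer; note that no labels are given.
spend = [12, 15, 14, 10, 95, 102, 98, 110]
centroids, clusters = kmeans_1d(spend, centroids=[min(spend), max(spend)])
print(centroids)
print(clusters)
```

The algorithm separates low spenders from high spenders on its own; deciding what each segment means is left to whoever reads the result.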

3. Semi-supervised Learning

Uses a mix of labeled and unlabeled data (e.g., Google Photos face recognition).
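One simple way to combine a few labels with many unlabeled points is to let labels spread through the data. The sketch below is a self-training-style toy, not a description of how Google Photos works: the one-dimensional "positions", the names `alice` and `bob`, and the function `propagate_labels` are all hypothetical.

```python
def propagate_labels(labeled, unlabeled):
    """Repeatedly label the unlabeled point that lies closest to an
    already-labeled point, then treat it as labeled too."""
    labeled = dict(labeled)      # point -> label
    remaining = list(unlabeled)
    while remaining:
        # Find the (unlabeled, labeled) pair with the smallest distance.
        point, neighbor = min(
            ((u, l) for u in remaining for l in labeled),
            key=lambda pair: abs(pair[0] - pair[1]),
        )
        labeled[point] = labeled[neighbor]
        remaining.remove(point)
    return labeled


# Two hand-labeled points and seven unlabeled ones (hypothetical 1-D data).
seed_labels = {0: "alice", 10: "bob"}
unlabeled_points = [1, 2, 3, 4, 5, 6, 9]

result = propagate_labels(seed_labels, unlabeled_points)
print(result[6], result[9])
```

Point 6 is closer to the labeled point 10 than to 0, so a rule that looked only at the two labeled points would call it `bob`. Because the unlabeled points 1 through 5 form a chain leading up to it, it ends up as `alice`: the unlabeled data changes the answer.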

4. Self-supervised Learning

Often treated as a special case of unsupervised learning: the model generates its own labels from unlabeled data (e.g., language models like GPT, which are trained to predict the next token in a text).
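The phrase "generates its own labels" can be shown with a small sketch that turns raw text into training pairs, where the target is simply the word that follows each context. The function name `next_word_pairs` and the two-word context are assumptions for illustration; real language models work on tokens and far longer contexts.

```python
def next_word_pairs(text, context_size=2):
    """Turn raw text into (context, target) training pairs.

    The label for each context is the word that follows it in the text,
    so no human annotation is needed.
    """
    words = text.split()
    pairs = []
    for i in range(context_size, len(words)):
        context = tuple(words[i - context_size:i])
        pairs.append((context, words[i]))
    return pairs


corpus = "the cat sat on the mat"
for context, target in next_word_pairs(corpus):
    print(context, "->", target)
```

Once the pairs exist, training proceeds like ordinary supervised learning; the only difference is where the labels came from.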

5. Reinforcement Learning

Uses agents that learn by interacting with an environment and receiving rewards or penalties (e.g., AlphaGo, robotic control).
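The reward-driven loop can be sketched with tabular Q-learning on a toy problem. Everything specific here is an assumption for illustration: the five-cell corridor, the reward of 1 at the right end, the learning rate and discount values, and the choice to explore with purely random moves. Systems like AlphaGo are vastly more elaborate.

```python
import random

N_STATES = 5              # corridor cells 0..4; reaching cell 4 ends the episode
ACTIONS = (-1, 1)         # step left or step right
ALPHA, GAMMA = 0.5, 0.9   # learning rate and discount factor (assumed values)


def step(state, action):
    """Hypothetical environment: move along the corridor, reward 1 at the goal."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward


def train(episodes=500, max_steps=100, seed=0):
    """Learn action values from experience gathered by random exploration."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        state = 0
        for _ in range(max_steps):
            action = rng.choice(ACTIONS)  # explore with random moves
            next_state, reward = step(state, action)
            best_next = max(q[(next_state, a)] for a in ACTIONS)
            # Q-learning update: nudge the estimate toward
            # reward + discounted value of the best next action.
            q[(state, action)] += ALPHA * (reward + GAMMA * best_next - q[(state, action)])
            state = next_state
            if state == N_STATES - 1:
                break
    return q


q_table = train()
# Greedy policy: in each non-goal cell, pick the action with the highest value.
policy = [max(ACTIONS, key=lambda a: q_table[(s, a)]) for s in range(N_STATES - 1)]
print(policy)
```

The agent is never told that moving right is correct. It only sees rewards, and the learned values end up favoring the action that leads toward the goal in every cell.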

Batch vs. Online Learning

Batch Learning: The model is trained offline on the full dataset and then deployed without further learning; incorporating new data means retraining on the updated dataset.
Online Learning: The model learns incrementally from a stream of data, making it adaptable to changes.
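The contrast can be sketched with a one-parameter model, y = w * x. The data, the `OnlineModel` class, its `learn_one` method, and the learning rate are assumptions chosen for the example, not the API of any particular library.

```python
# Hypothetical data generated by the rule y = 2 * x.
DATA = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]


def fit_batch(data):
    """Batch learning: use the whole dataset at once
    (closed-form least squares for a line through the origin)."""
    return sum(x * y for x, y in data) / sum(x * x for x, _ in data)


class OnlineModel:
    """Online learning: adjust the weight a little after every new example."""

    def __init__(self, learning_rate=0.05):
        self.w = 0.0
        self.learning_rate = learning_rate

    def learn_one(self, x, y):
        error = y - self.w * x
        self.w += self.learning_rate * error * x  # one stochastic gradient step


w_batch = fit_batch(DATA)

online = OnlineModel()
for _ in range(50):                 # simulate a stream by replaying the data
    for x, y in DATA:
        online.learn_one(x, y)
w_online = online.w

# The underlying rule drifts to y = 3 * x; the online model keeps adapting.
for _ in range(50):
    for x, _ in DATA:
        online.learn_one(x, 3 * x)

print(w_batch, w_online, online.w)
```

The batch weight stays at its trained value until someone retrains it, while the online weight follows the stream from about 2 to about 3. The learning rate controls how quickly it adapts, and also how quickly it forgets old data.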

Instance-Based vs. Model-Based Learning

Instance-Based Learning: The system memorizes examples and classifies new instances based on similarity (e.g., k-nearest neighbors).
Model-Based Learning: The system creates a mathematical model to generalize from training data (e.g., linear regression).
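The two approaches can be compared side by side on the same data. This is a minimal sketch under assumed inputs: the house sizes and prices are made up, and `knn_predict` and `fit_line` are hypothetical helper names.

```python
# Hypothetical training data: (size in square meters, price in $1000s).
HOUSES = [(50, 150), (70, 210), (90, 270), (110, 330)]


def knn_predict(data, x, k=2):
    """Instance-based: keep all examples, average the k most similar ones."""
    nearest = sorted(data, key=lambda row: abs(row[0] - x))[:k]
    return sum(price for _, price in nearest) / k


def fit_line(data):
    """Model-based: compress the data into two parameters (slope, intercept)."""
    n = len(data)
    mean_x = sum(x for x, _ in data) / n
    mean_y = sum(y for _, y in data) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in data) / sum(
        (x - mean_x) ** 2 for x, _ in data
    )
    return slope, mean_y - slope * mean_x


slope, intercept = fit_line(HOUSES)
print(knn_predict(HOUSES, 80))    # uses the stored examples directly
print(slope * 80 + intercept)     # uses only the learned parameters
print(knn_predict(HOUSES, 200))   # far outside the data: stays near known prices
print(slope * 200 + intercept)    # the line extrapolates the trend
```

Inside the range of the training data the two predictions agree. For a 200 square meter house they diverge: the neighbor average cannot exceed the prices it has stored, while the fitted line extends the trend. Which behavior is preferable depends on whether the trend can be trusted that far out.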

Challenges in Machine Learning

Insufficient Training Data: More data generally leads to better models, especially in deep learning.
Nonrepresentative Data: The training data must reflect the real-world scenarios the model will encounter.
Poor-Quality Data: Noisy or incorrect labels can mislead the model.
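Irrelevant Features: A model can only learn well if the training data contains enough relevant features and not too many irrelevant ones, so feature engineering (selecting, extracting, and creating useful features) is critical to improving performance.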

Overfitting vs. Underfitting (Basic Terms)

Overfitting: The model learns noise instead of patterns, performing well on training data but poorly on new data.
Underfitting: The model is too simple to capture patterns in the data.
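Solutions: For overfitting, gather more training data, simplify the model, or apply regularization. For underfitting, use a more powerful model, engineer better features, or reduce regularization.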
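A common way to spot both problems is to compare error on the training data with error on held-out data. The sketch below does this for three models of increasing flexibility. The data (roughly y = 2 * x with noise of plus or minus 1) and the helper names are assumptions for illustration.

```python
# Hypothetical data: y is roughly 2 * x, plus noise of +/-1.
TRAIN = [(0, 1), (1, 1), (2, 5), (3, 5), (4, 9), (5, 9)]
TEST = [(0.4, -0.2), (1.4, 3.8), (2.4, 3.8), (3.4, 7.8), (4.4, 7.8)]


def mse(model, data):
    """Mean squared error of a prediction function on (x, y) pairs."""
    return sum((model(x) - y) ** 2 for x, y in data) / len(data)


def fit_constant(data):
    """Too simple: always predict the average target."""
    mean_y = sum(y for _, y in data) / len(data)
    return lambda x: mean_y


def fit_line(data):
    """Reasonable capacity: least-squares straight line."""
    n = len(data)
    mean_x = sum(x for x, _ in data) / n
    mean_y = sum(y for _, y in data) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in data) / sum(
        (x - mean_x) ** 2 for x, _ in data
    )
    return lambda x: mean_y + slope * (x - mean_x)


def fit_memorizer(data):
    """Too flexible: return the target of the closest training point, noise included."""
    return lambda x: min(data, key=lambda row: abs(row[0] - x))[1]


for name, fit in [("constant", fit_constant), ("line", fit_line), ("memorizer", fit_memorizer)]:
    model = fit(TRAIN)
    print(f"{name:10s} train MSE = {mse(model, TRAIN):5.2f}   test MSE = {mse(model, TEST):5.2f}")
```

The constant model has high error on both sets, which is the signature of underfitting. The memorizer reaches zero training error but does clearly worse than the line on the test set, which is the signature of overfitting. The line has similar, low error on both.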

Written by Ananya

Data Scientist || MS Computer Science. Kaggle: https://www.kaggle.com/an1005 Data Viz: https://public.tableau.com/app/profile/ananya2311/vizzes
