What Is Machine Learning?

Overview and Basic Terms

Ananya
3 min read · Feb 13, 2025

What Is Machine Learning?

  • Machine learning (ML) enables computers to learn from data rather than being explicitly programmed.
  • It has become mainstream with applications such as spam filters, voice recognition, recommendation systems, and self-driving cars.
  • Tom Mitchell (1997) defines it this way: a program learns from experience E with respect to a task T and a performance measure P if its performance on T, as measured by P, improves with experience E.

Why Use Machine Learning?

  • Traditional rule-based programming is limited because it requires hardcoded rules that are difficult to maintain.
  • ML is useful when:
      • The problem is too complex for traditional programming.
      • Rules change frequently (e.g., spam detection adapting to new tricks).
      • Hidden patterns in data can be discovered (data mining).

Types of Machine Learning Systems

1. Supervised Learning

  • Uses labeled data (input-output pairs).
  • Examples:
      • Classification (e.g., spam detection).
      • Regression (e.g., predicting house prices).
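
To make "learning from labeled pairs" concrete, here is a minimal sketch of a classifier in plain Python. It is only an illustration: the nearest-centroid method, the two made-up features (number of links, number of exclamation marks), and the names `fit_centroids` and `predict` are assumptions chosen for brevity, not how a production spam filter works.

```python
def fit_centroids(examples):
    """Average the feature vectors of each label (the 'training' step)."""
    sums, counts = {}, {}
    for features, label in examples:
        total = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            total[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in total]
            for label, total in sums.items()}


def predict(centroids, features):
    """Return the label whose centroid is closest to the new example."""
    def squared_distance(centroid):
        return sum((f - c) ** 2 for f, c in zip(features, centroid))
    return min(centroids, key=lambda label: squared_distance(centroids[label]))


# Hypothetical labeled data: (links, exclamation marks) -> label
training = [
    ((7, 5), "spam"), ((9, 4), "spam"), ((8, 6), "spam"),
    ((1, 0), "ham"), ((0, 1), "ham"), ((2, 0), "ham"),
]
centroids = fit_centroids(training)
print(predict(centroids, (6, 3)))  # closer to the spam centroid
print(predict(centroids, (1, 1)))  # closer to the ham centroid
```

The labels are what make this supervised: the model never has to guess what "spam" means, it only has to generalize from examples that already carry the answer.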

2. Unsupervised Learning

  • Uses unlabeled data; the model finds patterns without predefined categories.
  • Examples:
      • Clustering (e.g., customer segmentation).
      • Anomaly detection (e.g., fraud detection).
      • Dimensionality reduction (e.g., PCA for visualization).
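
As a rough sketch of clustering, the snippet below runs a bare-bones 1-D k-means on made-up monthly spend figures. The data, the starting centers, and the function name `kmeans_1d` are assumptions for illustration; real customer segmentation would use more features and a library implementation.

```python
def kmeans_1d(values, centers, iterations=10):
    """Alternate between assigning points to the nearest center
    and moving each center to the mean of its points."""
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Keep the old center if a cluster ends up empty.
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers, clusters


# Hypothetical monthly spend per customer; note there are no labels.
spend = [12, 15, 14, 80, 85, 90]
centers, clusters = kmeans_1d(spend, centers=[spend[0], spend[-1]])
print(centers)
print(clusters)
```

Nothing told the algorithm that there are "low spenders" and "high spenders"; the two groups emerge from the distances between the points.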

3. Semi-supervised Learning

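  • Uses a mix of labeled and unlabeled data, typically a small labeled set and a much larger unlabeled one (e.g., Google Photos groups similar faces on its own, then needs only a few name labels from the user).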

4. Self-supervised Learning

  • A special case of unsupervised learning where the model generates its own labels from the unlabeled data (e.g., language models like GPT, which are trained to predict the next token of a text).
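
The idea of "generating its own labels" can be sketched in a few lines: raw text is turned into (context, next word) training pairs without any human annotation. The function name `next_word_pairs` and the two-word context are assumptions for illustration; real language models work on tokens and far longer contexts.

```python
def next_word_pairs(text, context_size=2):
    """Turn unlabeled text into (context, label) pairs,
    where the label is simply the word that follows the context."""
    words = text.split()
    pairs = []
    for i in range(context_size, len(words)):
        context = tuple(words[i - context_size:i])
        pairs.append((context, words[i]))
    return pairs


for context, label in next_word_pairs("the cat sat on the mat"):
    print(context, "->", label)
```

Once the pairs exist, training proceeds like supervised learning; the difference is only where the labels came from.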

5. Reinforcement Learning

  • Uses agents that learn by interacting with an environment and receiving rewards or penalties (e.g., AlphaGo, robotic control).
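
A very small sketch of the reward loop is the multi-armed bandit below, a deliberately simplified setting with no states, which is far from what AlphaGo does. The epsilon-greedy strategy, the reward values, and the name `run_bandit` are assumptions picked for illustration.

```python
import random


def run_bandit(mean_rewards, steps=1000, epsilon=0.1, seed=0):
    """Epsilon-greedy agent: mostly pick the action that looks best so far,
    but explore a random action with probability epsilon."""
    rng = random.Random(seed)
    estimates = [0.0] * len(mean_rewards)  # agent's current value estimates
    counts = [0] * len(mean_rewards)       # how often each action was tried
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(len(mean_rewards))  # explore
        else:
            arm = max(range(len(estimates)), key=lambda a: estimates[a])  # exploit
        # The environment answers with a noisy reward.
        reward = mean_rewards[arm] + rng.uniform(-0.1, 0.1)
        counts[arm] += 1
        # Incremental average of the rewards seen for this action.
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
    return estimates, counts


estimates, counts = run_bandit([0.2, 0.8, 0.5])
best_arm = max(range(len(estimates)), key=lambda a: estimates[a])
print("estimated values:", estimates)
print("best action:", best_arm)
```

The agent is never told which action is correct. It only sees rewards, and its estimates improve as it gathers experience.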

Batch vs. Online Learning

  • Batch Learning: The model is trained offline on the full dataset and then deployed without learning further; to incorporate new data, it must be retrained and redeployed.
  • Online Learning: The model learns incrementally from a stream of data, making it adaptable to changes.
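
The difference can be sketched with a one-parameter model y = w * x. The batch version computes w from the whole dataset at once, while the online version nudges w after every incoming example. The class name `OnlineSlope`, the learning rate, and the synthetic stream are assumptions for illustration.

```python
def fit_batch(xs, ys):
    """Least-squares slope computed from the full dataset in one go."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


class OnlineSlope:
    """Keeps a running estimate of w and updates it one example at a time."""

    def __init__(self, learning_rate=0.05):
        self.w = 0.0
        self.learning_rate = learning_rate

    def update(self, x, y):
        error = y - self.w * x
        self.w += self.learning_rate * error * x  # one gradient step


xs = [1.0, 2.0, 3.0]
batch_w = fit_batch(xs, [2.0 * x for x in xs])

model = OnlineSlope()
for _ in range(10):              # stream where y = 2x
    for x in xs:
        model.update(x, 2.0 * x)
w_before_shift = model.w

for _ in range(10):              # the relationship changes to y = 3x
    for x in xs:
        model.update(x, 3.0 * x)
w_after_shift = model.w

print(batch_w, w_before_shift, w_after_shift)
```

The batch model stays at the slope it was fitted with until someone retrains it, whereas the online model drifts toward the new relationship as the stream changes.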

Instance-Based vs. Model-Based Learning

  • Instance-Based Learning: The system memorizes examples and classifies new instances based on similarity (e.g., k-nearest neighbors).
  • Model-Based Learning: The system creates a mathematical model to generalize from training data (e.g., linear regression).
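
A small side-by-side sketch may help. Below, a k-nearest-neighbors predictor keeps the raw examples and averages the closest ones, while a linear model compresses the same examples into a slope and an intercept. The toy data and the function names `knn_predict` and `fit_linear` are assumptions for illustration.

```python
def knn_predict(xs, ys, query, k=2):
    """Instance-based: average the targets of the k most similar examples."""
    nearest = sorted(range(len(xs)), key=lambda i: abs(xs[i] - query))[:k]
    return sum(ys[i] for i in nearest) / k


def fit_linear(xs, ys):
    """Model-based: summarize the data as slope and intercept (least squares)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return slope, mean_y - slope * mean_x


xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]
slope, intercept = fit_linear(xs, ys)

# Inside the range of the training data the two approaches roughly agree.
knn_inside = knn_predict(xs, ys, 2.4)
linear_inside = slope * 2.4 + intercept

# Far outside it, k-NN can only echo its nearest stored examples.
knn_outside = knn_predict(xs, ys, 10.0)
linear_outside = slope * 10.0 + intercept

print(knn_inside, linear_inside)
print(knn_outside, linear_outside)
```

Neither behavior is right in general: the linear model extrapolates because it assumes a straight line, and that assumption is only as good as the data supports.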

Challenges in Machine Learning

  1. Insufficient Training Data: More data generally leads to better models, especially in deep learning.
  2. Nonrepresentative Data: The training data must reflect the real-world scenarios the model will encounter.
  3. Poor-Quality Data: Noisy or incorrect labels can mislead the model.
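  4. Irrelevant Features: A model can only learn well when the data contains enough relevant features and not too many irrelevant ones, so feature engineering (selecting and extracting useful features) is critical to improving performance.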

Overfitting vs. Underfitting (Basic Terms)

  • Overfitting: The model learns noise instead of patterns, performing well on training data but poorly on new data.
  • Underfitting: The model is too simple to capture patterns in the data.
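  • Solutions for overfitting: Use more data, simplify the model, or apply regularization techniques.
  • Solutions for underfitting: Use a more powerful model, provide better features, or reduce regularization.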
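
Both failure modes show up when training error is compared with error on held-out data. The sketch below fits three models to the same noisy points: a constant (too simple), a straight line, and a memorizer that returns the target of the nearest training point. The data, the alternating noise pattern, and all function names are assumptions made up for illustration.

```python
def mse(predict, xs, ys):
    """Mean squared error of a prediction function on a dataset."""
    return sum((predict(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def fit_constant(xs, ys):
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y  # ignores the input entirely


def fit_line(xs, ys):
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept


def fit_memorizer(xs, ys):
    def predict(x):
        nearest = min(range(len(xs)), key=lambda i: abs(xs[i] - x))
        return ys[nearest]  # repeats the stored target, noise included
    return predict


# True relationship is y = 2x; training targets carry noise of +1 or -1.
train_x = [0, 1, 2, 3, 4, 5]
train_y = [2 * x + n for x, n in zip(train_x, [1, -1, 1, -1, 1, -1])]
test_x = [0.4, 1.4, 2.4, 3.4, 4.4]
test_y = [2 * x for x in test_x]

results = {}
for name, fit in [("constant", fit_constant), ("line", fit_line),
                  ("memorizer", fit_memorizer)]:
    model = fit(train_x, train_y)
    results[name] = (mse(model, train_x, train_y), mse(model, test_x, test_y))
    print(f"{name:10s} train MSE={results[name][0]:.3f}  test MSE={results[name][1]:.3f}")
```

In this toy setup the memorizer reaches zero training error yet does worse than the line on the unseen points (overfitting), while the constant model is poor on both sets (underfitting).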
