AI, machine learning, and deep learning: the nested circles.
Three terms get used interchangeably in headlines, but they are not the same thing. This entry gives you a precise, durable mental model: artificial intelligence is the broad goal, machine learning is one family of methods for reaching it, and deep learning is one technique within that family. By the end you will be able to read any "AI" claim and place it correctly on this map, which tells you more about what the system can and cannot do than any marketing copy.
The outer circle: artificial intelligence.
Artificial intelligence (AI) is the oldest and broadest of the three terms. It names a goal rather than a method: getting computers to perform tasks that, when humans do them, we call "intelligent," such as recognising a face, understanding a sentence, planning a route, playing chess, or diagnosing a fault. The field was named in 1956, and for its first few decades the dominant approach had nothing to do with learning from data.
Early AI was largely symbolic — sometimes nicknamed "good old-fashioned AI." Engineers wrote explicit rules: IF temperature > 100 AND pressure rising THEN raise alarm. These systems, called expert systems, encoded human knowledge by hand. They worked well in narrow, well-understood domains and failed badly the moment reality did something the rule-writers had not anticipated. Writing rules for "recognise a cat in a photo" turned out to be effectively impossible — nobody can enumerate the rules that distinguish a cat from a small dog in arbitrary lighting.
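The rule quoted above can be written out directly. What follows is a minimal sketch, not code from any real expert system: the function name, the list of pressure readings, and the reading of "pressure rising" as "the latest reading is higher than the previous one" are all assumptions made for illustration.

```python
def should_raise_alarm(temperature, pressure_readings):
    """Hand-written rule: IF temperature > 100 AND pressure rising THEN raise alarm."""
    # Assumption: "pressure rising" means the latest reading exceeds the previous one.
    pressure_rising = (
        len(pressure_readings) >= 2
        and pressure_readings[-1] > pressure_readings[-2]
    )
    return temperature > 100 and pressure_rising


print(should_raise_alarm(105, [2.0, 2.4]))  # True: hot and pressure rising
print(should_raise_alarm(105, [2.4, 2.0]))  # False: hot but pressure falling
print(should_raise_alarm(105, [9.9]))       # False: a single reading can never count as "rising"
```

The last call shows the brittleness in miniature: the rule-writer never considered a sensor that has reported only once, so the system quietly answers "no alarm" however extreme that one reading is.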
The key idea to carry forward: AI is the umbrella. A 1980s expert system, a chess engine from 1997, a spam filter, and a modern chatbot are all "AI" in the broad sense. The word alone tells you almost nothing about how a system works.
The middle circle: machine learning.
Machine learning (ML) is a subset of AI defined by how the system acquires its competence. Instead of a human writing the rules, the system infers the rules from examples. You show it thousands of photos labelled "cat" or "not cat," and an optimisation procedure adjusts the system's internal numbers until it predicts the labels well. Nobody ever wrote down "a cat has pointed ears" — the pattern was learned from data.
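To make "an optimisation procedure adjusts the system's internal numbers" concrete, here is a minimal sketch using the perceptron update rule, chosen only because it is one of the simplest learning procedures, not because real image classifiers work this way. The two features (ear pointiness and snout length, scaled to 0..1), every data value, and the function names are invented for illustration; a real system would learn from pixels, not from two hand-picked numbers.

```python
# Toy labelled examples: ((ear_pointiness, snout_length), label).
# All values are made up for illustration.
EXAMPLES = [
    ((0.9, 0.2), 1),  # 1 = "cat"
    ((0.8, 0.3), 1),
    ((0.7, 0.1), 1),
    ((0.2, 0.8), 0),  # 0 = "not cat"
    ((0.3, 0.9), 0),
    ((0.1, 0.7), 0),
]


def predict(weights, bias, features):
    """Weighted sum of the features plus a bias, thresholded at zero."""
    score = sum(w * x for w, x in zip(weights, features)) + bias
    return 1 if score > 0 else 0


def train(examples, learning_rate=0.1, max_epochs=100):
    """Perceptron rule: nudge the numbers whenever a prediction is wrong."""
    weights = [0.0, 0.0]
    bias = 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for features, label in examples:
            error = label - predict(weights, bias, features)  # -1, 0, or +1
            if error != 0:
                mistakes += 1
                for i, x in enumerate(features):
                    weights[i] += learning_rate * error * x
                bias += learning_rate * error
        if mistakes == 0:  # every training example is now predicted correctly
            break
    return weights, bias


weights, bias = train(EXAMPLES)
print(predict(weights, bias, (0.85, 0.15)))  # 1 ("cat") on this toy data
print(predict(weights, bias, (0.15, 0.85)))  # 0 ("not cat") on this toy data
```

Notice that no line of the code says what a cat looks like. The only cat-specific knowledge lives in EXAMPLES; swap in different labelled data and the same code learns a different rule.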
This is the pivotal conceptual shift in the whole field: programming by example instead of programming by instruction. It explains why ML succeeded on problems that symbolic AI could not crack, such as perception, language, and recommendation: for those problems we can supply examples even though we cannot articulate rules.
ML itself has several styles, worth knowing by name:
- Supervised learning — learn from labelled examples (input → desired output). Spam detection, image classification, price prediction.
- Unsupervised learning — find structure in unlabelled data: grouping similar customers, compressing data, detecting anomalies.
- Reinforcement learning — learn by trial and error against a reward signal. Game-playing agents and robot control; also a key ingredient in tuning modern chatbots.
A useful slogan: a machine-learning system is a function whose behaviour is determined by data plus an optimisation procedure, not by hand-written logic.
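The slogan can be checked in a few lines. In this sketch the optimisation procedure (plain gradient descent on mean squared error) is held fixed and only the data changes; the function name, the datasets, and the learning-rate and step settings are illustrative assumptions, not a recommended recipe.

```python
def fit_slope(points, learning_rate=0.01, steps=200):
    """Fit y = slope * x by gradient descent on mean squared error."""
    slope = 0.0
    for _ in range(steps):
        # Derivative of mean((slope * x - y) ** 2) with respect to slope.
        gradient = sum(2 * (slope * x - y) * x for x, y in points) / len(points)
        slope -= learning_rate * gradient
    return slope


# Two made-up datasets fed to the very same procedure.
doubling = [(1, 2), (2, 4), (3, 6), (4, 8)]
negative_tripling = [(1, -3), (2, -6), (3, -9), (4, -12)]

print(round(fit_slope(doubling), 3))           # 2.0
print(round(fit_slope(negative_tripling), 3))  # -3.0
```

The code is identical in both calls; the behaviour of the resulting function differs only because the data did.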
The inner circle: deep learning.
Deep learning is a subset of machine learning that uses artificial neural networks with many layers (the "deep" refers to the number of layers, not to profundity). A neural network is a particular flexible mathematical structure — loosely inspired by biological neurons — that turns inputs into outputs through many successive numerical transformations whose parameters are learned.
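"Many successive numerical transformations" has a simple shape in code. The sketch below runs an input through a stack of layers, each a weighted sum followed by a simple nonlinearity (ReLU). Two caveats, both deliberate simplifications: the stack has only two layers, far too few to count as deep, and the parameters are hand-picked for illustration so that the network computes exclusive-or, whereas in a real network they would be learned from data. All names are hypothetical.

```python
def relu(values):
    """Nonlinearity: negative numbers become zero, the rest pass through."""
    return [max(0.0, v) for v in values]


def dense(inputs, weights, biases):
    """One layer: each output is a weighted sum of all inputs plus a bias."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]


# Hand-picked parameters (an assumption for illustration, not learned values).
LAYERS = [
    {"weights": [[1.0, 1.0], [1.0, 1.0]], "biases": [0.0, -1.0], "activation": relu},
    {"weights": [[1.0, -2.0]], "biases": [0.0], "activation": None},
]


def forward(inputs):
    """Pass the input through every layer in turn."""
    values = list(inputs)
    for layer in LAYERS:
        values = dense(values, layer["weights"], layer["biases"])
        if layer["activation"] is not None:
            values = layer["activation"](values)
    return values


for pair in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(pair, forward(pair))  # outputs: [0.0], [1.0], [1.0], [0.0]
```

A deep network has the same structure with many more layers and vastly more parameters, and training consists of adjusting the numbers in LAYERS rather than writing them by hand.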
Deep learning is not new in principle: its core ideas go back decades, and backpropagation, the standard training method, was popularised in the 1980s. It became dominant around 2012, when three things lined up: very large labelled datasets, graphics processing units (GPUs) fast enough to train big networks, and refined training techniques. Since then, nearly every headline-grabbing AI result has been built on deep learning: image recognition that matches or beats human accuracy on some benchmarks, machine translation, protein-structure prediction, and large language models like the one generating this text.
The trade-off that defines deep learning: it can learn extraordinarily complex patterns directly from raw data (pixels, characters, audio samples) with little hand-engineering, but it needs a lot of data and a lot of computation, and the resulting model is hard to interpret — you get a system that works without a tidy explanation of why.
Nesting check: every deep-learning system is a machine-learning system, and every machine-learning system is an AI system — but not the reverse. A hand-coded chess engine is AI but not ML. A spam filter using simple statistics is ML but not deep learning. A large language model is all three.
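The nesting check is a statement about subsets, so it can be written as one. This sketch uses only the three example systems named above; the variable and function names are hypothetical.

```python
AI_SYSTEMS = {"hand-coded chess engine", "statistical spam filter", "large language model"}
ML_SYSTEMS = {"statistical spam filter", "large language model"}
DL_SYSTEMS = {"large language model"}

# "<" on sets means "is a proper subset of": each circle sits wholly
# inside the next one out, and the reverse does not hold.
assert DL_SYSTEMS < ML_SYSTEMS < AI_SYSTEMS


def innermost_circle(system):
    """Return the most specific circle that contains the system."""
    if system in DL_SYSTEMS:
        return "deep learning"
    if system in ML_SYSTEMS:
        return "machine learning"
    if system in AI_SYSTEMS:
        return "AI"
    return "not on the map"


print(innermost_circle("hand-coded chess engine"))  # AI
print(innermost_circle("statistical spam filter"))  # machine learning
print(innermost_circle("large language model"))     # deep learning
```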
Why the distinction matters in practice.
The map is not academic trivia; it predicts behaviour. When you know a system is a deep-learning model trained on examples, you can immediately anticipate its characteristic properties: it generalises impressively to inputs similar to its training data, it degrades unpredictably on inputs unlike anything it was trained on, it cannot explain itself in human terms, and it has no built-in notion of "I don't know" unless that was specifically trained in. None of these are bugs to be patched away — they are direct consequences of learning patterns from data rather than following written rules.
Conversely, a rule-based system is predictable and auditable but brittle: it does exactly what it was told, including doing the wrong thing confidently when reality steps outside the rules. Many real products combine both — a learned model for perception wrapped in hand-written rules for safety.
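A combined design of that kind might look like the sketch below. Everything in it is hypothetical: toy_fault_model is a stand-in for a trained model (here it just averages its inputs), and the thresholds and action names are invented. The point is only the shape: hand-written rules check the input before the model sees it and check the model's output before anyone acts on it.

```python
import math


def toy_fault_model(readings):
    """Stand-in for a learned model: returns a fault probability in 0..1.

    Hypothetical placeholder; a real model would be trained on data.
    """
    return min(1.0, max(0.0, sum(readings) / len(readings)))


def decide(model, readings, low=0.2, high=0.8):
    """Wrap a learned score in auditable, hand-written safety rules."""
    # Rule 1: never hand the model missing or non-finite input.
    if not readings or any(not math.isfinite(r) for r in readings):
        return "escalate"
    score = model(readings)
    # Rule 2: distrust any score that is not a valid probability.
    if not (0.0 <= score <= 1.0):
        return "escalate"
    # Rule 3: act automatically only when the score is clear-cut.
    if score >= high:
        return "shut_down"
    if score <= low:
        return "continue"
    return "escalate"  # the model is unsure, so a person decides


print(decide(toy_fault_model, [0.9, 0.95]))  # shut_down
print(decide(toy_fault_model, [0.05, 0.1]))  # continue
print(decide(toy_fault_model, [0.4, 0.6]))   # escalate
print(decide(toy_fault_model, []))           # escalate
```

The rules are as brittle as any others, but they are predictable and auditable, which is exactly what the learned component is not.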
So when you read "our product uses AI," the precise question to ask is: which circle? Hand-written rules, learned-from-data statistics, or a deep neural network? The answer tells you how it will fail, how much data shaped it, and how much you can trust it to explain itself — which is most of what you actually need to know.