The AI Jargon Glossary: 40 Terms Explained

Every field has its vocabulary, and AI’s is growing fast. Tokens, embeddings, fine-tuning, RAG, agents, hallucinations — the jargon piles up quickly, and it can make simple ideas sound far more intimidating than they are. This AI jargon glossary cuts through that noise.

Below you’ll find 40 of the most common AI terms, each defined in plain English with a quick note on why it matters. No equations, no hand-waving. You can read it start to finish to build a foundation, or bookmark it and look up terms as you bump into them. Either way, the goal is the same: when you next see one of these words in an article, a settings menu, or a sales pitch, you’ll know exactly what it means.

The terms are grouped loosely by theme so related ideas sit together. Don’t feel you need to absorb all of it at once; even reading the first two sections gives you enough to follow most AI conversations. Let’s demystify the vocabulary.

The core concepts

These are the foundational terms. Get these and most of the rest fall into place.

Artificial Intelligence (AI)

The broad field of building software that performs tasks we associate with human intelligence — understanding language, recognizing images, making decisions. In everyday 2026 usage, “AI” usually means chat assistants powered by large language models. The term has been around since the 1950s, but the recent wave of capable chat tools is what brought it into daily conversation.

Machine Learning (ML)

A subset of AI where programs learn patterns from data instead of being given explicit hand-written rules. You show the system many examples, and it figures out the patterns. A spam filter that learns from millions of emails which messages are junk is a classic example. Almost all modern AI is machine learning.

Large Language Model (LLM)

A type of AI trained on enormous amounts of text to understand and generate language. ChatGPT, Claude, and Gemini are all built on LLMs. The “large” refers to both the volume of training data and the number of internal parameters. At their core, LLMs work by predicting the next chunk of text given everything before it — a simple idea that, at huge scale, produces surprisingly capable writing, reasoning, and answering. It’s the engine behind today’s chat assistants. For a full walkthrough, see how large language models work.
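For the curious, here is a toy sketch of the “predict the next chunk” idea in Python. It is not how a real LLM works inside: a real model uses a neural network over tokens, while this one just counts which word follows which in a tiny made-up sentence. The function names are invented for illustration.

```python
from collections import Counter, defaultdict
from typing import Optional


def train_bigrams(text: str) -> dict:
    """Count which word follows which in the training text."""
    follows = defaultdict(Counter)
    words = text.lower().split()
    for current, following in zip(words, words[1:]):
        follows[current][following] += 1
    return follows


def predict_next(follows: dict, word: str) -> Optional[str]:
    """Return the word seen most often after `word`, or None if unseen."""
    options = follows.get(word.lower())
    return options.most_common(1)[0][0] if options else None


model = train_bigrams("the cat sat on the mat and the cat slept")
print(predict_next(model, "the"))  # cat ("cat" follows "the" twice, "mat" once)
```

Scale the same idea up from one sentence to a huge share of written text, and from simple counts to billions of parameters, and you get something much closer to a chat assistant.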

Generative AI

AI that creates new content — text, images, audio, video, or code — rather than just classifying or analyzing existing data. The chat and image tools you use are generative AI. This is the category that exploded into mainstream use, as distinct from older “predictive” AI that mostly sorted or scored existing things (like a recommendation engine or a fraud detector).

Model

The actual trained AI system that takes your input and produces an output. “Model” refers to the specific brain you’re talking to (like a particular version of GPT or Claude), as opposed to the app or interface wrapped around it. One product often offers several models — a fast, cheap one and a slower, smarter one — and which you pick affects quality, speed, and cost.

Neural network

The underlying structure most modern AI is built on, loosely inspired by how brain neurons connect. It’s a web of interconnected nodes that pass signals along, adjusting their connections during training. You don’t need the internals — just know it’s the architecture under the hood.

Deep learning

Machine learning using neural networks with many layers (“deep” networks). It’s what made the recent leap in AI capability possible and underpins virtually all the tools you’ll use.

How models read and respond

This cluster explains the mechanics of a single AI interaction.

Token

The small chunk of text a model reads and writes: often a whole word, sometimes part of a word or a punctuation mark. Models process everything as tokens, and usage and limits are usually measured in them. As a rough rule, one token is about three-quarters of an English word, so a 1,000-word document comes to roughly 1,300 tokens. When a tool talks about pricing or limits “per token,” this is what it means.
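If you like to see the arithmetic, the sketch below turns that rule of thumb into a quick estimate. The 0.75 ratio is only the approximation mentioned above; real tokenizers vary by model and language, and the function name is made up for this example.

```python
WORDS_PER_TOKEN = 0.75  # rule of thumb for English; real tokenizers differ


def estimate_tokens(text: str) -> int:
    """Roughly estimate how many tokens a piece of English text uses."""
    word_count = len(text.split())
    return round(word_count / WORDS_PER_TOKEN)


document = "word " * 1000  # stand-in for a 1,000-word document
print(estimate_tokens(document))  # 1333
```

Treat the result as a ballpark for budgeting, not a number to bill against.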

Prompt

The input you give an AI — your instructions, question, and context. Writing good prompts is the main skill for getting good results. Our prompt writing basics guide covers how to do it well.

Context window

The maximum amount of text (measured in tokens) a model can consider at once, including your prompt and its own response. Think of it as the model’s short-term memory. Exceed it and earlier parts of the conversation start to drop out of view, which is why a very long chat can make the AI seem to “forget” something you said near the start. Modern models have steadily larger windows, but the limit never fully disappears.
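One simple way an app might handle a full window is to keep only the most recent messages that fit. The sketch below assumes that approach and counts one word as one token to keep things readable; real tools count actual tokens and may summarize older messages instead of dropping them.

```python
def fit_to_window(messages: list, max_tokens: int) -> list:
    """Keep the most recent messages that fit; older ones drop out of view."""
    kept = []
    used = 0
    for message in reversed(messages):   # walk from newest to oldest
        cost = len(message.split())      # crude stand-in: 1 word = 1 token
        if used + cost > max_tokens:
            break
        kept.append(message)
        used += cost
    return list(reversed(kept))          # restore chronological order


chat = ["My name is Sam.", "I like hiking.", "What should I pack for a day trip?"]
print(fit_to_window(chat, max_tokens=12))
# ['I like hiking.', 'What should I pack for a day trip?']
```

The first message no longer fits, so in this toy version the assistant would have “forgotten” Sam’s name.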

Inference

The act of running a trained model to generate an output — what happens every time you send a prompt and get an answer. Distinct from training, which is how the model was built in the first place.

Temperature

A setting that controls how random or creative a model’s output is. Low temperature gives focused, predictable answers, which suits factual or technical tasks. Higher temperature gives more varied, surprising ones, which suits brainstorming or creative writing. You’ll meet this dial mostly in APIs and advanced settings, not in the basic chat interface, but it helps explain why the same prompt can produce different answers each time: at any temperature above zero, the model chooses its next words with some randomness.
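Under the hood, temperature is usually applied by dividing the model’s raw scores for each candidate word before turning them into probabilities. The sketch below shows that standard calculation on three made-up scores; a real model does the same thing over its entire vocabulary. It assumes a temperature above zero, since dividing by zero would fail.

```python
import math


def apply_temperature(scores: dict, temperature: float) -> dict:
    """Turn raw scores into probabilities, sharpened or flattened by temperature."""
    scaled = {word: score / temperature for word, score in scores.items()}
    highest = max(scaled.values())  # subtracting the max keeps exp() stable
    exps = {word: math.exp(value - highest) for word, value in scaled.items()}
    total = sum(exps.values())
    return {word: value / total for word, value in exps.items()}


scores = {"blue": 2.0, "grey": 1.0, "purple": 0.0}  # made-up scores for the next word
low = apply_temperature(scores, temperature=0.5)
high = apply_temperature(scores, temperature=2.0)
print(round(low["blue"], 2), round(high["blue"], 2))  # low temperature favors "blue" far more
```

At low temperature the top choice takes most of the probability; at high temperature the options sit closer together, so unusual picks become more likely.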

Hallucination

When an AI confidently states something false — an invented fact, a fake citation, a made-up statistic. It’s a fundamental side effect of how models generate plausible-sounding text rather than retrieving verified facts; the model has no built-in sense of true versus false. It isn’t lying, exactly — it’s filling in a likely-looking answer. Always verify anything that matters, especially names, numbers, dates, and sources.

Prompt engineering

The practice of crafting and refining prompts to get better, more reliable results from AI. Less mysterious than it sounds — it’s mostly clear instructions, good context, and useful examples.

System prompt

A behind-the-scenes instruction that sets a model’s overall behavior, role, and rules before your conversation starts. It’s why a customer-service bot stays on-topic and a coding assistant defaults to writing code.

[Illustration: text being broken into small token blocks and fed into a model]

Training and building models

How models come to exist and get specialized.

Training

The process of building a model by feeding it data so it learns patterns. It’s computationally expensive and happens before you ever use the model. The result is a finished model ready for inference.

Pretraining

The first, largest training stage, where a model learns general language patterns from a vast, broad dataset. This is where most of a model’s raw knowledge comes from.

Fine-tuning

Additional training on a smaller, focused dataset to specialize a model for a particular task, domain, or style. It’s how a general model becomes good at, say, medical text or a company’s specific tone. Fine-tuning is more involved than prompting, so most everyday users never need it — clear instructions usually get the job done.

Parameters

The internal values a model adjusts during training to capture patterns — often counted in the billions. More parameters can mean more capability, but bigger isn’t automatically better. See what AI parameters are for the full picture.

Training data

The collection of text, images, or other content a model learns from. Its quality, breadth, and biases directly shape what the model knows and how it behaves.

Training cutoff

The date after which a model has no built-in knowledge, because its training data stops there. It’s why a model may not know about recent events unless it can search the web live.

Alignment

The work of making a model’s behavior match human intentions and values — helpful, honest, and harmless. Includes techniques to reduce harmful, biased, or untruthful outputs, and to make a model decline genuinely dangerous requests. Alignment is a major focus of AI safety research, since a capable model that behaves unpredictably is far less useful and far riskier.

RLHF (Reinforcement Learning from Human Feedback)

A training method where humans rate model outputs, and those ratings teach the model to produce more helpful, appropriate responses. A major reason modern assistants feel cooperative and well-mannered.

Open-weight model

A model whose trained parameters are released publicly, so anyone can download, run, and adapt it on their own hardware. Contrasts with closed models accessed only through a company’s service. Open-weight models offer more control and privacy but require more setup; closed models are easier to use but you depend on the provider. See open vs closed AI models for the trade-offs.

Small language model (SLM)

A more compact language model that’s faster, cheaper, and able to run on modest hardware — sometimes even your own phone or laptop — while being good enough for many tasks. Not every job needs a giant frontier model; for routine classification, summarizing, or simple drafting, a small model can be the smarter, more private, and far cheaper choice.

Capabilities and behaviors

Terms describing what models can do and how they act.

Multimodal

Able to work with more than one type of input or output — text plus images, audio, or video. A multimodal model can describe a photo you upload, read a chart, transcribe a voice note, or generate an image from a description. Most leading assistants are now multimodal to some degree, which is why you can drop a screenshot into a chat and ask about it.

Reasoning model

A model designed to “think” through problems step by step before answering, often spending more time (and computing power) on hard questions. Useful for math, logic, coding, and complex multi-step tasks where a quick guess won’t do. The trade-off is speed: reasoning models are slower and pricier, so they’re overkill for simple requests like rewriting an email.

Chain-of-thought

A prompting and reasoning technique where the model works through a problem in intermediate steps rather than jumping straight to an answer. It tends to improve accuracy on complex tasks.

Zero-shot / few-shot

Ways of prompting. Zero-shot means asking the model to do a task with no examples; few-shot means including a handful of examples to show it what you want. Few-shot often improves results noticeably.
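To make the difference concrete, here is a small sketch that builds both kinds of prompt as plain text. The review-sentiment task, the labels, and the function name are all invented for illustration; the only real difference between the two prompts is whether worked examples are included.

```python
def build_prompt(task: str, examples: list, query: str) -> str:
    """Assemble a prompt; an empty examples list makes it zero-shot."""
    lines = [task, ""]
    for text, label in examples:
        lines += [f"Review: {text}", f"Sentiment: {label}", ""]
    lines += [f"Review: {query}", "Sentiment:"]
    return "\n".join(lines)


task = "Classify the sentiment of each review as positive or negative."
zero_shot = build_prompt(task, [], "The battery died in a day.")
few_shot = build_prompt(
    task,
    [("Loved it, works perfectly.", "positive"), ("Broke after a week.", "negative")],
    "The battery died in a day.",
)
print(few_shot)
```

Both prompts end at “Sentiment:”, leaving the model to fill in the answer. The few-shot version simply shows it the pattern twice first.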

Embedding

A way of turning text (or images) into a list of numbers that captures meaning, so that similar things end up close together numerically. “King” and “queen” would sit near each other; “king” and “bicycle” would be far apart. Embeddings power semantic search, recommendations, and matching by meaning rather than exact words — and they’re a key ingredient in RAG systems.
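“Close together numerically” is commonly measured with cosine similarity, which scores how closely two lists of numbers point in the same direction. The sketch below uses made-up three-number embeddings purely for illustration; real embeddings come from a model and have hundreds or thousands of numbers.

```python
import math


def cosine_similarity(a: list, b: list) -> float:
    """Score how closely two vectors point the same way (1.0 = identical direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Invented toy embeddings, not output from any real model.
vectors = {
    "king": [0.9, 0.8, 0.1],
    "queen": [0.85, 0.75, 0.2],
    "bicycle": [0.1, 0.05, 0.95],
}

print(cosine_similarity(vectors["king"], vectors["queen"]))    # close to 1
print(cosine_similarity(vectors["king"], vectors["bicycle"]))  # much lower
```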

Fine-tune vs. prompt

A common practical distinction: you can either fine-tune a model (retrain it on your data) or simply prompt it well (give it instructions and context). For most users, smart prompting handles the job without any training.

Connecting AI to the world

How models reach beyond a single chat box.

API (Application Programming Interface)

A standard way for software to talk to other software, like a waiter taking your order to the kitchen and bringing back the dish. AI APIs let developers plug a model into their own apps and workflows, so a feature in some app you use might quietly be powered by an AI model behind the scenes. Our APIs explained simply guide breaks it down without code.

API key

A secret code that identifies and authorizes you when using an API — like a password for software access. Keep it private; anyone with your key can use (and bill) your account.
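A common way to keep a key private is to store it in an environment variable rather than typing it into your code. The sketch below assumes that approach; the variable name EXAMPLE_AI_API_KEY and both function names are placeholders, not anything a particular provider requires.

```python
import os


def load_api_key(env_var: str = "EXAMPLE_AI_API_KEY") -> str:
    """Read the key from the environment so it never sits in source code."""
    key = os.environ.get(env_var, "")
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable first.")
    return key


def mask(key: str) -> str:
    """Show only the last four characters, for example in logs."""
    return "*" * max(len(key) - 4, 0) + key[-4:]


print(mask("sk-demo-1234"))  # ********1234
```

Masking matters because keys tend to leak through logs, screenshots, and shared code more often than through anything dramatic.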

RAG (Retrieval-Augmented Generation)

A technique where the AI first retrieves relevant information from a trusted source — your documents, a database, the web — and then generates an answer using it. This grounds responses in real data and cuts down on hallucinations. It’s the engine behind “chat with your documents” features and AI search tools that cite their sources.
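The retrieve-then-generate flow can be sketched in a few lines. This toy version ranks documents by how many words they share with the question, which is a deliberate simplification: real RAG systems usually search by meaning using embeddings. The documents and function names are invented, and the final step of sending the prompt to a model is left out.

```python
import re


def words(text: str) -> set:
    """Lowercase the text and split it into a set of plain words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question: str, documents: list, top_k: int = 1) -> list:
    """Rank documents by how many words they share with the question."""
    question_words = words(question)
    ranked = sorted(
        documents,
        key=lambda doc: len(question_words & words(doc)),
        reverse=True,
    )
    return ranked[:top_k]


def build_grounded_prompt(question: str, documents: list) -> str:
    """Put the retrieved text in front of the question for the model to use."""
    context = "\n".join(f"- {source}" for source in retrieve(question, documents))
    return f"Answer using only these sources:\n{context}\n\nQuestion: {question}"


docs = [
    "Refunds are available within 30 days of purchase.",
    "Our office is closed on public holidays.",
    "Shipping takes three to five business days.",
]
print(build_grounded_prompt("Can I get refunds after 20 days?", docs))
```

Because the model is handed the refund policy alongside the question, it has something real to quote instead of guessing.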

Fine-grained access / connector

A bridge that lets an AI assistant securely reach into your apps and data to read or act. Connectors are how an assistant can, say, pull from your calendar or update a document.

MCP (Model Context Protocol)

An open standard, often described as “USB-C for AI,” that gives models a consistent way to plug into tools and data sources. Before standards like this, every connection between an AI and an app had to be built from scratch; MCP lets one assistant connect to many systems through a shared interface. See our MCP explained guide for more.

Webhook

An automated message one app sends another the moment something happens — a way to trigger an action in real time. Common in automations that respond instantly to events.

AI agent

An AI system that doesn’t just answer but takes actions to accomplish a goal — using tools, making decisions, and working through multiple steps with some autonomy. For example, instead of telling you how to book a meeting, an agent might check your calendar, find a slot, and send the invite. Distinct from a plain chatbot, which only talks. “Agentic” is the adjective you’ll see for systems that work this way.
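Here is a heavily simplified sketch of the meeting-booking example. The three “tools” are stand-in functions with made-up data, and the order of steps is hard-coded; in a real agent, the model itself decides which tool to call next based on what it has learned so far.

```python
from typing import Optional


def check_calendar() -> list:
    """Hypothetical tool: hours (24-hour clock) that are already booked."""
    return [9, 10, 13]


def find_slot(busy: list, start: int = 9, end: int = 17) -> Optional[int]:
    """Hypothetical tool: first free hour in the working day, if any."""
    for hour in range(start, end):
        if hour not in busy:
            return hour
    return None


def send_invite(hour: int) -> str:
    """Hypothetical tool: pretend to send the invite and report back."""
    return f"Invite sent for {hour}:00"


def run_agent() -> list:
    """Work through the goal step by step, logging each action taken."""
    log = []
    busy = check_calendar()
    log.append(f"Checked calendar: busy at {busy}")
    slot = find_slot(busy)
    if slot is None:
        log.append("No free slot found; asking the user what to do")
        return log
    log.append(f"Found free slot at {slot}:00")
    log.append(send_invite(slot))
    return log


for step in run_agent():
    print(step)
```

The point is the shape of the loop: look something up, decide, act, and report, rather than only replying with advice.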

Chatbot

A conversational AI that responds to messages but doesn’t take independent action in the world. The difference between a chatbot and an agent is whether it can do things, not just say things.

Token limit / rate limit

Caps on how much you can use a model. Token limits restrict how much text fits in one request, while rate limits restrict how many requests you can make in a given period. You’ll hit these mostly on free tiers and APIs.
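As an illustration of the rate-limit half, the sketch below allows a fixed number of requests in any rolling time window. This is one common way to implement such a cap, not a description of how any particular provider does it, and the class name and numbers are made up.

```python
from collections import deque


class RateLimiter:
    """Allow at most `max_requests` in any `window_seconds` span."""

    def __init__(self, max_requests: int, window_seconds: float) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.timestamps = deque()  # times of recently allowed requests

    def allow(self, now: float) -> bool:
        """Return True if a request at time `now` (in seconds) is permitted."""
        # Forget requests that have aged out of the window.
        while self.timestamps and now - self.timestamps[0] >= self.window_seconds:
            self.timestamps.popleft()
        if len(self.timestamps) >= self.max_requests:
            return False
        self.timestamps.append(now)
        return True


limiter = RateLimiter(max_requests=2, window_seconds=60)
print([limiter.allow(t) for t in (0, 10, 20, 61)])  # [True, True, False, True]
```

The third request is refused because two were already made in the past minute; by the 61-second mark the first has aged out, so the fourth goes through.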

How to use this glossary

You don’t need to memorize all 40 terms. The handful that come up constantly — model, token, prompt, context window, hallucination, and agent — are worth knowing cold, because almost every AI conversation touches them. The rest you can look up as needed, and they’ll stick faster once you’ve seen them in context.

A useful pattern: when you meet an unfamiliar AI term in the wild, find it here, then immediately use it once in a sentence of your own. “The model hit its context window, so it forgot the start of our chat.” Using a word is how it moves from a definition you read to vocabulary you own.

Notice, too, how interconnected these terms are. Tokens explain context windows. Training explains the cutoff and hallucinations. APIs, keys, connectors, and MCP all describe ways of plugging models into your tools. Once you see the connections, the jargon stops being a pile of buzzwords and becomes a coherent map of how AI actually works.

If you want a rough priority order, here’s how the terms tend to matter in practice. Tier one — model, token, prompt, context window, hallucination, agent — comes up in nearly every AI conversation. Tier two — training, fine-tuning, multimodal, API, RAG, parameters — shows up when you read about how tools are built or compared. Tier three — embeddings, RLHF, chain-of-thought, system prompt — is mostly for when you want to go a level deeper or work more technically. Learn them outward from tier one and you’ll never feel lost, even as new buzzwords arrive.

One more habit worth building: when a vendor or article uses a term to impress rather than inform, you’ll now be able to tell. A lot of AI marketing leans on jargon to sound advanced. Knowing what the words actually mean lets you ask the only question that counts — “okay, but what does it let me do?” — and judge the answer on its merits.

The takeaway

The vocabulary of AI sounds more intimidating than the ideas behind it. A model predicts text from patterns it learned in training; it reads in tokens; it remembers only what fits in its context window; it sometimes makes things up; and it can be connected to your tools through APIs and connectors. Nearly every term in this glossary is a detail hanging off one of those simple truths.

Keep this page handy. Next time a headline or settings screen throws a term at you, you’ll have a plain-English definition a click away, and the jargon will lose its power to make AI feel harder than it is.

To go deeper on the foundations, start with our beginner’s guide to AI. And for a calm, occasional roundup of what’s worth knowing, join the Internet 101 newsletter.