Three Years of AI: Starting From ChatGPT
On November 30, 2022, OpenAI quietly put up a webpage called ChatGPT. Hardly anyone realized an era had begun that day. This note walks through these years in order — which concepts, which companies, which products emerged — to give you a map.
The start: late 2022, ChatGPT ignites
- 2022-11-30 ChatGPT (based on GPT-3.5) launches: 1M users in 5 days and an estimated 100M monthly users within about 2 months, the fastest consumer-app growth recorded up to that point.
- It didn't come from nowhere: Google's 2017 Transformer paper Attention is All You Need is the bedrock, and GPT-3 (2020) already amazed insiders. ChatGPT's breakthrough was "a usable chat interface + RLHF tuning."
- Starting concepts: large language model (LLM), RLHF (reinforcement learning from human feedback), prompt.
2023: the war of a hundred models + the first agent fantasy
The big players all entered:
- OpenAI: GPT-4 (March, stronger + multimodal), plugins and Code Interpreter, then GPTs announced in November (the GPT Store itself opened in January 2024).
- Microsoft: put GPT-4 into Bing and Office, played the Copilot card.
- Google: scrambled out Bard, then launched the Gemini models in December (Bard itself was renamed Gemini in early 2024).
- Anthropic: released Claude / Claude 2, leading on long context and safety.
- Meta: after LLaMA leaked, open-sourced Llama 2, igniting the open-model ecosystem.
A flood of concepts: prompt engineering, RAG (retrieval-augmented generation), hallucination, context window, vector databases, fine-tuning / LoRA.
Tools and patterns: LangChain (chaining LLMs into apps), vector stores (Pinecone / Chroma), AutoGPT / BabyAGI (the first "autonomous agent" craze, mostly toys). Next door in images: Stable Diffusion (open source), Midjourney, DALL-E set off AI art.
2024: multimodal + reasoning models + agents get serious
Models kept racing:
- OpenAI: GPT-4o (omni multimodal, voice / image), and o1 (preview in September, full release in December), a reasoning model that "thinks before answering."
- Anthropic: the Claude 3 family (Haiku / Sonnet / Opus) through Claude 3.5 Sonnet, plus the debut of "computer use."
- Google: Gemini 1.5, pushing the context window to the million-token range.
- Meta: Llama 3 / 3.1 (the 405B open model).
- New players: Mistral (France, open source), xAI Grok.
New concepts: multimodal, MoE (mixture of experts), reasoning models / test-time compute (trade more thinking for more accuracy), function calling / tool use, MCP (the tool-connection standard Anthropic open-sourced in November).
Agents went from toys to usable: Devin ("the first AI software engineer"), the rise of Cursor (the AI code editor), and GitHub Copilot's evolution; Perplexity (AI search), NotebookLM; Sora / Runway / Pika (AI video).
2025: the year of the agent + China enters + costs collapse
- The DeepSeek moment: in January, China's DeepSeek R1 (an open reasoning model) closed in on top closed models at a reported fraction of the cost, shaking global markets and pushing "reasoning + open + cheap" to the front.
- The model landscape: Anthropic Claude 4 family + Claude Code (an agentic coding tool in the terminal); OpenAI's o-series reasoning models kept iterating; Google Gemini 2.x got more agentic; China in full bloom — Qwen (Alibaba), DeepSeek, Kimi (Moonshot), GLM (Zhipu), Doubao (ByteDance), Ernie (Baidu).
- The year had a single keyword: Agent. People were no longer satisfied with chatting; they wanted AI that "gets work done on its own."
- Concepts leveled up again: agentic AI, MCP as the de facto standard, A2A (agent-to-agent), context engineering / harness engineering, Skills, vibe coding (a term coined by Karpathy: talk to the AI and software gets written).
One thread, in a diagram
2017 Transformer paper (bedrock)
2020 GPT-3
2022.11 ChatGPT launches <- the start
2023 GPT-4 / Claude / Llama open source / RAG / LangChain / AutoGPT
2024 GPT-4o / o1 reasoning / Claude 3.5 + computer use / MCP / Devin / Cursor
2025 DeepSeek R1 / Claude 4 + Claude Code / year of the agent / China / vibe coding
Players at a glance
| Camp | Companies | Flagship products / models |
|---|---|---|
| Closed-source first tier | OpenAI | ChatGPT, GPT-4 / 4o, o-series, Sora |
| | Anthropic | Claude family, Claude Code, MCP |
| | Google | Gemini, NotebookLM |
| Big-tech partner | Microsoft | Copilot (built on OpenAI models) |
| Open-source camp | Meta / Mistral / DeepSeek / Alibaba | Llama, Mistral, DeepSeek, Qwen |
| China | DeepSeek / Alibaba / Moonshot / Zhipu / ByteDance / Baidu | DeepSeek, Qwen, Kimi, GLM, Doubao, Ernie |
| Star startups | - | Cursor (coding), Devin (coding agent), Perplexity (search), Midjourney (art), Runway (video) |
Three main lines
- Models: from "bigger and stronger" to "reasoning + multimodal + cheaper."
- Form: from a chat box, to tool calls, to autonomous agents.
- Camps: a long tug-of-war between closed source (OpenAI / Anthropic / Google) and open source (Meta / Mistral / DeepSeek / Qwen).
The most notable shift: once model capability converges, competition moves outward — to the agent / harness / workflow layer. Whoever makes the same model "work more smoothly" wins — which is exactly what this site (superpowers and skills) focuses on.
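To make the "chat box, to tool calls, to autonomous agents" progression concrete, here is a minimal sketch of the loop a harness runs around a model. Everything in it is an assumption for illustration: `fake_model` is a hard-coded stand-in for a real LLM call, and the `TOOLS` registry, the `tool:` / `final:` message format, and `run_agent` are hypothetical names, not any vendor's API.

```python
from typing import Callable, Dict, List

# Hypothetical tool registry: one toy tool that adds integers written as "2+3".
TOOLS: Dict[str, Callable[[str], str]] = {
    "add": lambda arg: str(sum(int(x) for x in arg.split("+"))),
}


def fake_model(history: List[str]) -> str:
    """Stand-in for an LLM call (an assumption, not a real API).

    Requests a tool on the first turn, then answers once a tool
    result appears in the conversation history.
    """
    for line in reversed(history):
        if line.startswith("tool_result:"):
            return "final: " + line.split(":", 1)[1].strip()
    return "tool: add 2+3"


def run_agent(task: str, max_steps: int = 5) -> str:
    """Loop: ask the model, run any tool it requests, feed the result back."""
    history = [f"user: {task}"]
    for _ in range(max_steps):
        reply = fake_model(history)
        history.append(reply)
        if reply.startswith("final:"):
            # The model is done; return its answer.
            return reply.split(":", 1)[1].strip()
        # Otherwise the reply has the form "tool: <name> <argument>".
        _, name, arg = reply.split(maxsplit=2)
        history.append(f"tool_result: {TOOLS[name](arg)}")
    return "gave up"  # step budget exhausted without a final answer


print(run_agent("what is 2+3?"))
```

A plain chat box is the first model call on its own; tool use adds the branch that runs `TOOLS[name]`; an agent is that same branch repeated inside a loop with a step budget. Much of what the harness layer competes on lives around this loop: which tools are exposed, what goes into `history`, and when to stop.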
A closing note
This map is necessarily incomplete, and a few years from now it will surely need a lot of additions. But the main line is clear: we're moving from "AI that chats" to "AI that does the work." To go deeper along this line, see the site's "Concepts" section for keywords like Agent, MCP, Harness, agent loops, and harness engineering, then the "superpowers Core Skills" for how to actually drive a coding agent.