Three Years of AI: Starting From ChatGPT

2026-06-14 · AI · Industry · 中文

Three Years of AI: Starting From ChatGPT

On November 30, 2022, OpenAI quietly put up a webpage called ChatGPT. Hardly anyone realized an era had begun that day. This note walks through these years in order — which concepts, which companies, which products emerged — to give you a map.

The start: late 2022, ChatGPT ignites

2022-11-30 ChatGPT (based on GPT-3.5) launches: 1M users in 5 days, 100M in 2 months — the fastest ever.
It didn't come from nowhere: Google's 2017 Transformer paper Attention is All You Need is the bedrock, and GPT-3 (2020) already amazed insiders. ChatGPT's breakthrough was "a usable chat interface + RLHF tuning."
Starting concepts: large language model (LLM), RLHF (reinforcement learning from human feedback), prompt.

2023: the war of a hundred models + the first agent fantasy

The big players all entered:

OpenAI: GPT-4 (March, stronger + multimodal), plugins and Code Interpreter, and GPTs + the GPT Store at year-end.
Microsoft: put GPT-4 into Bing and Office, played the Copilot card.
Google: scrambled out Bard, then consolidated into Gemini by year-end.
Anthropic: released Claude / Claude 2, leading on long context and safety.
Meta: after LLaMA leaked, open-sourced Llama 2, igniting the open-model ecosystem.

A flood of concepts: prompt engineering, RAG (retrieval-augmented generation), hallucination, context window, vector databases, fine-tuning / LoRA.

Tools and patterns: LangChain (chaining LLMs into apps), vector stores (Pinecone / Chroma), AutoGPT / BabyAGI (the first "autonomous agent" craze, mostly toys). Next door in images: Stable Diffusion (open source), Midjourney, DALL-E set off AI art.

2024: multimodal + reasoning models + agents get serious

Models kept racing:

OpenAI: GPT-4o (omni multimodal, voice / image), and o1 at year-end (a reasoning model that "thinks before answering").
Anthropic: the Claude 3 family (Haiku / Sonnet / Opus) through Claude 3.5 Sonnet, plus the debut of "computer use."
Google: Gemini 1.5, pushing the context window to the million-token range.
Meta: Llama 3 / 3.1 (the 405B open model).
New players: Mistral (France, open source), xAI Grok.

New concepts: multimodal, MoE (mixture of experts), reasoning models / test-time compute (trade more thinking for more accuracy), function calling / tool use, MCP (the tool-connection standard Anthropic open-sourced in November).

Agents went from toys to usable: Devin ("the first AI software engineer"), the rise of Cursor (the AI code editor), and GitHub Copilot's evolution; Perplexity (AI search), NotebookLM; Sora / Runway / Pika (AI video).

2025: the year of the agent + China enters + costs collapse

The DeepSeek moment: China's DeepSeek R1 (an open reasoning model) closed in on top closed models at a tiny fraction of the cost, shaking global markets and pushing "reasoning + open + cheap" to the front.
The model landscape: Anthropic Claude 4 family + Claude Code (an agentic coding tool in the terminal); OpenAI's o-series reasoning models kept iterating; Google Gemini 2.x got more agentic; China in full bloom — Qwen (Alibaba), DeepSeek, Kimi (Moonshot), GLM (Zhipu), Doubao (ByteDance), Ernie (Baidu).
There was only one theme word: Agent. People were no longer satisfied with chatting — they wanted AI that "gets work done on its own."
Concepts leveled up again: agentic AI, MCP as the de facto standard, A2A (agent-to-agent), context engineering / harness engineering, Skills, vibe coding (Karpathy's coined term: talk to the AI and software gets written).

One thread, in a diagram

2017     Transformer paper            (bedrock)
2020     GPT-3
2022.11  ChatGPT launches            <- the start
2023     GPT-4 / Claude / Llama open source / RAG / LangChain / AutoGPT
2024     GPT-4o / o1 reasoning / Claude 3.5 + computer use / MCP / Devin / Cursor
2025     DeepSeek R1 / Claude 4 + Claude Code / year of the agent / China / vibe coding

Players at a glance

Camp	Companies	Flagship products / models
Closed-source first tier	OpenAI	ChatGPT, GPT-4 / 4o, o-series, Sora
	Anthropic	Claude family, Claude Code, MCP
	Google	Gemini, NotebookLM
Giant-bound	Microsoft	Copilot (tied to OpenAI)
Open-source camp	Meta / Mistral / DeepSeek / Alibaba	Llama, Mistral, DeepSeek, Qwen
China	DeepSeek / Alibaba / Moonshot / Zhipu / ByteDance / Baidu	DeepSeek, Qwen, Kimi, GLM, Doubao, Ernie
Star startups	-	Cursor (coding), Devin (coding agent), Perplexity (search), Midjourney (art), Runway (video)

Three main lines

Models: from "bigger and stronger" to "reasoning + multimodal + cheaper."
Form: from a chat box, to tool calls, to autonomous agents.
Camps: a long tug-of-war between closed source (OpenAI / Anthropic / Google) and open source (Meta / Mistral / DeepSeek / Qwen).

The most notable shift: once model capability converges, competition moves outward — to the agent / harness / workflow layer. Whoever makes the same model "work more smoothly" wins — which is exactly what this site (Superpowers Skills) focuses on.

A closing note

This map is necessarily incomplete, and a few years from now it will surely need a lot of additions. But the main line is clear: we're moving from "AI that chats" to "AI that does the work." To go deeper along this line, see the site's "Concepts" section for keywords like Agent, MCP, Harness, agent loops, and harness engineering, then the "Core Superpowers Skills" section for how to actually drive a coding agent.