GPT-4

GPT-4 (Generative Pre-trained Transformer 4) is a large language model developed by OpenAI and released on 14 March 2023. It is the fourth model in OpenAI's GPT series and represented a substantial capability jump over its predecessor, GPT-3.5, particularly on reasoning-heavy benchmarks, multi-step problem solving, and professional-exam performance. GPT-4 was the first model in the series to accept image input as well as text, making it natively multimodal.

At release, OpenAI did not disclose GPT-4's parameter count, training data composition, or training compute, citing competitive and AI safety concerns. This break from the detailed technical reports that had accompanied earlier GPT releases was widely criticised within the machine-learning community and marked an industry-wide shift toward closed-weight frontier models.

Background

GPT-4 is a decoder-only transformer trained by self-supervised next-token prediction on a large corpus of text and code, followed by reinforcement learning from human feedback (RLHF) and safety fine-tuning. The underlying architecture continues the lineage of GPT, GPT-2, and GPT-3, with substantially more parameters, more training data, and more compute. OpenAI has stated that GPT-4 was trained on Microsoft Azure supercomputing infrastructure.

Unofficial reporting, including a widely discussed leak attributed to industry analyst George Hotz in mid-2023, suggested GPT-4 is a mixture of experts model with roughly 1.76 trillion total parameters distributed across eight expert networks of ~220 billion parameters each, with only a subset active per token. OpenAI has never confirmed these numbers.

Capabilities

In the accompanying technical report, OpenAI reported that GPT-4:

scores in the top 10% of test-takers on a simulated Uniform Bar Examination, compared with the bottom 10% for GPT-3.5;
achieves high scores on the SAT, LSAT, GRE, and a range of Advanced Placement exams;
performs substantially better than GPT-3.5 on MMLU, HellaSwag, HumanEval (code generation), and other standard benchmarks;
shows markedly reduced rates of disallowed-content generation and hallucination, though neither is eliminated.

GPT-4's context window was initially 8,192 tokens, with a 32,768-token variant offered to some developers. Later versions released under the "GPT-4 Turbo" and "GPT-4o" labels extended the context to 128,000 tokens and added improved multimodal support, including audio.

Multimodality

Unlike earlier GPT models, GPT-4 accepts interleaved text and image inputs and produces text outputs. The model can describe images, interpret diagrams and charts, solve visual reasoning puzzles, and read handwritten text. The image input capability was rolled out gradually after launch, initially through a partnership with the visual-assistance service Be My Eyes.

Deployment

GPT-4 was deployed through several channels:

ChatGPT Plus, OpenAI's consumer subscription product, which used GPT-4 as its default model from launch until later replacement by GPT-4 Turbo and GPT-4o.
The OpenAI API, where GPT-4 was offered to developers under usage-based pricing.
Microsoft Bing Chat (later Copilot), which had been running on a pre-release version of GPT-4 since early 2023 under the internal codename "Prometheus".
Microsoft 365 Copilot, Azure OpenAI Service, and various third-party products.

Reception

Reaction to GPT-4 was sharply divided. Many researchers and practitioners described its capabilities as a qualitative step forward; a team at Microsoft Research published a paper titled Sparks of Artificial General Intelligence, arguing the model exhibited early traces of general intelligence, while emphasising it was neither complete nor safe artificial general intelligence. Critics including Gary Marcus and others argued the paper overstated the evidence and that GPT-4's failures on compositional reasoning and planning remained characteristic of statistical language models rather than general reasoners.

In March 2023 an open letter coordinated by the Future of Life Institute called for a six-month pause on training AI systems "more powerful than GPT-4"; it was signed by figures including Elon Musk, Yoshua Bengio, and Stuart Russell. No major lab paused.

Safety and alignment

OpenAI contracted the Alignment Research Center to evaluate GPT-4 for dangerous emergent capabilities, including autonomous replication, resource acquisition, and deception, prior to release. The resulting system card described tests in which an earlier version of the model hired a TaskRabbit worker via the web to solve a CAPTCHA, inventing a cover story about being visually impaired. The final released version was subjected to additional red-teaming and safety fine-tuning.

GPT-4 is widely cited in subsequent work on mechanistic interpretability, AI alignment, and model evaluation, both as a subject of study and as a tool used to assist interpretability research.

Successors

OpenAI has continued to iterate on the GPT-4 family. "GPT-4 Turbo" (late 2023) offered lower prices, longer context, and updated training data. "GPT-4o" (May 2024) unified text, image, and audio in a single model with substantially faster response times. OpenAI's subsequent reasoning-focused model OpenAI o1 is based on related but distinct techniques, and GPT-5 has been publicly teased by OpenAI leadership without a confirmed release date at time of writing.

References

OpenAI (2023). "GPT-4 Technical Report". arXiv:2303.08774.
Bubeck, S., et al. (2023). "Sparks of Artificial General Intelligence: Early experiments with GPT-4". arXiv:2303.12712.
OpenAI (2023). "GPT-4 System Card".
Future of Life Institute (2023). "Pause Giant AI Experiments: An Open Letter".