Ilya Sutskever

Ilya Sutskever (born 1985/86) is a Russian-born Canadian–Israeli computer scientist and artificial intelligence researcher. He is a co-founder and former chief scientist of OpenAI, and co-founder and chief scientist of Safe Superintelligence Inc. (SSI). He is widely regarded as one of the most influential figures in the development of modern deep learning and large language models.

Sutskever's research contributions include the AlexNet convolutional neural network (with Alex Krizhevsky and Geoffrey Hinton), which triggered the deep learning revolution in 2012, and foundational work on sequence-to-sequence learning that underpinned modern neural machine translation. At OpenAI, he was a driving force behind the research programme that produced the GPT-3 and GPT-4 language models.

Early life and education

Ilya Sutskever was born in Gorky (now Nizhny Novgorod), in the Russian Soviet Federative Socialist Republic. His family emigrated to Israel when he was a child, and he spent part of his youth in Jerusalem. He later moved to Canada for his university education.^[1]

Sutskever studied at the University of Toronto, earning his Bachelor of Science, Master of Science, and Doctor of Philosophy degrees in computer science. His doctoral research was supervised by Geoffrey Hinton, one of the pioneers of deep learning and a recipient of the 2018 Turing Award. During his doctoral work, Sutskever focused on training methods for recurrent neural networks and deep neural networks.^[2]

Research career

AlexNet (2012)

In 2012, Sutskever, together with Alex Krizhevsky and Geoffrey Hinton, developed AlexNet, a deep convolutional neural network that won the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) by a wide margin. AlexNet achieved a top-5 error rate of 15.3%, compared to 26.2% for the second-place entry, demonstrating that deep neural networks trained on GPUs could dramatically outperform traditional computer vision methods.^[3]

The AlexNet result is widely considered a watershed moment in artificial intelligence. It demonstrated the practical viability of deep learning at scale and sparked a wave of investment and research that transformed computer vision, natural language processing, and the broader AI field. The original paper has been cited over 150,000 times, making it one of the most-cited works in computer science.

Sequence-to-sequence learning (2014)

In 2014, Sutskever, together with Oriol Vinyals and Quoc V. Le at Google, published a seminal paper on sequence-to-sequence learning using neural networks. The approach used two recurrent neural networks (an encoder and a decoder) to map variable-length input sequences to variable-length output sequences, achieving near state-of-the-art results on English-to-French machine translation.^[4]

This work laid the groundwork for the encoder–decoder architectures that would become central to neural machine translation and, ultimately, the Transformer architecture introduced in 2017. The sequence-to-sequence paradigm also influenced the design of generative language models.
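The encoder and decoder arrangement described in this section can be illustrated with a minimal sketch. It is not the authors' implementation: the published system used deep LSTM networks trained on large translation corpora, whereas this toy uses a plain recurrent cell with untrained random weights. The vocabulary size, hidden size, end-of-sequence token, and all function names are hypothetical choices made for illustration. The point is only the data flow: the encoder folds an input of any length into a fixed-size state, and the decoder unrolls that state into an output whose length it decides itself.

```python
import math
import random

# Hypothetical toy sizes chosen for illustration only.
VOCAB = 6     # token ids 0..5
EOS = 0       # assumed end-of-sequence id, also used as the decoder start symbol
HIDDEN = 8    # size of the fixed-length state vector

rng = random.Random(0)  # fixed seed so the sketch is deterministic


def rand_matrix(rows, cols):
    """Return a rows x cols matrix of small random weights (untrained)."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def one_hot(token):
    """Encode a token id as a one-hot vector of length VOCAB."""
    return [1.0 if i == token else 0.0 for i in range(VOCAB)]


def rnn_step(w_in, w_rec, token, state):
    """One plain RNN update: new_state = tanh(W_in x + W_rec h)."""
    from_input = matvec(w_in, one_hot(token))
    from_state = matvec(w_rec, state)
    return [math.tanh(a + b) for a, b in zip(from_input, from_state)]


# Separate parameters for the encoder and the decoder.
ENC_IN, ENC_REC = rand_matrix(HIDDEN, VOCAB), rand_matrix(HIDDEN, HIDDEN)
DEC_IN, DEC_REC = rand_matrix(HIDDEN, VOCAB), rand_matrix(HIDDEN, HIDDEN)
DEC_OUT = rand_matrix(VOCAB, HIDDEN)


def encode(tokens):
    """Fold a variable-length input sequence into one fixed-size state."""
    state = [0.0] * HIDDEN
    for token in tokens:
        state = rnn_step(ENC_IN, ENC_REC, token, state)
    return state


def decode(state, max_len=10):
    """Greedily emit tokens from the state until EOS or max_len steps."""
    output = []
    token = EOS
    for _ in range(max_len):
        state = rnn_step(DEC_IN, DEC_REC, token, state)
        scores = matvec(DEC_OUT, state)
        token = max(range(VOCAB), key=lambda i: scores[i])
        if token == EOS:
            break
        output.append(token)
    return output


def translate(tokens):
    """Map an input token sequence to an output token sequence."""
    return decode(encode(tokens))
```

Calling `translate([1, 2, 3])` or `translate([4, 5])` returns a list of token ids whose length need not match the input. Because the weights are random, the output carries no meaning; training the parameters on paired sequences is what turns this data flow into a translation model.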

Google Brain

After completing his PhD, Sutskever spent approximately two years at Google Brain, where he worked on deep learning research. During this period, he contributed to the sequence-to-sequence paper and other projects applying deep neural networks to challenging problems in language and vision.

OpenAI (2015–2024)

Founding

In December 2015, Sutskever was announced as a co-founder and chief scientist of OpenAI, a new artificial intelligence research laboratory. The organisation was established by Sam Altman, Elon Musk, Greg Brockman, Sutskever, and others, with the stated mission of ensuring that artificial general intelligence (AGI) would benefit all of humanity. OpenAI was initially structured as a non-profit, with pledges of over $1 billion in funding.^[5]

Sutskever's recruitment was considered a major coup for the new organisation. At the time, he was one of the most accomplished deep learning researchers in the world, and his decision to leave Google for OpenAI was taken as a signal of the new lab's seriousness and ambition.

Research leadership

As chief scientist, Sutskever oversaw OpenAI's core research direction. Under his guidance, OpenAI pursued a strategy of scaling up neural language models, a bet that proved transformative for the field. Key milestones during his tenure included:

GPT (2018): The first Generative Pre-trained Transformer, demonstrating the effectiveness of unsupervised pre-training followed by supervised fine-tuning.
GPT-2 (2019): A 1.5-billion-parameter language model whose capabilities raised concerns about potential misuse, leading OpenAI to initially withhold the full model.
GPT-3 (2020): A 175-billion-parameter model that demonstrated remarkable few-shot learning capabilities, transforming perceptions of what language models could achieve and catalysing the modern LLM industry.
ChatGPT (2022): A conversational interface to the GPT models, fine-tuned using reinforcement learning from human feedback (RLHF), which was reported at the time to be the fastest-growing consumer application in history.
GPT-4 (2023): A multimodal model representing a further significant leap in capability, though OpenAI declined to disclose architectural details.

Sutskever was also a proponent of research into AI safety and alignment, often expressing concern about the long-term risks of increasingly capable AI systems. With Jan Leike, he co-led an internal OpenAI team focused on "superalignment", the problem of ensuring that superintelligent AI systems remain aligned with human values.

November 2023 board crisis

On 17 November 2023, OpenAI's board of directors abruptly removed Sam Altman as CEO. Sutskever was reported to have been one of the board members involved in the decision, which was attributed to concerns that Altman had not been "consistently candid" with the board. The firing triggered a crisis within OpenAI: nearly all of the company's approximately 770 employees signed a letter threatening to resign and follow Altman to Microsoft unless the board reinstated him and resigned.^[6]

Within days, Sutskever publicly expressed regret over his role in the events, writing on social media: "I deeply regret my participation in the board's actions. I never intended to harm OpenAI." Altman was reinstated as CEO on 21 November 2023 with a reconstituted board, on which Sutskever no longer sat.

The episode drew widespread attention to tensions within OpenAI between its commercial ambitions and its original safety-focused mission, and raised questions about the governance of powerful AI organisations.

Departure

In May 2024, Sutskever announced his departure from OpenAI. In a statement, he expressed confidence that OpenAI would "build AGI that is both safe and beneficial" under its current leadership. His departure was followed within days by the dissolution of the superalignment team he had co-led, and was widely interpreted as reflecting unresolved disagreements about the balance between safety research and product development at OpenAI.^[7]

Safe Superintelligence Inc. (2024–present)

In June 2024, Sutskever announced the founding of Safe Superintelligence Inc. (SSI), a new AI company focused exclusively on building safe superintelligent AI. He co-founded the company with Daniel Gross, a former Y Combinator partner who had previously led AI efforts at Apple, and Daniel Levy, a former OpenAI researcher.^[8]

SSI was structured as a for-profit company but with an unusual commitment: Sutskever stated that the company would focus entirely on the goal of safe superintelligence, without the distraction of other products, revenue, or short-term commercial pressures. He described the approach as having "one focus, one goal, and one product."

In September 2024, SSI raised $1 billion in funding at a reported valuation of $5 billion, despite having no products and no revenue. Investors included Andreessen Horowitz, Sequoia Capital, and DST Global. The round underscored the extraordinary level of investor confidence in Sutskever's track record and vision.^[9]

SSI established offices in Palo Alto, California and Tel Aviv, Israel.

Recognition

Sutskever has been recognised as one of the most influential researchers in artificial intelligence:

Named to the MIT Technology Review "35 Innovators Under 35" list.
His papers have collectively received hundreds of thousands of citations, placing him among the most-cited researchers in computer science.
The AlexNet paper (2012) is one of the foundational works of the deep learning era.
He was a key figure in demonstrating the scaling laws that underpin modern large language models — the observation that model performance improves predictably with increases in data, compute, and parameters.
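
The predictable improvement mentioned in the last item above is usually expressed as a power law. The form below is one commonly used in the scaling-law literature for the dependence of test loss on parameter count. It is given here as an illustration rather than as a formula taken from this article or attributed to Sutskever, and the symbols are assumptions: L is the test loss, N the number of model parameters, and N_c and α_N are constants fitted to experimental results.

```latex
% Illustrative power-law form; N_c and \alpha_N are empirically fitted constants.
L(N) \approx \left( \frac{N_c}{N} \right)^{\alpha_N}, \qquad \alpha_N > 0
```

Analogous expressions are typically written for dataset size and training compute. Because the exponent is positive, each multiplicative increase in N lowers the predicted loss by a constant factor, which is what makes the trend appear as a straight line on a log-log plot.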

Views

Sutskever has been a consistent advocate for taking AI safety seriously, even as he has pushed the boundaries of AI capability. He has described the development of superintelligent AI as "inevitable" and has argued that the central challenge of the 21st century is ensuring that such systems are aligned with human values.

He has expressed scepticism about the sufficiency of current alignment techniques, including reinforcement learning from human feedback, for aligning superintelligent systems. At OpenAI, he argued for dedicating significant resources to superalignment research, and his departure was widely linked to frustration that commercial priorities were overtaking safety work.

In founding SSI, Sutskever articulated a vision in which safety and capability research are unified rather than in tension: "The safest way is to have the smartest AI on your side."

Selected publications

Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). "ImageNet Classification with Deep Convolutional Neural Networks." NIPS 2012.
Sutskever, I., Martens, J., Dahl, G., & Hinton, G. (2013). "On the importance of initialization and momentum in deep learning." ICML 2013.
Sutskever, I., Vinyals, O., & Le, Q. V. (2014). "Sequence to Sequence Learning with Neural Networks." NIPS 2014.

References

[1] Template:Cite news

[2] Template:Cite thesis

[3] Template:Cite conference

[4] Template:Cite conference

[5] Template:Cite news

[6] Template:Cite news

[7] Template:Cite news

[8] Template:Cite news

[9] Template:Cite news
