AI is Nothing New, Here's the Full History
Rewrite of the first chapter of Mostly Harmless AI with lots of updates
The following is a second draft of the zeroth chapter of my upcoming book Mostly Harmless AI. In this second draft we significantly expanded the timeline, adding around 3x more events and milestones, while making the chapter more concise and information-dense. We also included a structured timeline at the end for easy reference.
PS: Remember you can get Mostly Harmless AI while in early access at a reduced price. We are now running a special offer that gives you the PDF and EPUB version of the book as it currently stands, plus guaranteed access to all future editions for just $5.

For centuries, we humans have been captivated by the idea of a thinking machine. This isn’t some modern tech obsession; the dream of automatons and artificial minds is woven through our myths and philosophies. But the formal quest to build one began only in the mid-20th century, and its history has been a dramatic back-and-forth between two core, seemingly antagonistic approaches.
One path, rooted in the logic of rationalism, sought to build intelligence from the top down by programming explicit rules and symbols. The other, inspired by the biological empiricism of the brain, tried to create it from the bottom up by allowing machines to learn patterns from data and experience.
This chapter explores the history of Artificial Intelligence (AI) through the lens of this great intellectual tug-of-war. It is a journey through distinct eras, each defined by which philosophy was dominant, what external factors like computing power and data availability enabled its rise, and how these forces have finally begun to converge, leading us to the powerful tools we have today.
In the appendix of this book, we will present a detailed chronology of the most important milestones in the history of artificial intelligence.
The Foundational Era (1940s - 1960s)
The dawn of AI was a time of immense optimism, where the very concept of a “thinking machine” was formalized. Even before the field had a name, its philosophical and theoretical groundwork was being laid. In his seminal 1950 paper, “Computing Machinery and Intelligence,” Alan Turing proposed the Turing Test, setting a profound, long-term goal: to create a machine whose conversation was indistinguishable from a human’s. In parallel, the work of Warren McCulloch and Walter Pitts in 1943 on the first mathematical model of an artificial neuron planted the seeds of the connectionist dream—the idea that intelligence could emerge from simple, brain-like units.
When the field was officially christened at the Dartmouth Workshop in the summer of 1956, the symbolic, logic-based paradigm took the lead. Researchers believed that human thought could be mechanized, and the primary task was to build systems that could manipulate symbols according to formal rules. This vision was solidified by the creation of the LISP programming language in 1958, a tool perfectly suited for this symbolic manipulation.
Yet, in that same year, the connectionist counterpoint took physical form. Frank Rosenblatt developed the Perceptron, the first artificial neural network that could learn to classify patterns on its own, offering a tangible, bottom-up alternative to pure logic.
The public imagination was quickly captured by early demonstrations of AI’s potential. The Unimate (1961), the first industrial robot, showed that machines could perform physical labor. Shakey the Robot (1966) took this a step further, becoming the first mobile robot to perceive its environment and reason about its own actions. Joseph Weizenbaum’s ELIZA (1964), a simple chatbot that simulated a psychotherapist, revealed how easily humans could attribute intelligence and understanding to a machine. But this initial optimism soon collided with reality.
The ambitious promises of creating true intelligence went unfulfilled, and in 1969, the publication of the book Perceptrons by Marvin Minsky and Seymour Papert delivered a critical blow. By rigorously detailing the mathematical limitations of simple neural networks, the book effectively starved the connectionist school of funding, ushering in the first “AI Winter” and ensuring that the symbolic approach would dominate the field for the next decade.
The Knowledge Era (1970s - 1980s)
With connectionism on the back burner, the field regrouped around a more pragmatic goal: instead of trying to create general intelligence, researchers focused on capturing and mechanizing human expertise in narrow domains. This led to the golden age of expert systems, the first commercially successful form of AI. The core idea was to interview a human expert, painstakingly encode their knowledge into a vast set of “if-then” rules, and use a reasoning engine to produce solutions.
This approach yielded impressive results. SHRDLU (1972) was a landmark natural language program that could understand and respond to commands about a simulated world of blocks, showcasing a new level of sophistication for symbolic AI. Expert systems like MYCIN (1972) could diagnose blood infections as accurately as junior doctors, while others like DENDRAL and PROSPECTOR found success in chemistry and geology. This culminated in the first true commercial boom, as companies like Digital Equipment Corporation used the XCON system (1980) to configure complex computer orders, saving millions of dollars. The ambition of this paradigm reached its peak with the Cyc project (1984), a monumental effort to manually encode all of human common sense knowledge into a single, massive database.
While the symbolic school reigned, a connectionist undercurrent continued to flow. In Japan, Kunihiko Fukushima’s work on the Neocognitron (1980) created a hierarchical, multi-layered neural network for visual recognition that was the direct ancestor of the architectures that would dominate computer vision decades later. And in 1986, the popularization of the backpropagation algorithm provided an efficient method for training these deeper networks, solving a critical problem that had plagued the field for years.
However, the symbolic paradigm’s dominance was destined to end. Expert systems were incredibly brittle; they were expensive to build, nearly impossible to update, and would fail completely if faced with a situation not explicitly covered by their rules. The hype, fueled in part by Japan’s ambitious Fifth Generation Computer Systems project (1982), once again outpaced reality. When the specialized hardware market collapsed in 1987, the field plunged into its second “AI Winter,” leaving the promise of AI unfulfilled once more.
The Internet Era (1990 - 2011)
The end of the second AI winter was not driven by a single algorithmic breakthrough, but by two external forces that changed everything: the public launch of the World Wide Web in 1991 and the invention of the Graphics Processing Unit (GPU) in 1999. The web began generating an unimaginable ocean of data—text, images, and user interactions. The GPU, particularly after the release of NVIDIA’s CUDA platform in 2007, provided a way to perform the massive parallel computations needed to learn from that data. These two catalysts—data and computation—created the perfect conditions for the statistical, learning-based paradigm to finally thrive.
Before deep learning took hold, this new environment fueled the rise of “shallow” machine learning. Algorithms like Support Vector Machines (SVMs) (1995) became dominant, and open-source libraries like scikit-learn (2007) made them accessible to a wide audience. This approach had a massive real-world impact, powering the recommender systems of companies like Amazon and sparking global competitions like the Netflix Prize (2006). The infrastructure to handle this new scale was built in parallel, with Google’s MapReduce (2004) providing the blueprint for big data processing.
During this time, foundational work in reinforcement learning was also bearing fruit. TD-Gammon (1992) showed that a program could teach itself to play backgammon at a superhuman level, and the textbook by Sutton & Barto (1998) codified the field for a new generation. The seeds for the coming deep learning revolution were being sown with the invention of key architectures like LSTMs (1997) and LeNet-5 (1998), while the creation of the massive ImageNet dataset (2009) provided the high-quality benchmark that would soon ignite it.
AI also became a tangible part of public life. The symbolic paradigm had its last great public triumphs with Deep Blue’s victory over Garry Kasparov in chess (1997) and Watson’s win on Jeopardy! (2011). But the future belonged to learning-based systems. Dragon NaturallySpeaking (1997) brought continuous speech recognition to consumers. Competitions like the DARPA Grand Challenge (2004) spurred the development of autonomous vehicles. Consumer products like the Roomba (2002) and Microsoft’s Kinect (2010) brought robotics and computer vision into millions of homes. With the launch of Siri in 2011, a conversational AI assistant was finally in everyone’s pocket.
The Deep Learning Era (2012 - 2018)
If the Internet Era set the stage, 2012 was the year the curtain rose on the deep learning revolution. In October, a deep convolutional neural network called AlexNet, trained on GPUs using the ImageNet dataset, shattered all previous records in the annual image recognition competition. This “ImageNet Moment” proved the overwhelming superiority of deep, data-driven learning and kicked off a Cambrian explosion of breakthroughs.
This period was a stunning validation of what AI researcher Rich Sutton would later call The Bitter Lesson: that general methods leveraging massive computation almost always outperform approaches that rely on hand-crafted human knowledge. The field progressed at a breathtaking pace. In natural language processing, Word2Vec (2013) provided a powerful way to represent the meaning of words as vectors. In generative AI, Generative Adversarial Networks (GANs) (2014) introduced a novel way to create stunningly realistic synthetic images. New architectures like ResNet (2015) allowed for the creation of networks hundreds of layers deep, solving a fundamental barrier to scale.
These new techniques allowed AI to achieve superhuman performance in increasingly complex domains. Deep Q-Networks (DQN) (2013) learned to master classic Atari games directly from pixels, and in a landmark event in March 2016, AlphaGo defeated Lee Sedol, one of the world’s greatest Go players. The revolution was powered by a new generation of open-source tools like TensorFlow (2015) and PyTorch (2016) that democratized deep learning, as well as specialized hardware like Google’s Tensor Processing Units (TPUs) (2016). The era culminated with the invention of the Transformer architecture in 2017.
However, 2018 marked a turning point. In March, the Cambridge Analytica scandal revealed how machine learning algorithms, fed by the personal data of millions of Facebook users, had been used for political manipulation, sparking a global reckoning over data privacy and the ethics of AI.
A year later, though, the scientific community formally recognized the field’s impact, awarding the 2018 Turing Award (announced in March 2019) to Geoffrey Hinton, Yann LeCun, and Yoshua Bengio for their foundational work. The Turing Award, aptly named after the most important figure in the history of Computer Science as a whole, let alone Artificial Intelligence, is the most prestigious academic award in computing, akin to the Nobel Prize.
The Generative Era (2019 - Present)
The current era is defined by the application of the Transformer architecture at an unprecedented scale. By training these models on vast swaths of the internet, researchers discovered that quantitative leaps in size and data could lead to qualitative leaps in capability, resulting in models with emergent generative and reasoning abilities that have captured the world’s attention.
The first sign of this new power came with GPT-2 in 2019, whose ability to generate coherent text was so advanced that its release was initially staged due to safety concerns. Its successor, GPT-3 (2020), demonstrated that massive scale could unlock “few-shot” learning, the ability to perform tasks it was never explicitly trained on.
Soon, this generative power was applied beyond text to images with DALL-E (2021) and to code with GitHub Copilot (2021). But the cultural tipping point arrived in November 2022 with the release of ChatGPT. Its simple, conversational interface made the power of Large Language Models (LLMs) accessible to millions, sparking a global phenomenon and a new wave of investment.
This boom was accompanied by a powerful open-source counter-movement. The release of the image-generation model Stable Diffusion (August 2022) and Meta’s Llama models (2023) democratized access to powerful foundation models, sparking a “Llamaverse” of community-driven innovation. The field is now a global race, with major competitors like Anthropic’s Claude (2023) and Google’s multimodal Gemini (December 2023) challenging OpenAI, and open-weight models like China’s DeepSeek R1 (January 2025) demonstrating capabilities on par with the best proprietary systems.
Perhaps the most profound impact of this new era has been in science. In November 2020, AlphaFold 2 solved the 50-year-old grand challenge of protein folding, a breakthrough of such significance that its creators were awarded the 2024 Nobel Prize in Chemistry. This demonstrated that AI could be a tool not just for automating tasks, but for accelerating fundamental scientific discovery. The road ahead now points towards more autonomous, “agentic” systems, where the AI transitions from a single-response tool to a collaborator capable of executing complex, multi-step tasks on our behalf.
Conclusion
We’ve journeyed through decades of ambition, breakthroughs, and tough realizations. What we’ve seen is a constant back-and-forth, a dynamic dance between two powerful ideas: the precise, rule-based, inflexible logic of symbolic AI and the adaptable, pattern-based, unreliable power of statistical AI. This dance, as we’ve explored, often mirrors the philosophical tension between rationalism and empiricism.
Today, AI stands at a fascinating crossroads. The purely statistical systems that define the generative era have achieved incredible feats. Yet, we are beginning to see diminishing returns. GPT-4 marked a high point, and many newer models have made only incremental progress since, suggesting that simply scaling the existing paradigm may not be enough to achieve the next level of intelligence. This has led some to speculate that we may be on the brink of a third “AI Winter,” as the hype once again outpaces the reality of the technology’s capabilities.
The recent focus on reasoning models and agentic systems seems capable of fueling the statistical hype a bit longer, but a growing number of researchers are realizing we may not achieve Artificial General Intelligence (AGI) purely by scaling. This brings us to a crucial realization: the future of AI likely isn’t about one approach winning out over the other, but about intelligently combining them. The inherent limitations of today’s models—their unreliability and lack of true reasoning—have sparked a renewed interest in the long-neglected symbolic paradigm. Hybrid approaches, particularly neuro-symbolic AI, which seek to integrate the pattern-matching strengths of neural networks with the rigorous logic of symbolic systems, hold immense potential for creating the next breakthrough.
Whether a winter is coming or not, it is indisputable that AI has already had a profound impact on society and will continue to do so. The history of this field is far from finished. It is a living, breathing, civilization-wide project with the potential to transform society for the better or, some believe, to become our ultimate doom. Everyone has a place here: technologists, yes, but also humanists, economists, historians, artists, and policymakers. The next few years promise to be extremely exciting, and you can be a part of shaping what comes next.
Thanks for reading! Below you’ll find the full expanded timeline. Please let me know if you think I missed something or made any mistakes. All feedback is appreciated!
PS: Claim your copy of Mostly Harmless AI for only $5 in the link below.
Appendix: A Chronology of Artificial Intelligence (1943-2025)
This timeline details the key breakthroughs, conceptual shifts, and landmark achievements in the field of Artificial Intelligence, tracing its path from a niche academic discipline to a transformative global technology.
The Foundational Era (1940s - Late 1960s)
(Science) 1943: The First Artificial Neuron is proposed by Warren McCulloch and Walter Pitts, laying the theoretical foundation for connectionism.
(Science) October 1950: Alan Turing publishes “Computing Machinery and Intelligence,” introducing the Turing Test.
(Social) Summer 1956: The Dartmouth Workshop is held, where John McCarthy coins the term “Artificial Intelligence,” formally establishing the field.
(Tech) 1958: Frank Rosenblatt develops the Perceptron, the first artificial neural network capable of learning.
(Tech) 1958: John McCarthy develops the LISP programming language, which becomes the standard for symbolic AI.
(Product) 1961: The Unimate industrial robot begins work on a General Motors assembly line.
(Product) 1964: Joseph Weizenbaum creates the chatbot ELIZA at MIT.
(Tech) 1966: The Stanford Research Institute (SRI) develops Shakey, the first mobile robot to reason about its own actions.
(Social) 1969: The publication of Perceptrons by Marvin Minsky and Seymour Papert marks the beginning of the first “AI Winter.”
The Knowledge Era (1970s - 1989)
(Tech) 1972: Terry Winograd develops SHRDLU, a groundbreaking natural language understanding program.
(Tech) 1972: The logic programming language Prolog is created by Alain Colmerauer and Philippe Roussel, becoming a key tool for symbolic AI.
(Tech) 1972: Stanford University develops the MYCIN expert system for medical diagnosis.
(Science) 1974: Marvin Minsky publishes his influential paper on “Frames” theory, a new paradigm for knowledge representation.
(Tech) Late 1970s: Expert systems like DENDRAL (for chemistry) and PROSPECTOR (for geology) demonstrate success in specialized scientific domains.
(Science) 1980: Kunihiko Fukushima develops the Neocognitron, an early hierarchical neural network that is the direct ancestor of modern Convolutional Neural Networks (CNNs).
(Product) 1980: Digital Equipment Corporation begins using the XCON expert system, marking a high point for commercial AI.
(Social) 1982: Japan’s Ministry of International Trade and Industry begins the Fifth Generation Computer Systems project, a massive initiative to build a new generation of computers based on logic programming, sparking competitive AI investment worldwide.
(Tech) 1984: The Cyc project is initiated by Douglas Lenat, an ambitious attempt to manually encode all of human common sense knowledge into a single knowledge base.
(Science) 1986: The backpropagation algorithm is popularized by Geoffrey Hinton, David Rumelhart, and Ronald Williams.
(Social) 1987: The collapse of the LISP machine market signals the start of the second “AI Winter.”
The Internet Era (1990 - 2011)
(Social) August 1991: The World Wide Web project is released to the public, creating the infrastructure for the data explosion that would fuel modern AI.
(Science) 1992: Gerald Tesauro develops TD-Gammon, a backgammon program that trains to a superhuman level using reinforcement learning, a landmark for the field.
(Social) 1995: Stuart Russell and Peter Norvig publish “Artificial Intelligence: A Modern Approach,” which becomes the leading textbook in the field for decades.
(Science) 1995: The Support Vector Machine (SVM) algorithm is popularized by Corinna Cortes and Vladimir Vapnik.
(Social) May 1997: IBM’s Deep Blue defeats world chess champion Garry Kasparov.
(Science) 1997: Sepp Hochreiter and Jürgen Schmidhuber invent the Long Short-Term Memory (LSTM) network.
(Product) 1997: Dragon NaturallySpeaking is released, becoming the first widely available continuous speech recognition software for consumers.
(Product) September 1998: Google is founded, and Amazon patents its item-to-item collaborative filtering, marking the start of large-scale data-driven AI applications.
(Science) 1998: Richard Sutton and Andrew Barto publish “Reinforcement Learning: An Introduction,” a seminal textbook that codifies the field.
(Tech) November 1998: Yann LeCun and his team develop LeNet-5, a pioneering Convolutional Neural Network (CNN).
(Tech) August 1999: NVIDIA releases the GeForce 256, marketed as the world’s first Graphics Processing Unit (GPU).
(Product) November 2000: Honda unveils its ASIMO humanoid robot, a landmark in robotics and motion planning.
(Product) September 2002: iRobot releases the Roomba, the first commercially successful autonomous home robot.
(Tech) 2004: Google publishes its paper on MapReduce, a programming model for processing massive datasets that becomes foundational to big data infrastructure.
(Social) March 2004: The first DARPA Grand Challenge for autonomous vehicles is held, sparking a new wave of research in self-driving technology.
(Social) October 2006: The Netflix Prize competition is launched, galvanizing research in recommender systems.
(Science) 2006: Geoffrey Hinton develops Deep Belief Networks, introducing effective strategies for unsupervised layer-wise pre-training.
(Tech) June 2007: NVIDIA releases CUDA, a parallel computing platform that allows developers to use GPUs for general-purpose processing.
(Tech) June 2007: David Cournapeau develops scikit-learn as a Google Summer of Code project.
(Science) 2009: A Stanford team led by Andrew Ng publishes a paper showing that GPUs can make training deep neural networks 10-100 times faster.
(Tech) 2009: The ImageNet dataset is created by Fei-Fei Li’s team at Stanford.
(Product) November 2010: Microsoft releases the Kinect, a consumer device that brings sophisticated real-time computer vision into millions of homes.
(Social) February 2011: IBM’s Watson wins the quiz show Jeopardy!.
(Product) October 2011: Apple integrates Siri into the iPhone 4S, making conversational AI assistants a mainstream consumer product.
The Deep Learning Era (2012 - 2018)
(Social) April 2012: Coursera is founded, and Andrew Ng’s Machine Learning course begins to democratize AI education.
(Science) June 2012: The Google Brain “Cat Neuron” project demonstrates that a neural network can learn high-level concepts from unlabeled data.
(Social) October 2012: AlexNet, a deep CNN trained on GPUs, wins the ImageNet competition by a massive margin, officially kicking off the deep learning revolution.
(Tech) 2013: Google researchers led by Tomas Mikolov release Word2Vec, a highly efficient method for creating word embeddings that revolutionizes NLP.
(Science) December 2013: DeepMind publishes its work on Deep Q-Networks (DQN), demonstrating an AI that can learn to play Atari games at a superhuman level from raw pixels.
(Science) June 2014: Ian Goodfellow and his colleagues introduce Generative Adversarial Networks (GANs), sparking a revolution in generative AI for images.
(Tech) November 2015: Google releases the TensorFlow open-source library, making deep learning more accessible.
(Science) December 2015: A team at Microsoft Research introduces Deep Residual Networks (ResNet), allowing for the training of much deeper neural networks.
(Social) March 2016: Google DeepMind’s AlphaGo defeats world Go champion Lee Sedol.
(Tech) May 2016: Google announces it has been using custom-built Tensor Processing Units (TPUs), specialized hardware for deep learning, in its data centers.
(Tech) September 2016: Facebook AI Research (FAIR) releases PyTorch, which becomes a major deep learning framework.
(Science) June 2017: Researchers at Google publish “Attention Is All You Need,” introducing the Transformer architecture.
(Social) March 2018: The Cambridge Analytica scandal breaks, revealing that the personal data of millions of Facebook users was used for political advertising, sparking a global conversation on data privacy and the ethics of machine learning.
(Social) March 2019: Geoffrey Hinton, Yann LeCun, and Yoshua Bengio are awarded the 2018 ACM Turing Award for their foundational work on deep learning.
(Social) December 2018: DeepMind’s AlphaFold makes its stunning debut at the CASP13 competition.
The Generative Era (2019 - 2025)
(Tech) February 2019: OpenAI announces GPT-2 but initially withholds the full model due to safety concerns.
(Product) November 2019: OpenAI releases the full version of the GPT-2 model.
(Product) June 2020: OpenAI releases GPT-3 via a private API.
(Social) November 2020: AlphaFold 2 achieves revolutionary accuracy at the CASP14 competition, effectively solving the protein folding problem.
(Product) January 2021: OpenAI introduces DALL-E, a model that generates images from text.
(Product) June 2021: GitHub Copilot is launched as a technical preview.
(Tech) April 2022: Google announces its Pathways Language Model (PaLM).
(Social) June 2022: Google engineer Blake Lemoine publicly claims the LaMDA model is sentient.
(Product) August 2022: The open-source release of Stable Diffusion democratizes high-quality image generation.
(Product) November 2022: OpenAI releases ChatGPT to the public.
(Tech) February 2023: Meta releases the first Llama model to the research community.
(Product) March 2023: Anthropic releases its first Claude model.
(Tech) July 2023: Meta releases Llama 2 with a commercial-use license, sparking the open-source “Llamaverse.”
(Product) December 2023: Google releases Gemini, its first natively multimodal model.
(Social) October 2024: The Nobel Prize in Physics is awarded to John J. Hopfield and Geoffrey Hinton for their foundational work on neural networks, and the Nobel Prize in Chemistry to Demis Hassabis and John Jumper for protein structure prediction with AlphaFold.
(Product) January 20, 2025: DeepSeek AI releases its DeepSeek R1 model and chatbot, marking a turning point in the global AI race.
(Product) August 2025: OpenAI releases GPT-5, focusing on more autonomous, “agentic” capabilities, to a rather underwhelming reception.
What? Still here? Ok, here’s another nice button for you to click. Thanks!