But what is Artificial Intelligence anyway?
The no-BS introduction to the field of Artificial Intelligence you didn't know you wanted.
This article is part of my upcoming book The Science of Computation, a journey across all of computer science, suitable for all readers. You can get early access now with a 50% discount.
Artificial Intelligence is a loosely defined field of Computer Science that deals with making computers solve, either entirely or at least partially, some of the most challenging problems we know. Some examples of the problems we tackle with AI are proving theorems, navigating complex worlds, crafting optimal schedules, and understanding and creating pictures and text.
As you can see, these problems are varied and vague. The one unifying factor of all these tasks is that we don’t know of any guaranteed strategy to make a computer solve them both effectively and efficiently, so what can we do about them?
This article is the first in a short introductory series on the field of Artificial Intelligence: what it is about, when it is useful, and why it matters.
Let’s start with the basics. At the bottom of any working piece of software, there are often many algorithms. An algorithm, for the purposes of this post, is just a series of very well-defined instructions that a computer can follow to solve a concrete problem.1
For example, if you want to sort a list of numbers, the easiest way to do it is to scan the list to find the smallest number, put it at the beginning of the list, and repeat until no elements are left unsorted. This is a rather slow but reliable way to sort any list of numbers, regardless of its size and content. And we have many other sorting algorithms that work better or worse in specific circumstances. Sorting, thus, is mostly a solved problem.2
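To make this concrete, here is a minimal Python sketch of that strategy (it is essentially selection sort); the function name and the example list are just illustrative choices:

```python
def selection_sort(numbers):
    """Sort a list by repeatedly moving the smallest remaining element to the front."""
    numbers = list(numbers)  # work on a copy
    for i in range(len(numbers)):
        # Find the position of the smallest element in the unsorted tail numbers[i:].
        smallest = min(range(i, len(numbers)), key=lambda j: numbers[j])
        # Swap it into place at the front of the unsorted region.
        numbers[i], numbers[smallest] = numbers[smallest], numbers[i]
    return numbers

print(selection_sort([5, 2, 9, 1, 7]))  # [1, 2, 5, 7, 9]
```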
Solving a problem in Computer Science usually means finding a well-designed algorithm that we can formally prove works both reliably and fast.3 These are the kinds of algorithms used throughout the vast majority of software deployed today: e.g., searching for specific items in a database, sorting and grouping objects based on some criteria, or solving some fancy equations.
We have plenty of those algorithms for plenty of areas in Computer Science, but there are at least two broad areas of problems so hard that we don’t have any good solutions.
Solving problems with AI
The first area concerns those problems that we believe are intractable. These are problems most experts believe simply can’t be solved efficiently, or at the very least, no one in Computer Science has a freaking clue if they can. We even have a fancy name for them: NP-Hard.
The gist of NP-Hard problems is that, if you could solve just one of them efficiently, you would solve a huge number of other problems touching all areas of Computer Science, problems that hundreds of thousands of computer scientists have attempted to solve efficiently for decades.
One of the most popular examples, and probably the oldest, is the Travelling Salesman Problem (or TSP for short). In a common variation of the TSP, you’re given a bunch of cities connected by roads with some associated cost (maybe the traveling distance, or the price of a plane ticket). You have to visit all cities exactly once while minimizing the total cost of the tour.
The most intuitive solution is probably just going to the nearest city, and from there to the next nearest, and so on. This is called a “greedy strategy” and, as most grandmas know, it can be catastrophically bad.
There are exact solutions, of course. The easiest (and slowest) one consists of enumerating all possible tours and selecting the one with the lowest cost, which in CS lingo we call a “brute-force search”. Thing is, in general, no other strategy has been found that always gives you the best tour and that is qualitatively better (in terms of computational performance) than brute forcing it.4
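Here is a toy Python sketch of both strategies on a made-up four-city cost matrix (the matrix and all names are invented for illustration). On this particular instance, the greedy tour ends up almost twice as expensive as the optimal one:

```python
from itertools import permutations

# A made-up cost matrix: cost[i][j] is the cost of traveling from city i to city j.
cost = [
    [0, 1, 3, 60],
    [1, 0, 2, 50],
    [3, 2, 0, 100],
    [60, 50, 100, 0],
]

def tour_cost(tour):
    return sum(cost[a][b] for a, b in zip(tour, tour[1:]))

def brute_force(n):
    """Try every possible ordering of the remaining cities ((n-1)! tours)."""
    best = min(permutations(range(1, n)), key=lambda rest: tour_cost((0, *rest)))
    return (0, *best), tour_cost((0, *best))

def greedy(n):
    """Always jump to the nearest unvisited city: fast, but can be badly suboptimal."""
    tour, remaining = [0], set(range(1, n))
    while remaining:
        nearest = min(remaining, key=lambda city: cost[tour[-1]][city])
        tour.append(nearest)
        remaining.remove(nearest)
    return tuple(tour), tour_cost(tuple(tour))

print(brute_force(4))  # ((0, 2, 1, 3), 55) -- the optimal tour
print(greedy(4))       # ((0, 1, 2, 3), 103) -- lured by the cheap first steps
```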
What can we do in these cases? When no exact and fast algorithm can be found, we have to sacrifice either correctness or performance. Since we’re almost always interested in performance, we end up designing “approximate” solutions: solutions that are good enough, given the time and memory constraints we have. This is the realm of search algorithms.
The second area of hard problems involves problems that are unstructured. These may be worse than NP-Hard, in a sense: they have so little structure that we have no idea where to even begin a brute-force search.
Many of the problems in this domain are related to perception: e.g., how to distinguish a face in a photograph; others are related to higher cognitive functions that are equally opaque to our own minds, e.g., how to translate a text into another language, changing the actual words but keeping the semantics, intentions, and style of the author; and others deal with many variables that have complex and hard-to-understand interactions, e.g., which movies are the best to recommend to a given person.
All of them share a common property, though: we suspect there are patterns that determine the correct result, but we don’t know how to code those patterns explicitly. However, we may have access to solved examples, e.g., explicitly in the form of input/output pairs or implicitly in the form of simulations we can run.
In these cases, we want to find patterns in data, i.e., infer the rules that determine what a face is, the meaning of a phrase, or the things you want in a movie. In a way, the pattern is the algorithm, so the task is to discover the correct algorithm given examples of solved instances. This is the realm of learning algorithms.
There is a third area of interest for AI that complements these two. It deals with the problem of efficiently representing, storing, and using domain knowledge in a computationally tractable way.
This involves finding the correct way to organize concepts and facts, as well as the optimal way to reason about them and produce new concepts and facts that are sound. A prime example is ontologies, or the more general notion of knowledge graphs: data structures representing facts about certain entities and their relations. This is the realm of knowledge representation and reasoning.
These three paradigms—searching, learning, and reasoning—encompass what is known as Artificial Intelligence. The remainder of this article deals with the first paradigm, and future articles will deal with the other two.
Solving hard problems by search
Search is the general process of finding an object that satisfies certain properties among a (potentially huge) set of distinct objects. When learning to code, search is probably one of the first problems you will tackle: searching for an item in an array by looking at one element at a time is the quintessential example of an algorithm in many introductory programming classes. Throughout the typical Computer Science curriculum, students will learn to search efficiently by organizing items into various data structures, the ultimate embodiment of which is a database system.
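That quintessential first algorithm, linear search, fits in a few lines of Python (the function name and the example values are just illustrative):

```python
def linear_search(items, target):
    """Look at each element in turn until we find the target (or run out of items)."""
    for index, item in enumerate(items):
        if item == target:
            return index
    return None  # not found

print(linear_search([4, 8, 15, 16, 23, 42], 16))  # 3
```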
However, searching is not limited to arrays or explicitly defined collections of objects. We can frame many, if not all, CS problems as search problems. Sorting, for example, can be framed as searching for a permutation of the elements with zero inversions. In this case, the space of objects we are searching is the set of all permutations of a given array. We call this an “implicit” search space because it is not explicitly given in the problem definition.
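To make the framing concrete, here is a deliberately naive Python sketch that sorts by searching the space of permutations; it is hopelessly slow (n! candidates), but it shows the idea:

```python
from itertools import permutations

def inversions(seq):
    """Count the pairs of elements that are out of order."""
    return sum(1 for i in range(len(seq))
                 for j in range(i + 1, len(seq)) if seq[i] > seq[j])

def sort_by_search(numbers):
    """Sorting framed as search: scan the implicit space of all permutations
    until we find one with zero inversions."""
    for candidate in permutations(numbers):
        if inversions(candidate) == 0:
            return list(candidate)

print(sort_by_search([3, 1, 2]))  # [1, 2, 3]
```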
This example is, of course, moot: why would we ever attempt to sort this way when we have extremely fast and simple sorting algorithms like Quicksort and Mergesort? But here’s the kicker: sorting algorithms like these work fast because they split the space of all possible permutations into subspaces so that you can easily discard a large chunk. This is what we call a “structured” search. It’s like binary search, but in the space of permutations!5
The more profound lesson here is that if you understand the structure of the search space, you can search much more efficiently. All our fast algorithms for specific problems exploit some search space structure. The question then becomes, can we also do this for the hardest problems, where it seems we can’t find any such structure that guarantees a fast solution? Fortunately, the answer is yes… well, to some degree.
There are two major approaches to searching in the realm of hard problems: heuristics and metaheuristics.
Heuristic search
A heuristic is a rule-of-thumb strategy that exploits some known properties of a search problem to improve search performance. One of the most well-known and widely used heuristic search algorithms is A*, which powers the route-finding in GPS navigation apps on our phones.
A* is a graph search algorithm that finds the shortest path between an origin and one or many equivalent destinations. At its core, it is very similar to Dijkstra’s shortest path algorithm, but one key idea makes it dramatically faster: it exploits some extra information about the location of the destination nodes.
For example, in GPS routing, you don't know, without doing a full search, what the shortest path from your home to some place is. But if that place is northeast of your position, assuming some reasonable city layout, it makes sense that roads going north or east are more likely to get you there faster than roads going west or south.
Of course, this isn't always the case since there could be detours, highways, tunnels, etc. If you blindly go toward the destination, you could end up stuck in a dead end. However, if you're clever about it, this knowledge can be exploited to make the search much more efficient, and that is exactly what A* does.
Instead of blindly following the most direct route, A* keeps track of all routes, even those that extend in the opposite direction. However, contrary to traditional graph search algorithms, it doesn't explore all routes with the same effort. Instead, when choosing the next node to explore, A* prioritizes the nodes that the heuristic favors—e.g., those that lie in a direction more closely aligned with the destination—but it takes care not to get stuck, and it's able to backtrack when necessary.
In the end, A* performs no worse than traditional graph search algorithms like Dijkstra's and often does much better, provided your heuristic is well-designed.
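Here is a compact Python sketch of A* on a toy grid maze rather than a real road network; the grid, the unit step costs, and the Manhattan-distance heuristic are all simplifying assumptions made for illustration:

```python
import heapq

def a_star(grid, start, goal):
    """Shortest path length on a grid of 0s (free cells) and 1s (walls),
    guided by the Manhattan distance to the goal as the heuristic."""
    def heuristic(cell):
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    # Each frontier entry is (estimated total cost, cost so far, cell).
    frontier = [(heuristic(start), 0, start)]
    best_cost = {start: 0}
    while frontier:
        _, cost, cell = heapq.heappop(frontier)
        if cell == goal:
            return cost
        for dr, dc in [(1, 0), (-1, 0), (0, 1), (0, -1)]:
            r, c = cell[0] + dr, cell[1] + dc
            if 0 <= r < len(grid) and 0 <= c < len(grid[0]) and grid[r][c] == 0:
                new_cost = cost + 1
                if new_cost < best_cost.get((r, c), float("inf")):
                    best_cost[(r, c)] = new_cost
                    heapq.heappush(frontier, (new_cost + heuristic((r, c)), new_cost, (r, c)))
    return None  # the goal is unreachable

maze = [
    [0, 0, 0, 0],
    [1, 1, 1, 0],
    [0, 0, 0, 0],
]
print(a_star(maze, (0, 0), (2, 0)))  # 8 steps around the wall
```

Notice that if you replace the heuristic with a constant zero, this sketch degenerates into plain uniform-cost (Dijkstra-style) search, which is a nice way to see that the heuristic is the only thing A* adds.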
Metaheuristics
Where heuristics are problem-specific strategies to leverage domain insights and accelerate the search in a concrete problem, metaheuristics are general-purpose search strategies that leverage knowledge about the search paradigm itself.
Metaheuristics are domain-independent search algorithms that can be applied in any case where nothing else works. Solving TSP is one of the most common examples, but their uses extend to anything from routing to logistics and planning.
A prime example of a metaheuristic approach to problem-solving is the field of evolutionary algorithms. These are computational strategies inspired by certain aspects of the biological process of evolution that attempt to extract key mechanisms that make evolution work and apply them in the context of, say, finding the optimal way to design a GPU.
The basic idea of all evolutionary algorithms is to maintain a "population" of solutions (e.g., different circuit designs for a GPU) that undergo cycles of “breeding” and “selection”, much like biological evolution. An example of a breeding scheme is taking two of these potential GPU designs and building a new design that incorporates some elements from each of the "parents". This creates a new solution that is, in some sense, an evolved version of previous solutions, which may make it better or worse.
Then, we apply some selection criteria, most often just evaluating the solutions on whatever problem you're solving—e.g., which GPU design works faster at certain benchmarks—and keep the best 50%. This cycle of breeding and selection is repeated, producing a sequence of increasingly specialized GPU designs that, almost like evolution, seems to magically discover quasi-optimal design elements just by sheer luck and relentless repetition.
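Here is a bare-bones Python sketch of that breed-then-select loop. Since we obviously can't benchmark GPU designs in a few lines, a trivially simple stand-in fitness is assumed: individuals are bit strings and the score is just the number of ones:

```python
import random

def evolve(fitness, genome_length=20, population_size=30, generations=50):
    """A minimal evolutionary loop: breed (crossover + mutation), select, repeat.
    Individuals are bit strings; 'fitness' scores them (higher is better)."""
    population = [[random.randint(0, 1) for _ in range(genome_length)]
                  for _ in range(population_size)]
    for _ in range(generations):
        # Breeding: combine two random parents at a cut point, occasionally flip a bit.
        children = []
        for _ in range(population_size):
            mom, dad = random.sample(population, 2)
            cut = random.randrange(genome_length)
            child = mom[:cut] + dad[cut:]
            if random.random() < 0.1:
                i = random.randrange(genome_length)
                child[i] = 1 - child[i]
            children.append(child)
        # Selection: keep only the best half of parents and children combined.
        population = sorted(population + children, key=fitness, reverse=True)[:population_size]
    return max(population, key=fitness)

# Toy "design problem": maximize the number of 1s in the genome.
best = evolve(fitness=sum)
print(best, sum(best))
```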
Much more can be said about evolutionary algorithms and metaheuristics in general. Still, the overall idea is as simple as that: find effective search strategies that seem to work in almost all domains—finding inspiration in nature, engineering, social networks, etc.—and build computational strategies that leverage these insights.
What's next?
Searching is one of the three major problem-solving paradigms in AI and one of the most important topics in computer science in general.
The other two paradigms, learning and reasoning, are two sides of one of the longest-standing open questions in AI: how important domain knowledge is compared to general-purpose learning strategies. If you're interested in that topic—and who isn't, right?—you can check my series on the Foundations of Machine Learning. The first two articles are the most intuitive introduction you'll find around.
In future articles, we'll explore specific search algorithms that are really clever and can solve intractable problems very fast, albeit approximately, of course.
The definition of “algorithm” is slightly more nuanced. You need a finite number of formally-defined instructions that are guaranteed to produce the correct output for any valid input and always finish in a finite number of steps. The caveat is that, as anyone who’s ever coded knows, you can have a finite number of instructions that take an infinite number of steps for some inputs (e.g., get stuck in a while loop). You could also conceivably have an infinite number of instructions but I have no idea if that’s useful.
We’re talking about sorting in the simplest sense here, given a finite set of objects and a total ordering function, finding a permutation of the objects that has zero inversions in that ordering.
By reliably, we mean that it always gives the correct solution. And by fast, we often mean a polynomial-time algorithm, but depending on the nature of the problem, some polynomial-time algorithms might also be impractical.
Again, there is space for nuance here. We have approximate algorithms for the TSP that are pretty fast, and we have exact and fast algorithms for some special cases. TSP is one of the most well-studied problems in Computer Science, so there’s a lot of progress in practical solutions. However, the theoretical barrier still stands: no known general-purpose algorithm solves all instances of TSP in polynomial time.
Left as an exercise for the math-savvy reader: prove that indeed log(n!) = O(n log n).