# Top-Down and Bottom-Up Learning

### A short essay on the advantages and limitations of the two most basic learning approaches and how to mix them for crafting an optimal learning experience.

The two predominant learning paradigms in almost every course, tutorial, career, and book are *bottom-up* and *top-down*. Their difference is in the order in which we introduce principles and applications. In *bottom-up learning*, we start from the principles and build up toward the applications. In contrast, *top-down learning* begins with a desired application and uncovers the principles as they are necessary.

Of course, *a priori*, neither way is strictly better than the other. Both have advantages and limitations. I have used both in different combinations in my classes, courses, and other learning experiences I’ve designed. In this essay, I will discuss when these two approaches fit best and how to mix them for an optimal learning experience.

Bottom-up learning is very efficient and practical for mass education. If you’re building a college curriculum, you can define the theoretical principles supporting all skills and applications you want your students to learn. You can then design a learning path that efficiently covers all the theoretical details necessary for each subject matter. For example, before machine learning, you teach the mathematical principles behind calculus and algebra. This also allows the learning materials and evaluations to be more standardized, making them easier to deploy in large educational institutions. It doesn’t matter if your major is physics or chemistry; the calculus class is probably the same. Thus, this approach is preferred in any centrally designed educational system. It is an efficient way for students to understand the whole material, as each course element is touched only once and explained in depth before it is necessary. And it’s an efficient way to schedule and optimize your resources (teachers, classrooms, exams) at scale.

But trying to teach an unjustified theory to students can be difficult. One must constantly tell the students, “Please believe me, this is useful; you'll see why later.” Furthermore, since you have no grounding applications to begin with, your evaluations tend to be artificial (e.g., solving abstract equation systems), leaving the students feeling like they are being forced to learn things that seem to have no use. This can be frustrating for the students, as they are expected to understand and remember theories that may have no immediate relevance to their lives.

In contrast, top-down learning is a great way to keep students engaged. By having a clear problem in mind and then discovering the theory piece by piece, students can relate to the theory and understand why it is needed to make the solution work. Seeing the application first also makes the theory easier to comprehend and remember than when dealing with, e.g., some abstract theorems. When students see the theory applied to a concrete problem, they can understand intuitively why the theory works before going through the hassle of proving it. The application also helps motivate and inspire students and sparks an interest in the studied subject, presumably because the application is something they care about. For example, try to build a chatbot first, diving into language modeling as you discover its necessity. This approach is often used in short tutorials and online courses. Since the desired result is a concrete application, you can skip many elements of the underlying theory and focus on the most useful ones.

However, for the same reason, it can be hard to justify going too deep into the underlying principles and processes when learning purely for an application. Instead, the goal is to gain enough knowledge to understand and use the application effectively, and then you move on. Repeating this approach for many related problems can lead to a disconnect between the individual pieces of information, as no grand theory unifies them. This is why it can be beneficial to delve a little deeper, as this may provide a unifying framework that ties all the pieces together. For example, learning algebra allows us to understand how operators work on different mathematical objects, while learning logic enables us to unify many forms of reasoning. Going the extra step to uncover the more profound principles can be worthwhile in the long run, even when it is not strictly necessary for the application.

So which is better in which case? I don't have any proven answers here, but I sense that the larger the scale, the more valuable the bottom-up approach becomes.

When designing a full curriculum, it is essential to consider the relationships between principles and applications. Visualizing the whole dependency graph of principles and applications makes it easier to organize the curriculum to ensure overarching principles are only covered once while allowing them to be applied and reused throughout the curriculum. This helps ensure that students are exposed to the relevant principles and have the opportunity to understand and use them in various contexts.
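One way to picture such a dependency graph is as a mapping from each topic to the topics it builds on. The following is only a minimal sketch in Python: the topic names and their prerequisites are hypothetical, chosen for illustration rather than taken from any real curriculum. It uses the standard library's `graphlib.TopologicalSorter` (Python 3.9+) to produce one teaching order in which every topic appears exactly once, after everything it depends on.

```python
from graphlib import TopologicalSorter

# Hypothetical curriculum: each topic maps to the set of topics it depends on.
# These names and edges are illustrative assumptions, not a real syllabus.
prerequisites = {
    "calculus": set(),
    "linear algebra": set(),
    "probability": {"calculus"},
    "machine learning": {"calculus", "linear algebra", "probability"},
    "chatbot project": {"machine learning"},
}


def teaching_order(graph):
    """Return one valid order: each topic once, after all of its prerequisites."""
    return list(TopologicalSorter(graph).static_order())


order = teaching_order(prerequisites)
print(order)
```

Several orders can be valid for the same graph, so the exact output may vary; what is guaranteed is that no topic is scheduled before its prerequisites. If the graph contains a cycle, `static_order()` raises `graphlib.CycleError`, which would signal two topics that each claim to need the other first.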

However, in small-scale learning projects, such as a single class, I believe it is better to begin with an intriguing problem and then move on to understand the theory and techniques necessary for its resolution. Immersing oneself in a subject matter or problem first makes it easier to stay motivated and engaged in the learning process. Additionally, it allows the learner to be more familiar with the subject by the time they reach the theoretical aspects of the project, making it easier to absorb the necessary information.

Things get interesting at the mid-level when designing, e.g., a complete course. Here, I have found that a mix of practical and theoretical approaches gives you the best of both worlds: a hybrid model that maintains student engagement and ensures that all necessary foundations are firmly established. This hybrid model consists of alternating lectures between surface applications and deep theory. You start with a motivating application and highlight what you need to understand further. Then you take a deep breath and spend three or four lectures studying the underlying theory. Once students have a solid grasp, you switch gears again and devote a couple of lectures to direct applications, solving the initial problem as if fitting together the pieces of a puzzle. There’s an *aha!* moment there where the theory pays off, and you have gained your students’ trust so that you can push them again into another theoretical rabbit hole. By alternating between these two activities, students feel they are consistently progressing and can better understand how theory-based concepts directly relate to their desired applications.

Take, for example, my compilers course. I begin by introducing the grand vision of a compiler and the key milestones we need to hit to build it: tokenizing, parsing, semantics, and code optimization. This is the motivating application; they all want to know how this compiler is built. Then I promise, *“Come dive with me for a while; when we’re out, you’ll see we’ll be much closer,”* and I immediately dive into formal language theory. I do this three or four times in a classic 16-week course, and it always pays off. We gain momentum as we hit each milestone, and students are even more eager to see what the next theoretical deep dive will uncover. This workflow also allows them to better understand the theory underlying the compiler construction process, because each piece of theory is grounded in some relevant application.

Sometimes this hybrid approach is hard to pull off. If the subject you’re teaching is very abstract, finding realistic, motivating examples may be almost impossible. But in most cases, what I’ve seen is that college courses don’t try hard enough. Algebra, calculus, differential equations, probability, …: all of these are subjects classically taught at a very abstract and flavorless level. And all of them can be significantly improved with just a few motivating applications sprinkled here and there. If you try it, I can promise students will not only love it; they will learn better, which is the ultimate payoff.
