How to Train your Chatbot
Announcing my upcoming practical handbook on using LLMs to build all sorts of cool stuff
Hey, just a quick note to announce the upcoming release of my new book, How to Train Your Chatbot. Here is an excerpt from the preface:
In November 2022, the world was introduced to ChatGPT, a large language model that quickly became the fastest-growing digital product in Internet history. This groundbreaking technology marked a significant milestone in the 60-year-old field of artificial intelligence, providing users with an experience of interacting with something that looks like a truly intelligent computer. ChatGPT, developed by OpenAI, is a prime example of a large language model, which is a type of artificial intelligence model designed to understand and generate human-like text based on the input it receives.
Language models have been in development for many years, with significant advancements made recently. These models are trained on vast amounts of data, enabling them to generate coherent and contextually relevant responses to a wide range of prompts. The development of these models is a testament to the progress made in artificial intelligence, and their potential applications are vast and varied.
This book will dive into the world of language models, focusing on large language models and their transformative potential. We will explore the inner workings of these models, their capabilities, and the limitations that come with them. By understanding how these models function, we can better appreciate their potential and learn how to use them effectively in various applications.
The book is designed for anyone who wants to learn how to use large language models (LLMs) to build practical applications. The book is suitable for anyone with basic programming skills, and we will not use any third-party frameworks or libraries beyond the OpenAI API. This means that what you will learn is universal to all chatbot APIs, and you can quickly adapt it to any existing framework.
The main goal of this book is to teach you how to use LLMs in practice without diving too deep into the technical details of how they work. We will cover the most essential techniques related to chatbots and LLMs, including standard prompt engineering techniques, several augmentation methods, and fine-tuning.
Throughout the book, we will build a dozen or so applications from scratch, using LLMs and various techniques to ask questions about your documents, extract knowledge from natural text, and create stories automatically. Whatever your business domain or area of interest, from building user-facing chatbots to interact with your SaaS product to creating useful tools for office work or research, I promise you’ll find something helpful in this book.
The book is divided into three parts:
Part 1: Language Models
This section provides a comprehensive understanding of language models, their principles, inherent limitations, and the current state of the art. We will cover the following topics:
Understanding Language Models: We will introduce the concept of language models, their history, and their evolution over time. We will discuss how these models are trained and the basic principles that guide their operation.
Capabilities of Language Models: Here, we will explore the range of tasks that language models can perform, from simple text generation to more complex tasks such as translation, summarization, and question-answering.
Limitations of Language Models: While language models have made significant strides, they have limitations. We will discuss the challenges that remain in developing applications based on large language models, including issues related to bias, explainability, and generalization.
Part 2: LLM Techniques
In this section, we will discuss three families of techniques that can be employed to harness the power of language models in applications:
Prompting Techniques: These techniques involve carefully designing inputs (called prompts) to steer how language models generate responses. Often, the difference between an almost perfect and a mediocre response is in the quality of the prompt. The different prompt techniques we will learn here allow us to transform an otherwise generic LLM into a functional and very specific answer engine.
Augmentation Techniques: Language models can be enhanced by connecting them with other tools and resources, such as knowledge bases, APIs, and coding sandboxes. These techniques enable language models to access external information, improving their ability to generate accurate and contextually relevant responses, and integrating them into existing applications.
Fine-Tuning Techniques: Fine-tuning involves extending the capabilities of language models by directly modifying their weights and/or architecture. This can be done efficiently without requiring the training of models from scratch, enabling the rapid development of more advanced language models or specializing smaller models in domains where even larger models don't work as well.
Part 3: Applications
The final and largest part of the book is dedicated to building applications that leverage large language models. We will create a dozen or so applications, ranging from simple chat-based tools to more complex systems, demonstrating the potential of these models in real-world scenarios. Some of the applications we will build include:
Chatbots: These are typical chatbots that can engage in meaningful conversations with users, provide information, and answer questions in traditional scenarios such as customer service.
Text Analysis Tools: Tools that summarize long texts, extract insights, and answer specific questions while providing references.
Coding Helpers: Coding assistance tools that can generate, modify, explain, and debug code and translate code between different programming languages.
Data Analysis Tools: Tools that can analyze structured data, such as tables, and provide accurate analytics, including predictions and visualizations.
Writing Assistants: Tools that can enhance your writing providing ideas, outlines, and full drafts, as well as editing and criticizing existing text.
Research Assistants: Tools that can search the web for some specific information and build semi-structured reports on a given user question.
Story Generators: Tools that can generate fictional stories complete with characters, dialogs, settings, and plots, in different styles and genres.
And many more...
By the end of this book, readers will have a solid grasp of language models, their capabilities, and the techniques required to build applications that leverage their power. They will also have the knowledge and skills to make informed decisions when choosing frameworks and implementing solutions. And you will also have a dozen prototype projects that you can extend and turn into your own products or showcase in your next coding interview.
As usual, I will publish most of the content as free articles here in this blog. You can also buy early access to the book to get not only extended versions of these chapters as soon as they are available but also full access to the source code of all applications with a permissive license that lets you reuse it for anything you want, so you can start hacking your next unicorn idea right away.
I plan to write the book during the remainder of the year and have the first full version ready in early 2024. Getting early access now means you’ll get each chapter—and code—as it comes, and you can help steer the book's development.
Click the following link to secure your copy at 1/3rd of the final price.
Full disclaimer: There is no practical content available yet, just the preface and around 20 pages of introductory material. Buying the early access now grants you access to all future updates. The first few applications are already in the making and should be ready in a couple of weeks along with full source code.
Paid subscribers of Mostly Harmless Ideas will get a link for a 100% discount on the early access version at the end of this article.
I can't wait for you to start training your own chatbot and building exciting applications with LLMs!
Keep reading with a 7-day free trial
Subscribe to Mostly Harmless Ideas to keep reading this post and get 7 days of free access to the full post archives.