On LLMs, Technical Writing, and Creativity
A free-to-watch webcast with Lance Cummings from Cyborgs Writing.
Hey!
Yesterday, I had a delightful conversation with Lance Cummings from Cyborgs Writing on the general topic of language models and creativity. We went from a somewhat technical explanation of the most essential sampling parameters in LLMs (temperature and top-p) to practical advice on leveraging these tools to control the randomness or variability of the output, as well as more philosophical discussions on the nature of creativity and whether machines can ever be truly creative.

You can watch the whole 1-hour conversation here. But if you prefer a TL;DR, here are the main takeaways.
TL;DR:
The two most important parameters for controlling how an LLM produces text (other than the actual prompt) are temperature and top-p. Both are exposed by almost every LLM API provider, including OpenAI and its competitors.
In short, temperature is a scaling factor applied to the model's scores (logits) before they are turned into probabilities, which controls the shape of the next-token distribution. Higher temperature values flatten the probabilities, making the model more likely to pick from a wider range of tokens, and thus more random. Lower values concentrate probability on the most likely continuations, reducing randomness.
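To make the flattening effect concrete, here is a minimal sketch of temperature scaling with a softmax over some made-up token scores (the logit values are invented for illustration; real models produce tens of thousands of them):

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
    """Divide logits by the temperature, then apply a numerically stable softmax."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max to avoid overflow in exp()
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Toy next-token scores, highest first.
logits = [2.0, 1.0, 0.5, -1.0]

low = softmax_with_temperature(logits, temperature=0.2)   # sharper: top token dominates
high = softmax_with_temperature(logits, temperature=2.0)  # flatter: mass spreads out
```

With a low temperature the top token absorbs almost all the probability mass, while a high temperature leaves the less likely tokens with a real chance of being sampled.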
On the other hand, the top-p parameter cuts off the distribution once the most probable tokens accumulate a probability mass of p. The lower this value, the fewer tokens are available for the model to pick as a continuation. Here is a simple illustration of this explanation.
They seem similar, but they have different effects. Temperature is a soft control: it reshapes the distribution so the model draws from a wider or narrower range of tokens, but every token keeps some chance of being selected. In contrast, top-p is a hard cutoff: it restricts the selection to just a handful of tokens, so the least probable continuations can never be picked.
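The hard-cutoff behavior of top-p can be sketched in a few lines (the probabilities below are made up for illustration):

```python
def top_p_filter(probs, p=0.9):
    """Return the indices of the smallest set of tokens whose
    cumulative probability reaches at least p (nucleus sampling)."""
    ranked = sorted(enumerate(probs), key=lambda kv: kv[1], reverse=True)
    kept, total = [], 0.0
    for idx, prob in ranked:
        kept.append(idx)
        total += prob
        if total >= p:
            break  # everything after this point is cut off entirely
    return kept

# Toy distribution over four tokens, already sorted by probability.
probs = [0.4, 0.3, 0.2, 0.1]

narrow = top_p_filter(probs, p=0.65)  # only the two most likely tokens survive
wide = top_p_filter(probs, p=0.85)    # a third token makes the cut
```

Unlike temperature, which merely reweights the distribution, tokens outside the kept set here have exactly zero chance of being sampled.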
Thus, you can increase the temperature to make the model more random and decrease top-p at the same time to avoid it going too far off the rails.
Playing around with these two parameters is an excellent exercise to help you grok how LLMs actually “think” (spoiler alert, they don’t, at least not in any human-like way). I actually made a demo, and you can try it out to see this effect in action.
Finally, one of the best ways to leverage this randomness in practice to produce novel but quality ideas or solutions to problems is to use a diverge-converge workflow. You take any LLM, crank the temperature up to make it go wild, and generate 10 solutions to whatever problem you want. Then, take those 10 and make a new prompt asking the model—now with a sensible temperature—to evaluate them and pick the best ones.
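The diverge-converge workflow is easy to script against any chat API. Here is a hedged sketch where `ask_llm` is a hypothetical stand-in (it just returns a canned string); in practice you would replace its body with a call to your provider's completion endpoint:

```python
def ask_llm(prompt, temperature):
    """Hypothetical stand-in for a real LLM API call.
    Swap this body for your provider's chat-completion client."""
    return f"[reply at T={temperature}] {prompt[:40]}..."

def diverge_converge(problem, n=10):
    # Diverge: crank the temperature up and generate n varied candidates.
    candidates = [
        ask_llm(f"Propose a solution to: {problem}", temperature=1.5)
        for _ in range(n)
    ]
    # Converge: ask again at a sensible temperature to evaluate and pick.
    listing = "\n".join(f"{i + 1}. {c}" for i, c in enumerate(candidates))
    return ask_llm(
        f"Evaluate these solutions and pick the best one:\n{listing}",
        temperature=0.2,
    )
```

The point of the structure is simply that randomness is useful during generation and harmful during evaluation, so each phase gets its own temperature.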
But this is just scratching the surface of what Lance and I discussed. Please check the full video (it’s free) and leave any comments or questions below.