6 Comments
Terry Underwood

This work is so rich it's hard to know where to start! You've just blown up my mental model of LLMs and forced me to rethink everything, Alejandro. Thanks!! As your three-year-old daughter taught us all, we can always build new castles after a wave washes one away :) Nice note. So is a few thousand tokens the limit for a sustained chat?

Alejandro Piad Morffis

Thanks Terry, you're always too kind ;) Technically you can sustain a continuous chat for close to 100 thousand tokens with most frontier models. But in large chats the nuance gets lost the more you talk, because models are very good at needle-in-a-haystack retrieval (pinpointing one specific claim in a large past context) but very bad at forming a general overview of a large context, which is almost always what you want in long conversations.

John Dietz

Wow, I’m doing something very similar!

Alejandro Piad Morffis

Tell me more!

John Dietz

I’m more on the filmmaking side of things, but techie enough to get into trouble, haha.

The pipeline I'm building is mainly different from yours because I'm relying on fine-tuning the LLM models for the characters, so each one has its own character knowledge and experience baked in.

I also have a story-world LLM model and a storytelling LLM model. The story-world model holds the rules and logic of the world; the storytelling model is more about the knowledge and rules of storytelling itself.

There is also a traffic-cop agent that delegates to all these LLM models as their own agents, including a human who pushes and prods as the story needs it.

The idea is that the characters reflect before they act or talk (from that agent whitepaper), so the traffic cop can take character reflections and prod the story-world model, the storytelling LLM, and the human user if the character LLMs need some nudging or some sort of context that may make for a better storytelling interaction.

The traffic cop sends any context back to the character LLMs before they act or talk. All these back-and-forths are queued, and run way, way slower than real time with the human involved.

A fundamental idea is that long-term memory and goals live in the fine-tuned models, RAG is for medium-term memory, and the context window is for short-term memory. The traffic cop and the human decide when actions or dialogue move from short-term memory into RAG, and I'll set up some scheduling system that uses the RAG store to update the fine-tunes.

This whole process is set up for storytelling purposes, as a tool to help the user write better, not really anything scientific to study character behavior in AI.

I’m also busy with image training for design of characters and world, so it’s all going bit by bit.
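(Editor's note: a minimal sketch of the orchestration loop described above, with all names hypothetical and the LLM calls stubbed out. It shows the traffic-cop router collecting character reflections and the three memory tiers John mentions: context window for short-term, a RAG store for medium-term, and a fine-tune queue for long-term, with promotion between tiers as an explicit decision.)

```python
from dataclasses import dataclass, field

@dataclass
class CharacterAgent:
    """Stand-in for a fine-tuned character model with tiered memory."""
    name: str
    context_window: list = field(default_factory=list)  # short-term memory
    rag_store: list = field(default_factory=list)       # medium-term memory
    finetune_queue: list = field(default_factory=list)  # long-term (batched)

    def reflect(self, event: str) -> str:
        # A real system would call the character's fine-tuned LLM here.
        return f"{self.name} considers: {event}"

class TrafficCop:
    """Router that delegates events to agents and manages memory promotion."""
    def __init__(self, agents: list):
        self.agents = agents

    def step(self, event: str) -> list:
        # Collect each character's reflection before any action or dialogue,
        # recording it in that character's short-term context window.
        reflections = []
        for agent in self.agents:
            thought = agent.reflect(event)
            agent.context_window.append(thought)
            reflections.append(thought)
        return reflections

    def promote(self, agent: CharacterAgent) -> None:
        # The traffic cop (or human) decides when short-term context
        # moves into the RAG store; a scheduled job would later drain
        # the RAG store into the fine-tune queue.
        agent.rag_store.extend(agent.context_window)
        agent.context_window.clear()
```

The queued, slower-than-real-time back-and-forth falls out naturally here: each `step` is a synchronous round of reflections that the router and human can inspect before the next one.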

Dustin Mattison

I have been building a teaching app using an AI assistant. I don’t know how to code myself. How accessible is your research to someone like me who is trying to build by relying on AI coding assistants?
