11 Comments

I'm looking for a reinforcement API with an interface. I.e. I code my use case as class implementing the specified interface, and then pass that off to the API. My use case is backgammon variations. Specifically, hunting for variants that highly reward skill.

Expand full comment

Unfortunately I haven't done a lot in RL in the last few years so I'm not sure if there's anything like a scikit-learn for RL with a high level, generic API. All the RL code I've seen in research (mostly related to LLMs these days, not traditional game playing) is based on pytorch and pretty low level.

Expand full comment

Ok, thx. If you come across anything ...

Expand full comment

For sure....

Expand full comment

I usually just ask Copilot, but if you fancy competing: :-)

An LLM has to be trained on a vast amount of data which is then locked in. When I ask it a question and start a thread the context builds as the conversation goes to and fro. Where is the context held?

Expand full comment

Depends on the application, if it's something local like LM Studio, the context is held in the client in-memory, and the whole conversation history is submitted to the LLM every time. If you're using a web-based chatbot, like ChatGPT or anything similar, the entire conversation history is stored in a regular database in the server, and you never directly interact with the model, you always go through an API call that builds the entire conversation, plus often injects a bunch of metadata (custom instructions, user profile info, uploaded documents, etc.) and then all that is sent to the LLM inference API under the curtain.

Expand full comment

Thanks. A better explanation than Copilot gave!

Yes, I was aware it injected extra text (as we all learnt from the African American founding father images that went viral a couple of years ago)

So it's a good idea to tell it when you start a new topic, otherwise it might all run together (not that I've ever noticed it doing so)?

Expand full comment

You should start a new conversation altogether when you start a new topic unless you explicitly want some of the previous conversation to bleed into the next one.

Expand full comment

Sometimes I even ask it to give a structured summary of what we've got so far, and then start a new conversation and just paste that, to kinda clean the conversation of all unrelated paths that were opened but didn't get us anywhere.

Expand full comment

I am starting out with oops and design patterns, would love your take on their relevance and key concepts to master. A challenge that I am particularly facing is that the examples in most of the tutorials are toy example of lets say an animal class but I would like to see how these ideas are implemented in real tangible manner; for this I am going over some data science based open source libraries such as sklearn. Do you think I am in the right direction, would you like to add any other activity or resources that I should consider, thanks a lot.

Expand full comment

Great question! Design patterns don't get anywhere near their deserved attention neither in formal nor informal software engineering education. And yes, as you say, a big part of that is the use of contrived or toy examples. I've seen great use of design patterns in the wild in Python libraries like FastAPI, for example: great use of dependency injection. Scikit-learn is a good example of a neatly designed API although I personally think the whole estimator/transformer abstraction is a subtle violation of single responsibility. You get both training and inference lumped into the same abstraction. At some time I may write about this. So, in terms of advice, just read the source code of great Python libraries (or great libraries in any language).

Expand full comment