
Exactly!

We are talking about calling an API here, people. What's behind the API may seem magical and powerful, but it's still just an API that takes some context, the number of tokens to generate, and a temperature setting.
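
To be concrete, the whole interface is roughly this (a sketch using OpenAI's Python SDK; the model name is just an example):

    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    response = client.chat.completions.create(
        model="gpt-4o-mini",                              # example model name
        messages=[{"role": "user", "content": "Hello"}],  # the context
        max_tokens=100,                                   # tokens to generate
        temperature=0.7,                                  # the temperature setting
    )
    print(response.choices[0].message.content)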



Part of that diagram is about fine-tuning / retraining. Part is about managing canned prompts, since those matter enough to have their own development cycle. Part is about caching, since the fancier models are very expensive. Part is about filtering the output to not upset people, which is built into the hosted versions (currently called "AI safety"). Etc.
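
For the canned-prompts piece, "their own development cycle" can be as simple as treating templates as versioned assets, so a prompt tweak can be rolled out and rolled back like a code change (a made-up sketch; all names here are hypothetical):

    # Hypothetical sketch: prompt templates as versioned, reviewable assets.
    PROMPTS = {
        ("summarize", "v2"): "Summarize the text below in {n_sentences} sentences:\n\n{text}",
        ("summarize", "v1"): "Please summarize:\n\n{text}",
    }

    def render_prompt(name: str, version: str, **kwargs) -> str:
        return PROMPTS[(name, version)].format(**kwargs)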

Doing everything in that diagram is probably overkill for most uses. But using it as a starting point and trimming what you don't need helps you avoid tripping over "oops, I forgot to include that".


Makes sense.

But caching, for instance, doesn't need to be its own lib, does it? I don't want "semantic caching"; I want to cache the exact same query, and I can do that without anything LLM-specific:

    from joblib import Memory

    # Disk-backed cache keyed on the exact arguments.
    memory = Memory("./llm_cache", verbose=0)

    @memory.cache
    def call_chat_completion_api_cached(max_tokens, messages, temperature):
        ...
I mean, I guess then I might want to store that somewhere central like Redis, and maybe slowly I'd need a specific cache tool. So I get your point. It's helpful to see the possibilities for approaching these problems.
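
Roughly like this, if it came to that (an untested sketch; call_api stands in for whatever function actually hits the model):

    import hashlib
    import json

    import redis

    r = redis.Redis()  # assumes a Redis server on localhost:6379

    def cached_chat_completion(call_api, max_tokens, messages, temperature, ttl=3600):
        # Exact-match cache: key on a hash of the full request payload.
        payload = json.dumps(
            {"max_tokens": max_tokens, "messages": messages, "temperature": temperature},
            sort_keys=True,
        )
        key = "llm:" + hashlib.sha256(payload.encode()).hexdigest()
        hit = r.get(key)
        if hit is not None:
            return json.loads(hit)
        result = call_api(max_tokens=max_tokens, messages=messages, temperature=temperature)
        r.set(key, json.dumps(result), ex=ttl)  # expire after ttl seconds
        return result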

But it also does feel like a land grab of supporting libs and infrastructure.



