Hacker News

Be aware the haystack test is not good at all (in its current form). It's a single piece of information inserted in the same text each time, a very poor measurement of how well the LLM can retrieve info.


Seems like a very good test for recall.


Even under the most restrictive definition of recall, "retrieve a short contiguous piece of information inside an unrelated context", it's not that good. It's always the exact same needle inserted into the exact same context, with no variation apart from the location of the needle.

Then if you want to test for recall of sparse information or multi-hop information, it's useless.
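A less degenerate version of the test is easy to sketch. The generator below (a hypothetical illustration, not any published benchmark's code) varies both the filler text and the needle's depth, instead of reusing one fixed needle in one fixed document:

```python
# Sketch of a needle-in-a-haystack generator that varies the filler
# and the insertion depth, rather than reusing a single fixed document.
import random

def make_haystack(filler_sentences, needle, depth_fraction, n_sentences=100, seed=None):
    """Build a long context from randomly sampled filler sentences,
    with `needle` inserted roughly `depth_fraction` of the way through.
    Returns the text and the needle's sentence index."""
    rng = random.Random(seed)
    body = [rng.choice(filler_sentences) for _ in range(n_sentences)]
    pos = int(depth_fraction * len(body))
    body.insert(pos, needle)
    return " ".join(body), pos
```

Sweeping `depth_fraction` over many random seeds, and swapping in different needles, gives a much harder target for a model to pattern-match than a single memorized insertion. Testing sparse or multi-hop recall would still need a different design entirely.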


For my education, how do you use the 200k context? Normal chat interfaces like Poe or ChatGPT don't accept more than about 4k. Do you use them in specific playgrounds or other places?


Calls with long context are made through the providers' APIs, which you can invoke from, for instance, Python or JavaScript.

Here's a quick start guide with OpenAI: https://platform.openai.com/docs/quickstart?context=python
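As a rough sketch of what such a call looks like with the OpenAI Python SDK (the model name here is an assumption; check the docs for current long-context models):

```python
# Sketch of sending a long document to a long-context model via the
# OpenAI Python SDK. Model name is an assumption; see the quickstart docs.
def build_messages(document: str, question: str) -> list:
    """Pack a long document plus a question into a chat-style request body."""
    return [
        {"role": "system", "content": "Answer using only the provided document."},
        {"role": "user", "content": document + "\n\nQuestion: " + question},
    ]

def ask(document: str, question: str) -> str:
    # Imported lazily so the sketch can be read/run without the SDK installed.
    from openai import OpenAI
    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    resp = client.chat.completions.create(
        model="gpt-4-turbo",  # assumed long-context model name
        messages=build_messages(document, question),
    )
    return resp.choices[0].message.content
```

The whole document goes into the `messages` payload; the only hard limit is the model's context window, not the 4k or so that chat front-ends impose.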



