This post is for beginners: for fresh graduates and junior developers who are just starting to think about performance. These are the things I wish someone had told me early in my career. Consider it a crash course in the fundamentals — practical, opinionated, and grounded in real-world software.
"When I was a child in Sri Lanka, I ended up memorizing the landline numbers of all my close relatives. To this day I remember them. The moment I got a phone where my contacts could be saved, I stopped remembering numbers."
If we look at it rationally, phone numbers were an extra layer of complexity introduced by technology. Smartphones solved that problem, and rightly so: you just have to remember the person's name.
> If we look at it rationally, the phone numbers were an extra complexity layer introduced by technology
I don't think that's any more rational than suggesting smartphones supplanted memory training; the phone number is an implementation detail, while the decline in practicing memorization of important information is the general case. Smartphones created a dependency on themselves and solved problems that mostly weren't problems, or were tertiary optimization problems at best, while phone technology actually solved a fundamental problem: high-latency communication via mail.
If I ask myself whether I'm generally better off having my contact numbers in my smartphone versus before — itself a fictitious premise, since mobile phones stored contacts before they got smart — the answer is definitely "no". The people I call aren't so varied that memorizing their numbers would be difficult, but my lack of inclination to do even that means I don't remember the most common ones and always need the phone, or I'm screwed.
It's hilarious to watch this play out with drivers who are entirely dependent on a Maps app for directions in their own city. They don't remember basic routes or address blocks, and sometimes can't navigate without the phone speaking directions to them. It wasn't really a problem before; you'd figure it out most of the time, or ask someone for an approximation.
I would use Deep Research mode outputs. Sometimes I run several of these in parallel on different models, then compare the results to catch hallucinations. If I wanted to publish the output, I would also double-check each citation link.
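The cross-checking step can be sketched roughly like this. This is a minimal sketch under stated assumptions: `query_model` and the model names are hypothetical stand-ins for whatever API you actually call (here stubbed with canned answers so the skeleton runs on its own), and the comparison is a naive exact-string match, not real semantic comparison.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for a real research-mode API call;
# swap in your actual client. The canned answers only exist
# so this sketch runs without network access.
def query_model(model: str, prompt: str) -> str:
    canned = {
        "model-a": "The Eiffel Tower was completed in 1889.",
        "model-b": "The Eiffel Tower was completed in 1889.",
        "model-c": "The Eiffel Tower was completed in 1887.",  # divergent claim
    }
    return canned[model]

def cross_check(models: list[str], prompt: str) -> dict[str, str]:
    # Run the same prompt against every model in parallel.
    with ThreadPoolExecutor() as pool:
        answers = pool.map(lambda m: query_model(m, prompt), models)
        return dict(zip(models, answers))

def needs_review(answers: dict[str, str]) -> bool:
    # A claim all models agree on is less likely to be hallucinated;
    # any disagreement is a signal to verify the claim by hand.
    return len(set(answers.values())) > 1

answers = cross_check(
    ["model-a", "model-b", "model-c"],
    "When was the Eiffel Tower completed?",
)
print(needs_review(answers))  # True: one model disagrees, so check manually
```

In practice you would compare claims rather than whole strings, but the shape is the same: fan out the identical prompt, then treat disagreement as a review queue.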
I think the idea is sound; the potential is a much larger AI-written Wikipedia than the human one. Could it cover all known entities, events, concepts, and places? All scientific publications? It could grow 1000x larger than Wikipedia and be a good pre-training source of text.
When covering a topic, I would not make the AI agent try to find the "Truth", but just analyze the distribution of information out there: what are the opinions, and who holds them? I would also test a host of models in closed-book mode and include an analysis of how AI covers the topic on its own; that is useful information to have.
This method has the potential to create much higher-quality text than the usual internet scrape, and in large quantities. It would be comparative-analysis text connecting many sources, which would be better for the model than training on separate pieces of text. Information needs to circulate to be understood better.