More than a year ago I built my own coding agent called Claudine. I also made the agentic anthropic-sdk-kotlin and a few other AI libraries for the ecosystem. Maybe this low-level experience is what lets me use these tools to deliver in 2 days what would have taken 2 months before.
My advice: embrace TDD. Work with AI on tests, not implementation. Your implementation is disposable and can be regenerated, while tests fully specify your system through contracts. This is trickier for UI than for logic. Architectures that let you test the view model in isolation might help. In general, anything that reduces cognitive load at inference time is worth doing.
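To make the view model point concrete, here is a minimal sketch. `CounterViewModel` and `CounterState` are hypothetical names made up for illustration, not taken from any project. The point is only that the view model is plain Kotlin, so its contract can be pinned down by a test that never touches the UI.

```kotlin
// Hypothetical view model: plain Kotlin, no UI framework dependency.
data class CounterState(val count: Int = 0, val canDecrement: Boolean = false)

class CounterViewModel {
    var state = CounterState()
        private set

    fun increment() = update(state.count + 1)

    fun decrement() {
        // Ignore the request when the counter is already at zero.
        if (state.canDecrement) update(state.count - 1)
    }

    private fun update(count: Int) {
        state = CounterState(count = count, canDecrement = count > 0)
    }
}

// The contract, written as a test that needs no UI to run.
fun counterNeverGoesBelowZero() {
    val vm = CounterViewModel()
    vm.decrement()
    check(vm.state.count == 0)
    vm.increment()
    check(vm.state == CounterState(count = 1, canDecrement = true))
    vm.decrement()
    check(vm.state == CounterState(count = 0, canDecrement = false))
}
```

With a test like this in place, the body of `CounterViewModel` can be thrown away and regenerated as often as needed. The test is the part worth reviewing by hand.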
Suboptimal choice. According to AutoCodeBench, for equivalent problem complexity, LLMs generate correct Kotlin code ~70% of the time versus ~40% for Python, and Go scores lower than Python. Kotlin can be executed as a script, with a very fast compilation phase before the evaluation phase, which further reduces the chance of mistakes. I don't use tools anymore. I just let my LLMs output Kotlin script directly, together with DSLs tailored to the problem space, which reduces cognitive load for the machine. It works like a charm as a Claude Code replacement: not only coding autonomously in any language, but also directly scripting DB data science, Playwright, etc., while reducing context window bloat.
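For readers who have not seen a Kotlin DSL, this is roughly the shape I mean. It is a toy sketch, not my actual DSL: `Query`, `from`, and `filter` are hypothetical names invented for this comment, and conditions are raw strings purely to keep the example short.

```kotlin
// Hypothetical toy DSL: a lambda with receiver lets a script read like
// a declarative description of the query instead of a chain of tool calls.
class Query(val table: String) {
    val conditions = mutableListOf<String>()
    var limit: Int? = null

    fun filter(condition: String) {
        conditions += condition
    }

    fun toSql(): String = buildString {
        append("SELECT * FROM ").append(table)
        if (conditions.isNotEmpty()) {
            append(" WHERE ").append(conditions.joinToString(" AND "))
        }
        limit?.let { append(" LIMIT ").append(it) }
    }
}

// Entry point of the DSL: builds a Query and applies the caller's block to it.
fun from(table: String, block: Query.() -> Unit): Query = Query(table).apply(block)
```

A model would then emit something like `from("orders") { filter("total > 100"); limit = 10 }`, and the compiler rejects anything outside the vocabulary the DSL exposes before a single line is evaluated.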
1. LLMs excel at extracting facts from the context. Storing them as subject-predicate-object relationships is "natural" for graph databases. Doing it right, so that this knowledge can be utilized more efficiently than any RAG, requires sophisticated context engineering, for example to avoid duplicates and keep a consistent vocabulary for relationships. But it is totally achievable, and the quality of automatically extracted knowledge can be almost spotless, especially if an LLM can also decide to generate parallel embeddings as a semantic search entry point for graph traversal.
2. Writing Cypher queries is a job I would never like to have as a human. But LLMs love it, so an agent can do ad hoc data science for every single problem, especially when it is aware of which criteria were used for graph construction. It is worth ditching things like MCP in favor of tool-graph-like solutions. For this purpose I developed my own DSL, which only the LLM speaks internally. The effects are mind-blowing.
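A rough in-memory sketch of the "avoid duplicates, keep vocabulary consistent" part from point 1. Everything here is an assumption for illustration: `Fact`, `FactStore`, and the alias map are hypothetical, and a real setup would write to a graph database instead of a set.

```kotlin
// Hypothetical types for illustration only.
data class Fact(val subject: String, val predicate: String, val obj: String)

class FactStore(private val predicateAliases: Map<String, String>) {
    private val facts = linkedSetOf<Fact>()

    val size: Int get() = facts.size

    // Normalizes the predicate to one canonical form before storing.
    // Returns true if the fact was new, false if it was a duplicate.
    fun add(subject: String, predicate: String, obj: String): Boolean {
        val key = predicate.trim().uppercase().replace(' ', '_')
        val canonical = predicateAliases[key] ?: key
        return facts.add(Fact(subject.trim(), canonical, obj.trim()))
    }

    // All stored facts about a given subject, in insertion order.
    fun about(subject: String): List<Fact> = facts.filter { it.subject == subject }
}
```

With `FactStore(mapOf("EMPLOYED_BY" to "WORKS_AT"))`, adding ("Ada", "works at", "Acme") and then ("Ada", "employed by", "Acme") stores a single `WORKS_AT` fact. In practice the alias table is the hard part, and that is where the context engineering goes.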
I would support every political and social movement that moves us along the spectrum from patriarchy to matriarchy. In particular, I would put pressure on the legal systems of countries where women are still not allowed to inherit land and property.
The next biggest problems to tackle:
- the way we are producing proteins
- the way we are producing energy
Short-term problems to address:
- adoption of cognitive AI in scientific research
I am building very potent autonomous AI agents now, so soon I will be able to unleash them to crunch all these problems, hopefully. :)
I am the author of Claudine, who presented the project during this meetup, and I am happy that it is inspiring others. Claudine is intended for educational purposes, so that anyone can easily build a powerful (e.g. Unix-omnipotent) autonomous agent. It is possible thanks to my work on:
Pretty neat how compact it is. I'm trying to poke around to see what the capabilities are, but, more importantly, I'm interested in the restrictions. In `ExecuteShellCommand` it seems that it's basically unrestricted. I think I want at least a naive safeguard like a whitelist of directories that it can act on.
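Something like the following is what I have in mind for the naive safeguard. It is a sketch under assumptions: `DirectoryAllowlist` is a hypothetical class, not part of Claudine, and it only checks the working directory a command would run in. It does not inspect the command itself, and `normalize()` does not resolve symlinks.

```kotlin
import java.nio.file.Path
import java.nio.file.Paths

// Hypothetical guard, not part of Claudine.
class DirectoryAllowlist(allowedRoots: List<String>) {
    private val roots: List<Path> =
        allowedRoots.map { Paths.get(it).toAbsolutePath().normalize() }

    // True only if workingDir, after collapsing "." and "..", lies at or
    // below one of the allowed roots. Path.startsWith compares whole path
    // components, so "/tmp/workspace" is not treated as inside "/tmp/work".
    fun permits(workingDir: String): Boolean {
        val candidate = Paths.get(workingDir).toAbsolutePath().normalize()
        return roots.any { candidate.startsWith(it) }
    }
}
```

The idea would be to call `permits(...)` before handing anything to the shell and refuse otherwise. A stricter version could use `toRealPath()` to resolve symlinks, at the cost of requiring the directory to exist.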