For a long time I was curious whether Claude Code works so well because of Claude (the model) or Code (the CLI tool / agent). This weekend, I tried to find out. It turns out that both matter, but what matters most is post-training fine-tuning of the model. If the model has been tuned for planning and tool usage in a particular way, it gives much more reliable results.
Details are in https://nevkontakte.com/2025/swap-ai-brains.html