AI Model Reference
Each skill in the Playbook is mapped to a model maturity level — L1 through L5 — that describes the kind of AI collaboration the skill requires. This page maps those levels to specific models so you can choose the right tool for the work.
Capability levels reflect what a model can do in a sustained workflow, not just its benchmark scores. Free-tier availability is noted where relevant, because cross-vendor adversarial review is easier when it costs nothing.
Reasoning
Works through multi-step problems with guidance. Can analyse, summarise, and draft when given clear direction at each step.
Fast, capable, and free. A strong choice for light drafting tasks and adversarial review cycles where cost matters.
Cost-effective with solid instruction-following. Well-suited for research synthesis and lighter review cycles.
Agentic
Executes structured workflows with human oversight. Handles drafting, structuring, and critique across a full session with minimal per-step instruction.
The balanced choice for sustained creative and analytical work. Strong instruction-following, long context, and natural voice.
Versatile and widely deployed. Effective across full workflow cycles; strong for adversarial review from a different vendor.
Exceptional long-context handling with a generous free tier. A strong choice for adversarial review of longer pieces.
Highly capable, with API credits available. Comes from an independent model family, which makes it valuable for cross-vendor adversarial review.
European-hosted with a free tier. Strong reasoning and a distinct model family — useful when data residency matters or for cross-vendor review.
Autonomous
Plans and executes complex multi-step tasks with minimal human intervention. Extended reasoning, self-correction, and sustained autonomy over long tasks.
Extended reasoning with strong editorial judgment. Appropriate for complex, high-stakes content where depth and nuance matter most.
Strong extended reasoning. Well-suited to adversarial review requiring structured critique and logical depth.
Open-weight reasoning model with chain-of-thought. A strong open-weight alternative at the extended-reasoning tier.
Google's most capable model for complex, multi-step tasks requiring sustained reasoning and broad knowledge synthesis.
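The three tiers on this page can also be treated as an ordered scale when checking whether a model is enough for a given skill. The sketch below is only an illustration: the names `Tier`, `TIER_SUMMARY`, and `meets_requirement` are hypothetical, the integer values encode ordering only (they are not the Playbook's L1 to L5 numbers, which this page does not assign per tier), and it assumes that a higher tier can cover work that needs a lower one.

```python
from enum import IntEnum


class Tier(IntEnum):
    """Capability tiers in the order this page lists them.

    The integer values only encode ordering. They are NOT the Playbook's
    L1-L5 numbers; that mapping is not given on this page.
    """
    REASONING = 1
    AGENTIC = 2
    AUTONOMOUS = 3


# One-line summaries paraphrased from the tier descriptions on this page.
TIER_SUMMARY = {
    Tier.REASONING: "multi-step problems with guidance at each step",
    Tier.AGENTIC: "structured workflows with human oversight",
    Tier.AUTONOMOUS: "complex multi-step tasks with minimal human intervention",
}


def meets_requirement(model_tier: Tier, required_tier: Tier) -> bool:
    """Return True if a model at model_tier can cover a skill needing required_tier.

    Assumption (not stated in the source): tiers are cumulative, so a
    higher tier is also suitable for lower-tier work.
    """
    return model_tier >= required_tier
```

For example, `meets_requirement(Tier.AGENTIC, Tier.REASONING)` is true under the cumulative assumption, while `meets_requirement(Tier.REASONING, Tier.AGENTIC)` is false. Cost and vendor diversity, which this page also weighs, are deliberately left out of the sketch.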