Stabilization and Test Ground | IdeAs — Augustin B
Chapter 7

Stabilization and Test Ground

Consolidating practices and stress-testing them on real projects.

Consolidation phase (summer to late 2025)

Among the projects I was running at the time, three show how I apply the frame (context, rules, tools) in day-to-day work.

This period settled my posture: frame, delegate, verify. I started to feel like a kind of conductor.


A full-stack application as an architecture testbed

This web app is my test ground. I wanted a frame where an AI could intervene without breaking everything.

I focused on three simple moves: make architecture explicit, define contracts between modules, and strengthen tests (not only “does it work?”, but “does it respect the architecture?”).
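The "does it respect the architecture?" kind of test can be as simple as a boundary check in CI. A minimal sketch, assuming hypothetical layer directories (`core/` for business logic, `ui/` for the interface) — the names and the import pattern are illustrative, not the project's actual layout:

```shell
# Hypothetical architecture check: the business-logic layer must never
# import from the UI layer. Directory names and the import pattern
# ("../ui/") are assumptions for illustration.
check_layer_boundary() {
  layer_dir=$1    # directory that must stay independent (e.g. core/)
  forbidden=$2    # pattern it must never contain (e.g. an import path)
  violations=$(grep -rln "$forbidden" "$layer_dir" 2>/dev/null || true)
  if [ -n "$violations" ]; then
    echo "architecture violation in: $violations"
    return 1
  fi
  echo "architecture check passed for $layer_dir"
}

# Usage (in CI): check_layer_boundary core "\.\./ui/"
```

Wired into CI, a check like this makes an AI-proposed change that crosses a layer boundary fail fast, before any human review.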

To help the AI, I constrained context. Sometimes I give one layer (all business logic), sometimes one complete feature from front-end to back-end. That keeps context small and avoids collateral damage.

I was not trying to make things “clean” for perfectionism. I wanted a setup where AI can propose changes, where CI and tests act as filters, and where architecture quickly tells us when a proposal goes off the rails.

This project became my architecture lab. It is where I checked whether my ideas on context, rules, and quality hold up in a real codebase.


Dotfiles and global configuration: from magic to rules

Alongside the web app and orchestration work, I reviewed everything that drives my AIs day to day: my system configuration (dotfiles such as .bashrc or .gitconfig) and my global agent configuration at system level.

At first, it was a stack of hooks and scripts. It worked while I was the only operator, but it was fragile to maintain. I switched to presets: ready-to-use configs that bundle tools, permissions, and context by task (MCP, sandbox, permissions, and so on). This lets me launch an agent with the right access level without manually reconfiguring each session.
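The preset idea can be sketched as a small selector: one name maps to one ready-to-use config bundling tools, permissions, and context. The preset names, config paths, and the commented-out `agent` command are assumptions for illustration, not a real tool's interface:

```shell
# Hypothetical preset selector for agent launches. Preset names and the
# config directory layout are assumptions.
select_preset() {
  case "${1:-default}" in
    review)  echo "$HOME/.config/agents/presets/review.json" ;;
    backend) echo "$HOME/.config/agents/presets/backend.json" ;;
    sandbox) echo "$HOME/.config/agents/presets/sandbox.json" ;;
    *)       echo "$HOME/.config/agents/presets/default.json" ;;
  esac
}

# Launch with the right access level, no manual reconfiguration.
config=$(select_preset review)
echo "launching with $config"
# agent --config "$config"   # the actual launch command is tool-specific
```

The point of the design is that access level is chosen once, by name, instead of being re-assembled from hooks at every session.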

These experiments also exposed MCP limits. To measure that objectively, I built a small plugin to track token cost when opening a new session (or sub-agent). Result: 15-20% of the budget was gone from the start, and it was not the system prompt. Most of the weight came from stacked MCPs, around 36,000 tokens out of 200,000.
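The measurement itself does not need to be sophisticated. A rough sketch of the idea, using the common ~4 characters per token heuristic and an assumed manifest directory (the real plugin and paths differ):

```shell
# Hypothetical estimate of the fixed token cost paid at session start:
# sum the size of every MCP manifest that would be loaded. The ~4
# chars/token ratio is a rough heuristic; the directory is an assumption.
estimate_tokens() {
  chars=$(wc -c < "$1")
  echo $(( chars / 4 ))
}

total=0
for manifest in "${MCP_DIR:-$HOME/.config/agents/mcp}"/*.json; do
  [ -e "$manifest" ] || continue   # skip when the directory is empty
  total=$(( total + $(estimate_tokens "$manifest") ))
done
echo "estimated MCP footprint: $total tokens"
```

Even a crude number like this is enough to notice that tens of thousands of tokens disappear before the first prompt.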

I moved to on-demand loading: only the MCPs needed for the task. My launchers pick predefined profiles depending on the need.

I kept advanced menus optional and fixed a few UX details to remove friction.
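On-demand loading boils down to a profile-to-MCP mapping. A minimal sketch, where the profile names and server names are illustrative assumptions:

```shell
# Hypothetical profile-to-MCP mapping: each profile loads only the MCP
# servers the task needs, instead of stacking everything at startup.
# Profile and server names are made up for illustration.
mcps_for_profile() {
  case "$1" in
    web)  echo "fetch browser" ;;
    code) echo "filesystem git" ;;
    docs) echo "fetch filesystem" ;;
    *)    echo "" ;;  # lean default: no MCP preloaded
  esac
}

echo "code profile loads: $(mcps_for_profile code)"
```

The default branch is deliberately empty: an agent that needs nothing should pay for nothing.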

I also set a shared rule base for all agents at the system level: the global baseline that applies everywhere.

Important: this cleanup does not replace project context files (AI.md). It complements them at system level. It is the base projects rely on.

The direction stayed the same: avoid stacking logic that is hard to maintain, and prefer simple, visible, documented rules that clearly define what is out of bounds.

I also formalized a few collaboration rules: check docs before inventing an API, do not guess when information is missing, avoid “magic” estimates, stay focused, and mark the rest with TODO/FIXME instead of fixing everything along the way.
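A system-level rule base like this can live in a single file that every agent reads. A sketch, assuming a hypothetical path and file name (the rules themselves are the ones listed above):

```shell
# Hypothetical system-level rule base shared by every agent; the path
# and file name are assumptions. Project AI.md files complement this,
# they do not replace it.
rules_dir="${XDG_CONFIG_HOME:-$HOME/.config}/agents"
mkdir -p "$rules_dir"
cat > "$rules_dir/RULES.md" <<'EOF'
- Check the docs before inventing an API.
- Do not guess when information is missing; say so.
- No "magic" estimates.
- Stay focused: mark out-of-scope issues with TODO/FIXME instead of
  fixing everything along the way.
EOF
echo "wrote $rules_dir/RULES.md"
```

Keeping the rules in one visible, versioned file is what makes them auditable, which is the whole point of moving away from "magic" hooks.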

These are not just style preferences. They are guardrails against overconfident hallucinations, scope creep (AI trying to fix everything at once), and noise (comments or metrics that create an illusion of control).