Hands-on is the only test that proves it
Last Thursday I finished building Phase 4 of NORD v2. Three hermetic test suites came back green. A hundred and sixty-three checks total, all passing. By the test runner's reckoning, the phase was done.
It is not.
What the hermetic tests prove is that the unit under test does the thing it was built to do, in isolation, against fixtures the test process controls. What they do not prove is that the unit does the thing it was built to do against infrastructure the test process does not control. The real authentication provider. The real network path. The real database evaluating the actual security rules under a real authenticated user.
For Phase 4 those are not minor details. The whole point of the phase was multi-tenant isolation. Agents in one organization cannot see or write to another organization's data, and that is enforced at the database level through row-level security tied to the identity in the user's JWT. The hermetic tests confirm the policies are written correctly. They confirm the migrations apply cleanly. They do not confirm that the production-equivalent auth provider issues the JWT in the shape the policies expect.
That last sentence is the gap.
To close the gap I wrote a second test harness. Seven assertions, all run against a real Supabase project, authenticated as a real user, hitting the real database endpoint with the anonymous public key. Never the service key. The service key would bypass the very policies the harness is trying to verify. The point is to make the policies do their job against an unprivileged caller, exactly the way they will in production.
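One way to make the "never the service key" rule mechanical is a guard that runs before any assertion does. The sketch below is not the harness's actual code. It assumes the project uses Supabase's legacy JWT-format API keys, whose payload carries a role claim of anon or service_role; newer key formats would need a different check. The function names key_role and require_unprivileged_key are hypothetical.

```python
import base64
import json


def key_role(api_key: str) -> str:
    """Return the `role` claim of a JWT-format API key (signature not verified)."""
    parts = api_key.split(".")
    if len(parts) != 3:
        raise ValueError("key is not in three-part JWT format")
    payload = parts[1]
    # base64url payloads drop their padding; restore it before decoding.
    payload += "=" * (-len(payload) % 4)
    claims = json.loads(base64.urlsafe_b64decode(payload))
    return claims.get("role", "")


def require_unprivileged_key(api_key: str) -> str:
    """Refuse to start the harness with anything other than an anon key."""
    role = key_role(api_key)
    if role != "anon":
        raise RuntimeError(f"harness needs the anon key, got role {role!r}")
    return api_key
```

Calling require_unprivileged_key on the configured key at startup means a mis-set environment variable stops the run instead of quietly producing seven green results that bypassed every policy.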
The seven assertions are simple. Member can see their own organization's data. Member cannot see another organization's data. Member cannot write to another organization's data. Admin can see their organization's data. Admin cannot see another organization's data. The fail-safe default rejects an unknown role. The policies honor the custom JWT claim rather than falling back to the default database role.
The first six are the security model. The seventh is the proof that the auth hook is wired correctly.
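The first six assertions fit naturally into a table of expected outcomes. The sketch below is illustrative rather than the harness itself. Case, CASES, run and the probe callable are all hypothetical names. In a live-fire run, probe would sign in as a real user holding the given role and issue the request with the anon key; here it is left as a parameter. The seventh assertion is omitted because how it is checked depends on the auth hook setup, which this post does not spell out.

```python
from dataclasses import dataclass
from typing import Callable, List

# probe(role, target, action) -> True if the caller got data back or the
# write succeeded, False if the database returned nothing or refused.
Probe = Callable[[str, str, str], bool]


@dataclass(frozen=True)
class Case:
    name: str
    role: str      # role carried in the caller's custom JWT claim
    target: str    # "own" or "other" organization
    action: str    # "read" or "write"
    allowed: bool  # outcome the security model expects


CASES = [
    Case("member reads own org", "member", "own", "read", True),
    Case("member cannot read other org", "member", "other", "read", False),
    Case("member cannot write other org", "member", "other", "write", False),
    Case("admin reads own org", "admin", "own", "read", True),
    Case("admin cannot read other org", "admin", "other", "read", False),
    Case("unknown role is rejected", "unknown", "own", "read", False),
]


def run(cases: List[Case], probe: Probe) -> List[str]:
    """Return the names of cases whose observed outcome differs from expected."""
    return [c.name for c in cases if probe(c.role, c.target, c.action) != c.allowed]
```

An empty list from run means every case behaved as the model expects. Keeping the expectations as data also lets the same table drive both a hermetic fake and the live project, so the two suites cannot drift apart on what "isolated" means.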
Hermetic tests prove the unit. Live-fire tests prove the integration with infrastructure the test process does not own. The two are necessary in sequence. Hermetic catches the bugs in the code I shipped. Live-fire catches the bugs that emerge from the seams between systems, where a misconfigured auth hook or a missing environment variable can take a green build to a real-world failure without changing a single line of code.
If the only test that matters is the one I can run on my laptop, the test I cannot run on my laptop is the one I am going to ship a bug through.
Phase 4 is hermetically complete. The live-fire harness is written and dev-tested. The phase will not be production-shipped until the harness runs green against the live Supabase project. That is not extra caution. That is what shipped actually means.