Smart Contract Testing and Proofs-of-Concept
Tests are not just an artifact developers ship — they are an audit deliverable on both sides of the engagement. Going in, a comprehensive test suite is one of the strongest signals of a mature codebase, and it gives auditors a working harness to extend. Coming out, every meaningful finding should be accompanied by a runnable proof-of-concept: a test that fails before the fix and passes after.
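As a minimal sketch of what "fails before the fix and passes after" can look like, the Foundry test below targets a hypothetical double-claim bug. The `Airdrop` contract, its `claim` function, and the finding itself are invented for illustration. The contract is shown in its fixed form; deleting the marked line reproduces the assumed vulnerable version, against which the test fails.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

import {Test} from "forge-std/Test.sol";

// Hypothetical contract, for illustration only.
contract Airdrop {
    uint256 public constant AMOUNT = 1 ether;
    mapping(address => bool) public claimed;

    receive() external payable {}

    function claim() external {
        require(!claimed[msg.sender], "already claimed");
        // The fix. The assumed vulnerable version omits this line,
        // so the same address can claim repeatedly.
        claimed[msg.sender] = true;
        (bool ok,) = msg.sender.call{value: AMOUNT}("");
        require(ok, "transfer failed");
    }
}

contract AirdropDoubleClaimPoC is Test {
    Airdrop internal airdrop;
    address internal attacker;

    function setUp() public {
        airdrop = new Airdrop();
        attacker = makeAddr("attacker");
        vm.deal(address(airdrop), 10 ether); // fund the airdrop
    }

    // Fails on the vulnerable version (the second claim succeeds),
    // passes once the fix is in place.
    function test_revertWhen_ClaimingTwice() public {
        vm.startPrank(attacker);
        airdrop.claim();
        vm.expectRevert(bytes("already claimed"));
        airdrop.claim();
        vm.stopPrank();

        assertEq(attacker.balance, 1 ether);
    }
}
```

Assuming a standard Foundry project layout, a reader of the report could run this with `forge test --match-test test_revertWhen_ClaimingTwice -vvv` before and after applying the fix.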
The Testing Pyramid for Smart Contracts
Smart contract testing borrows its structure from the classical testing pyramid, with adaptations for the constraints of the EVM:
- Unit tests — exercise individual functions in isolation. Fast, deterministic, the foundation everything else rests on.
- Integration tests — exercise multiple contracts together, often against mainnet forks, with real external dependencies (tokens, oracles, AMMs) in the loop.
- Fork / scenario tests — replay or simulate real on-chain conditions: specific block heights, real liquidity, real oracle prices, real attacker behavior.
- Fuzz tests — feed random inputs to a function (stateless) or random call sequences to a system (stateful) to find invariant violations the developer did not anticipate. Covered in depth in §4.8.
- Invariant tests — assert that a property holds across all reachable states; combined with stateful fuzzing or formal verification.
- Formal verification — prove that an invariant holds. Covered in §4.9.
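A well-tested protocol has all six layers. Most audits encounter codebases with strong coverage of the first two (unit and integration tests), partial coverage of the next two (fork and fuzz tests), and aspirations toward the last two (invariant tests and formal verification).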
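To make the difference between the unit and fuzz layers concrete, here is a minimal Foundry sketch. The `Vault` contract and both test names are hypothetical and exist only for this example. The unit test fixes a single input, while the stateless fuzz test lets Foundry choose the `amount` parameter.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

import {Test} from "forge-std/Test.sol";

// Hypothetical contract under test, for illustration only.
contract Vault {
    mapping(address => uint256) public balanceOf;

    function deposit() external payable {
        balanceOf[msg.sender] += msg.value;
    }

    function withdraw(uint256 amount) external {
        require(balanceOf[msg.sender] >= amount, "insufficient balance");
        balanceOf[msg.sender] -= amount;
        (bool ok,) = msg.sender.call{value: amount}("");
        require(ok, "transfer failed");
    }
}

contract VaultTest is Test {
    Vault internal vault;
    address internal alice;

    function setUp() public {
        vault = new Vault();
        alice = makeAddr("alice");
    }

    // Unit test: one function, one fixed input, one expected outcome.
    function test_deposit_CreditsSender() public {
        vm.deal(alice, 1 ether);
        vm.prank(alice);
        vault.deposit{value: 1 ether}();

        assertEq(vault.balanceOf(alice), 1 ether);
    }

    // Stateless fuzz test: Foundry generates many values for `amount`.
    function testFuzz_DepositThenWithdraw_RoundTrips(uint96 amount) public {
        vm.deal(alice, amount);
        vm.startPrank(alice);
        vault.deposit{value: amount}();
        vault.withdraw(amount);
        vm.stopPrank();

        assertEq(vault.balanceOf(alice), 0);
        assertEq(alice.balance, amount);
        assertEq(address(vault).balance, 0);
    }
}
```

The `uint96` parameter type is a design choice for the sketch: it keeps fuzzed amounts within a range that `vm.deal` and the balance arithmetic handle without overflow.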
Coverage Is Necessary but Not Sufficient
forge coverage and equivalent tools report which lines of code were executed by the test suite. Auditors should look for two things:
- Lines that were not executed. Anything below ~95% line coverage on security-critical paths warrants a question. Anything below ~80% on the contract as a whole is a finding in itself.
- Branches that were not executed. Line coverage that comes from happy-path tests only is misleading. A function whose revert branches are never exercised has not really been tested.
Beyond coverage, the right question is mutation coverage: if you intentionally break the contract (off-by-one, flipped comparison, removed require), does the test suite catch it? Tools like slither-mutate and vertigo-rs automate this. A test suite that passes against mutants of the production code is not testing what it claims to test, regardless of its line coverage.
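The sketch below illustrates the idea on a hypothetical `FeeConfig` contract (the contract, its 1,000 bps cap, and the test names are assumptions made for this example). The first test executes every line of `setFee`, yet it would still pass against two plausible mutants. The second and third tests are the ones that would catch them.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

import {Test} from "forge-std/Test.sol";

// Hypothetical contract, for illustration only.
contract FeeConfig {
    uint256 public constant MAX_FEE_BPS = 1_000;
    uint256 public feeBps;

    function setFee(uint256 newFeeBps) external {
        // Plausible mutants: `<=` rewritten as `<`, or this line removed.
        require(newFeeBps <= MAX_FEE_BPS, "fee too high");
        feeBps = newFeeBps;
    }
}

contract FeeConfigTest is Test {
    FeeConfig internal config;

    function setUp() public {
        config = new FeeConfig();
    }

    // Happy path only: covers every line of setFee, but still passes
    // if `<=` becomes `<` or the require is deleted.
    function test_setFee_StoresValue() public {
        config.setFee(100);
        assertEq(config.feeBps(), 100);
    }

    // Boundary test: would fail against the `<` mutant.
    function test_setFee_AcceptsMaximum() public {
        config.setFee(1_000);
        assertEq(config.feeBps(), 1_000);
    }

    // Revert-branch test: would fail against the removed-require mutant.
    function test_revertWhen_FeeAboveMaximum() public {
        vm.expectRevert(bytes("fee too high"));
        config.setFee(1_001);
    }
}
```

With only the first test in place, line coverage of `setFee` is already complete, which is why coverage numbers alone say little about whether the guard is actually tested.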
What This Section Covers
The subsections that follow walk through the testing layers auditors care about most:
- Unit Testing — using and extending the project's unit tests to understand behavior and probe edge cases.
- Integration Testing — multi-contract scenarios, mainnet forks, and the interactions that unit tests cannot capture.
- Creating Proofs-of-Concept — turning a suspected vulnerability into a reproducible, undeniable demonstration that the report's reader can run themselves.
Fuzzing, invariant testing, and formal verification are covered separately in §4.8 and §4.9.
Tests as Communication
A final note: tests are also documentation. A well-named test (test_revertWhen_NonOwnerCalls_setFee) communicates intent in a way a comment never quite can — because it is executable and stays in sync with the code. Auditors should read the test suite as a specification of what the developers believe the system does, then look for the gap between that belief and what the code actually does. That gap is where the findings live.
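For instance, a test carrying the name above might look like the following Foundry sketch. The `FeeManager` contract, its `NotOwner` error, and the companion owner test are hypothetical. Only the test name comes from the text.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

import {Test} from "forge-std/Test.sol";

// Hypothetical contract, for illustration only.
contract FeeManager {
    error NotOwner();

    address public immutable owner;
    uint256 public feeBps;

    constructor() {
        owner = msg.sender;
    }

    function setFee(uint256 newFeeBps) external {
        if (msg.sender != owner) revert NotOwner();
        feeBps = newFeeBps;
    }
}

contract FeeManagerTest is Test {
    FeeManager internal manager;
    address internal stranger;

    function setUp() public {
        manager = new FeeManager(); // the test contract is the owner
        stranger = makeAddr("stranger");
    }

    // The name states the specification: a non-owner call to setFee reverts.
    function test_revertWhen_NonOwnerCalls_setFee() public {
        vm.prank(stranger);
        vm.expectRevert(FeeManager.NotOwner.selector);
        manager.setFee(50);
    }

    // The complementary belief: the owner can update the fee.
    function test_setFee_OwnerUpdatesFee() public {
        manager.setFee(50);
        assertEq(manager.feeBps(), 50);
    }
}
```

Read as a specification, these two tests say who may change the fee but nothing about what values are acceptable. An omission like that is exactly the kind of gap an auditor would probe next.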