Security Researcher's Toolbox

A modern auditor's toolbox is layered. No single tool finds every kind of bug that matters; the value comes from running several of them, in the right order, against the right parts of the code, and reading the output skeptically. This section catalogs the tools that have proven their worth in production engagements, grouped by what they do.

The first tool is the editor. An auditor reads orders of magnitude more code than a typical developer, often in unfamiliar codebases, and the editor is the lens through which all of that happens.

Visual Studio Code with the Solidity extension (Juan Blanco) and Solidity Visual Developer (tintinweb) is the de facto standard. The latter adds inline contract graphs, function signatures, and a useful "audit mode" with color-coded annotations.
Solidity by Nomic Foundation (formerly Hardhat for Visual Studio Code) and Foundry-aware extensions add inline test-running and compile diagnostics.
Remix remains useful for quick experiments, single-file PoCs, and exploring third-party contracts directly from Etherscan or Sourcify.
Cursor and Zed are gaining traction among auditors who want tighter integration with LLM assistants.
vim/neovim users typically pair vim-solidity with coc-solidity or nvim-treesitter and an LSP backed by solc.

A few productivity practices matter more than the editor choice:

Open the codebase in a fresh, project-scoped workspace so search results are not polluted.
Configure ripgrep / VS Code search to ignore node_modules, lib, out, cache, and other generated directories so cross-references are signal, not noise.
Use the editor's call-graph and "find all references" features aggressively — manually tracing every caller of a privileged function is one of the highest-yield audit activities there is.
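When the editor's "find all references" is unavailable (for example, on a machine without the language server), the same caller trace can be approximated with a short script. The following is a minimal sketch, not a replacement for a real call graph: it does whole-word text matching only, the ignored-directory set mirrors the list above with .git added as an assumption, and find_references is a hypothetical helper name.

```python
import os
import re
from pathlib import Path

# Generated or vendored directories named in the text; ".git" is an added assumption.
IGNORED_DIRS = {"node_modules", "lib", "out", "cache", ".git"}


def find_references(root, identifier, suffix=".sol"):
    """Yield (path, line_number, line) for each whole-word match of identifier."""
    pattern = re.compile(r"\b" + re.escape(identifier) + r"\b")
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune ignored directories in place so os.walk never descends into them.
        dirnames[:] = sorted(d for d in dirnames if d not in IGNORED_DIRS)
        for name in sorted(filenames):
            if not name.endswith(suffix):
                continue
            path = Path(dirpath) / name
            with open(path, encoding="utf-8", errors="replace") as handle:
                for number, line in enumerate(handle, start=1):
                    if pattern.search(line):
                        yield str(path), number, line.strip()
```

Calling list(find_references(".", "setFeeRecipient")), where setFeeRecipient stands in for whatever privileged function is under review, returns every mention in first-party Solidity files. Text matching cannot see calls made through interfaces or low-level call data, so treat the result as a starting list to verify by reading.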

Build and Test Frameworks

The auditor needs to compile the code, run its tests, and write new tests of their own. Two frameworks dominate:

Foundry (Forge, Cast, Anvil, Chisel) — Rust-based, Solidity-native tests, very fast, and the default for new audits. forge test, forge coverage, forge inspect, forge debug, the vm.* cheatcodes, and the built-in stateless and stateful fuzzer make it the auditor's primary harness.
Hardhat — JavaScript/TypeScript-based, dominant in older codebases and in projects with heavy off-chain integration. Comes with a strong plugin ecosystem (gas reporter, coverage, upgrades).

Both can be used in the same project; modern audit harnesses often write new tests in Foundry against a Hardhat-built codebase.
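Coverage output is easier to act on once it is reduced to a ranked list of weakly covered files. The sketch below assumes the report was produced in LCOV format (for instance with forge coverage --report lcov) and reads only the SF, LF, and LH records. The function names and the 80% threshold are illustrative choices, not part of any tool.

```python
def parse_lcov(text):
    """Return {source_file: (lines_hit, lines_found)} from LCOV tracefile text."""
    coverage = {}
    current = None
    found = hit = 0
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("SF:"):
            # A new record starts: remember the file and reset its counters.
            current, found, hit = line[3:], 0, 0
        elif line.startswith("LF:"):
            found = int(line[3:])
        elif line.startswith("LH:"):
            hit = int(line[3:])
        elif line == "end_of_record" and current is not None:
            coverage[current] = (hit, found)
            current = None
    return coverage


def under_tested(coverage, threshold=0.8):
    """List (file, line_coverage_ratio) pairs below threshold, least covered first."""
    rows = []
    for path, (hit, found) in coverage.items():
        ratio = hit / found if found else 1.0  # nothing instrumented: nothing to flag
        if ratio < threshold:
            rows.append((path, ratio))
    return sorted(rows, key=lambda row: row[1])
```

Line coverage says nothing about whether the assertions in those tests are meaningful, so a high ratio is a weak signal; a low one is a reliable pointer to code nobody has exercised.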

Static Analysis

Static analyzers read the source (or bytecode) without executing it and flag suspicious patterns. They are fast, cheap, and noisy — useful for the first pass.

Slither (Trail of Bits) — the workhorse. Over 90 built-in detectors, printers for inheritance graphs and function summaries, an extensible Python API, and slither-mutate for mutation testing. Run early; triage the high-confidence detectors first.
Aderyn (Cyfrin) — Rust-based static analyzer with a growing detector library; very fast and works well alongside Slither.
Wake (Ackee Blockchain) — Python framework that combines static analysis, fuzzing, and a custom detector DSL.
Solhint and ethlint — linters that catch style and minor correctness issues; useful pre-audit, less useful during the engagement itself.
Semgrep with Solidity rules — flexible pattern matching for custom heuristics you build up across engagements.
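The advice to triage high-confidence detectors first can be applied mechanically to Slither's JSON output. This is a minimal sketch assuming the report was written with slither . --json results.json and that each entry under results.detectors carries impact and confidence fields; triage_order and load_report are hypothetical helper names.

```python
import json

IMPACT_ORDER = {"High": 0, "Medium": 1, "Low": 2, "Informational": 3, "Optimization": 4}
CONFIDENCE_ORDER = {"High": 0, "Medium": 1, "Low": 2}


def load_report(path):
    """Read a Slither JSON report from disk."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def triage_order(report):
    """Sort findings by confidence first, then by impact; unknown labels go last."""
    findings = report.get("results", {}).get("detectors", [])
    return sorted(
        findings,
        key=lambda f: (
            CONFIDENCE_ORDER.get(f.get("confidence"), len(CONFIDENCE_ORDER)),
            IMPACT_ORDER.get(f.get("impact"), len(IMPACT_ORDER)),
        ),
    )
```

Sorting by confidence before impact is a deliberate choice here: it surfaces the findings least likely to be false positives, which keeps the first triage pass short. Swapping the two key components gives an impact-first view instead.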

Symbolic and Dynamic Analysis

Symbolic executors and concolic tools explore execution paths the way an attacker would. They are slower than static analyzers and prone to path explosion, but they find classes of bugs static analysis cannot.

Mythril — symbolic execution with the Z3 SMT solver; particularly good at finding integer issues, unchecked calls, and access-control gaps. See the dedicated section in §4.6.
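Halmos (a16z): symbolic execution that runs Foundry-style tests as symbolic specifications. You write a check_* function (the default prefix; test_* functions can be selected with the --function option) and Halmos tries to prove or refute it for all inputs, within its configured loop and size bounds.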
hevm — symbolic execution engine from the DappTools lineage, used heavily for equivalence checking and FV proofs.
Manticore — older symbolic executor from Trail of Bits, still useful for some workloads.

Fuzzing

Fuzzers generate inputs and look for invariant violations. Modern Web3 auditing leans on fuzzing heavily — see §4.8 for full coverage.

Foundry's built-in fuzzer — both stateless (function testFuzz_*) and stateful (invariant tests with handlers).
Echidna (Trail of Bits) — Haskell-based property fuzzer with coverage-guided exploration, shrinking, and an optimization mode.
Medusa (Trail of Bits) — Go-based successor to Echidna with parallel execution and better performance on large codebases.
Diligence Fuzzing (Consensys) — cloud-hosted fuzzing service; the spiritual successor to MythX for fuzzing workloads.
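The core loop shared by these tools is small enough to show in full. The sketch below is a toy stateful fuzzer in plain Python, written against a hypothetical BuggyVault model rather than real EVM code: it generates random call sequences and checks an accounting invariant after every call. It deliberately omits what makes the real tools effective, namely coverage guidance and shrinking.

```python
import random


class BuggyVault:
    """Toy model: withdraw forgets to reduce total_deposits (deliberate bug)."""

    def __init__(self):
        self.balances = {}
        self.total_deposits = 0

    def deposit(self, user, amount):
        self.balances[user] = self.balances.get(user, 0) + amount
        self.total_deposits += amount

    def withdraw(self, user, amount):
        if self.balances.get(user, 0) < amount:
            return  # models a revert: state is left unchanged
        self.balances[user] -= amount
        # BUG: total_deposits should be decreased by amount here.


def invariant_holds(vault):
    """Accounting invariant: the recorded total equals the sum of balances."""
    return vault.total_deposits == sum(vault.balances.values())


def fuzz(make_target, runs=200, depth=30, seed=0):
    """Return the first call sequence that breaks the invariant, else None."""
    rng = random.Random(seed)  # seeded so a failure can be replayed
    users = ["alice", "bob", "carol"]
    for _ in range(runs):
        target = make_target()  # fresh state for every run
        sequence = []
        for _ in range(depth):
            call = rng.choice(["deposit", "withdraw"])
            user = rng.choice(users)
            amount = rng.randint(1, 10)
            getattr(target, call)(user, amount)
            sequence.append((call, user, amount))
            if not invariant_holds(target):
                return sequence
    return None
```

Calling fuzz(BuggyVault) returns a sequence ending in the withdraw that breaks the invariant. Because nothing here shrinks the result, the sequence may contain irrelevant earlier calls. Echidna, Medusa, and Foundry minimize such sequences automatically, which is a large part of why their counterexamples are readable.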

Formal Verification

For invariants that must hold, formal verification gives mathematical proofs. See §4.9 for depth.

Certora Prover — the dominant commercial FV tool; CVL2 specifications, used by Aave, Compound, MakerDAO, and most large DeFi protocols.
Halmos — open-source, lightweight FV using symbolic execution of Foundry tests.
hevm symbolic — equivalence proofs and bounded model checking.
K Framework — academic but production-grade; used in the KEVM and IELE semantics.
SMTChecker — built into the Solidity compiler; limited but free and worth running.

Decompilation and On-Chain Analysis

Sometimes the source is not the source: the bytecode actually deployed may not match the repository under review, and unverified contracts have no published source at all. Both need to be read at the EVM level.

Heimdall (Jon-Becker) — Rust decompiler that produces readable Solidity-like output; the current best-in-class open tool.
Dedaub Decompiler and Panoramix: useful for quick checks. Dedaub's decompiler is web-based; Panoramix is an open-source Python decompiler.
evm.codes — interactive opcode reference; essential for assembly review.
Etherscan / Blockscout / Sourcify — verified-source explorers; the read-and-write panels are useful for sanity-checking deployed state and parameters.
Tenderly — transaction simulation, debugging, and alerting; the gold standard for "what does this transaction actually do?"
Phalcon (BlockSec) — interactive transaction explorer with state-diff and call-tree views; excellent for post-mortems.
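Reading at the EVM level starts with turning raw bytes into mnemonics, and the one subtlety is that PUSH instructions carry immediate data that must be skipped rather than decoded. The following is a minimal disassembler sketch covering only a small, hand-picked subset of opcodes; anything outside the table is printed as UNKNOWN. Use evm.codes as the authoritative opcode reference.

```python
# Partial opcode table: a deliberately small subset for illustration.
OPCODES = {
    0x00: "STOP", 0x01: "ADD", 0x14: "EQ", 0x15: "ISZERO",
    0x34: "CALLVALUE", 0x35: "CALLDATALOAD", 0x36: "CALLDATASIZE",
    0x50: "POP", 0x51: "MLOAD", 0x52: "MSTORE", 0x54: "SLOAD", 0x55: "SSTORE",
    0x56: "JUMP", 0x57: "JUMPI", 0x5B: "JUMPDEST", 0x5F: "PUSH0",
    0xF1: "CALL", 0xF3: "RETURN", 0xF4: "DELEGATECALL", 0xFD: "REVERT",
    0xFE: "INVALID", 0xFF: "SELFDESTRUCT",
}


def disassemble(hex_code):
    """Decode EVM bytecode into mnemonics, stepping over PUSH immediates."""
    code = bytes.fromhex(hex_code[2:] if hex_code.startswith("0x") else hex_code)
    out = []
    pc = 0
    while pc < len(code):
        op = code[pc]
        if 0x60 <= op <= 0x7F:  # PUSH1..PUSH32 carry 1..32 immediate bytes
            size = op - 0x5F
            data = code[pc + 1 : pc + 1 + size]
            out.append(f"PUSH{size} 0x{data.hex()}")
            pc += 1 + size
            continue
        if 0x80 <= op <= 0x8F:
            out.append(f"DUP{op - 0x7F}")
        elif 0x90 <= op <= 0x9F:
            out.append(f"SWAP{op - 0x8F}")
        else:
            out.append(OPCODES.get(op, f"UNKNOWN_0x{op:02x}"))
        pc += 1
    return out
```

For example, disassemble("0x6080604052") yields PUSH1 0x80, PUSH1 0x40, MSTORE, the familiar free-memory-pointer setup at the start of most Solidity-compiled contracts. A PUSH4 followed shortly by EQ is usually a function-selector comparison in the dispatcher, which is a quick way to enumerate the entry points of an unverified contract.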

Reports and Triage

Cantina, Sherlock, Code4rena platforms — each provides finding-submission and triage interfaces; auditors should learn the one(s) used by their engagements.
GitHub Issues / Projects — fine for private engagements; pair with a shared template that captures severity, location, description, impact, recommendation, and references.
Markdown + PDF pipelines — most firms use a custom LaTeX or Pandoc template for the final report; the OpenZeppelin and Trail of Bits public reports are good models to study.
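A shared finding template is easier to enforce when it is a data structure rather than a convention. The sketch below models the fields listed above (severity, location, description, impact, recommendation, references) as a dataclass that renders to Markdown. The title field, the five-level severity scale, and the heading layout are assumptions for illustration; firms and contest platforms each define their own.

```python
from dataclasses import dataclass, field

# Assumed severity scale; substitute the one your engagement uses.
SEVERITIES = ("Critical", "High", "Medium", "Low", "Informational")


@dataclass
class Finding:
    title: str
    severity: str
    location: str
    description: str
    impact: str
    recommendation: str
    references: list = field(default_factory=list)

    def __post_init__(self):
        # Reject free-form severities so reports stay sortable and comparable.
        if self.severity not in SEVERITIES:
            raise ValueError(f"unknown severity: {self.severity!r}")

    def to_markdown(self):
        """Render the finding as a Markdown section for the report pipeline."""
        parts = [
            f"## [{self.severity}] {self.title}",
            f"**Location:** {self.location}",
            f"### Description\n{self.description}",
            f"### Impact\n{self.impact}",
            f"### Recommendation\n{self.recommendation}",
        ]
        if self.references:
            refs = "\n".join(f"- {ref}" for ref in self.references)
            parts.append(f"### References\n{refs}")
        return "\n\n".join(parts) + "\n"
```

The rendered Markdown can be pasted into a GitHub issue or fed to a Pandoc template unchanged, so the same record serves both the triage tracker and the final report.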

AI Assistants

Large language models have become a routine part of the auditor's workflow — used carefully.

GitHub Copilot, Cursor, Claude Code, ChatGPT — useful for generating boilerplate tests, explaining unfamiliar syntax, drafting finding descriptions, and brainstorming attack surfaces.
AuditAgent, AuditWizard, GPT-based triage tools — emerging products that scan codebases against known vulnerability patterns; treat their output as a noisy detector, not a verdict.

The cautions are real:

LLMs hallucinate confidently. Always verify a claimed vulnerability by reading the code yourself.
LLMs can retain what you give them. Code you paste into a hosted model may be logged or used to train a future version, and models are known to reproduce training data; treat private codebases accordingly and prefer local or enterprise-tier deployments for sensitive work.
LLMs are weakest exactly where audits matter most: novel business logic, complex multi-contract interactions, and economic exploits. They are strongest at boilerplate.

Use them as a force multiplier on the boring parts of the job so you can spend your attention on the parts that matter.

A Suggested Workflow

A typical first day on a fresh codebase:

Open the repo in VS Code; install the Solidity extensions; configure search exclusions.
Build the project (forge build / npx hardhat compile); confirm tests run (forge test / npx hardhat test).
Run forge coverage (or equivalent) and note which contracts are under-tested.
Run Slither (slither .), Aderyn, and any other static analyzers; save the output for later triage — do not act on it yet.
Read the documentation and the deployment scripts.
Read the contracts in dependency order, building a function-permission matrix and a state-variable table as you go.
Now triage the static-analyzer output: most will be false positives, but the ones that survive scrutiny are the cheapest findings of the engagement.

From there, the methodology section (§4.5.2) takes over.
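The function-permission matrix from the workflow above can be bootstrapped with a script before it is refined by hand. The sketch below is a regex heuristic, not a parser: it assumes conventionally formatted Solidity, ignores constructors, fallback, and receive functions, and can be confused by comments or nested parentheses in parameter lists. The helper names are hypothetical.

```python
import re

# Heuristic: "function name(params) <tail>" up to the opening brace or semicolon.
FUNCTION_RE = re.compile(r"\bfunction\s+(\w+)\s*\([^)]*\)([^{;]*)[{;]")
# A word in the tail, optionally followed by an argument list such as onlyRole(X).
TOKEN_RE = re.compile(r"(\w+)(?:\s*\([^)]*\))?")
VISIBILITY = ("public", "external", "internal", "private")
KEYWORDS = set(VISIBILITY) | {"view", "pure", "payable", "virtual", "override", "returns"}


def permission_matrix(source):
    """Return (function_name, visibility, modifiers) rows for one source file."""
    rows = []
    for match in FUNCTION_RE.finditer(source):
        name, tail = match.group(1), match.group(2)
        tokens = TOKEN_RE.findall(tail)
        visibility = next((t for t in tokens if t in VISIBILITY), "unspecified")
        # Anything in the tail that is not a language keyword is treated as a modifier.
        modifiers = [t for t in tokens if t not in KEYWORDS]
        rows.append((name, visibility, modifiers))
    return rows


def unguarded_entry_points(rows):
    """Externally callable functions that declare no modifier at all."""
    return [
        name
        for name, visibility, modifiers in rows
        if visibility in ("public", "external") and not modifiers
    ]
```

A function with no modifier is not necessarily unprotected, since access checks can live in the body as require statements or internal calls. The output is a reading list to confirm by hand, which is the point of building the matrix in the first place. Slither's function-summary printers give a more reliable starting point when the project compiles.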

DF3NDR Web3 Security Books