3.7.6 Optimization Patterns with Security Trade-offs

The patterns in the prior five subsections were largely additive: applying them makes contracts safer at little or no cost. The patterns in this section are different. Each one trades a measurable amount of safety for a measurable amount of performance — gas savings, code size reduction, or new capability that Solidity does not expose. Every pattern here has produced production exploits when applied carelessly.

This section presents three optimization patterns where the security trade-off is real and the application is common enough to warrant explicit treatment: selector-based ABI decoding, assembly for performance-critical sections, and eth_call tricks for off-chain computation. Each pattern is shown in its idiomatic form, with the specific safety properties Solidity normally provides that the pattern bypasses, the conditions under which the trade-off is justified, and the foot-guns that have produced exploits.

The first principle of this section is the most important: do not apply these patterns until profiling shows you need to. A contract with measurably high gas costs justified by transaction volume is a candidate. A contract that "might be slow" or "should be optimized for production" is not. The cost of getting these patterns wrong vastly exceeds the cost of running unoptimized code; reach for them only when the data demands it.

For deeper coverage of the underlying mechanics, Section 4.10 (Master the EVM and Low-Level Programming) walks through Yul, assembly, and calldata analysis from the auditor's perspective. This section assumes that material and focuses on developer-facing patterns and their security implications.

ABI Decode with Selector

The standard Solidity pattern for receiving a typed function call uses the compiler-generated dispatcher: a switch on the first four bytes of calldata (the function selector) routing to typed parameter decoding for the matched function. This is automatic, safe, and reasonably efficient. It also has limitations that the manual selector + decode pattern overcomes.

The pattern is used in three situations:

  1. Generic dispatchers — receiving arbitrary calldata and routing to handlers based on the selector (used in diamond proxies, plugin systems, batch executors)
  2. Custom error decoding — extracting structured information from a revert that uses a custom error
  3. Selective decoding — decoding only the fields needed from a large calldata payload, rather than the full struct

Idiomatic Form: Generic Dispatch

A function that receives arbitrary calldata and routes to one of several handlers:

// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

contract Dispatcher {
    bytes4 private constant TRANSFER_SEL = bytes4(keccak256("transfer(address,uint256)"));
    bytes4 private constant APPROVE_SEL  = bytes4(keccak256("approve(address,uint256)"));

    error UnknownSelector(bytes4 selector);
    error MalformedCalldata();

    function dispatch(bytes calldata payload) external returns (bool) {
        if (payload.length < 4) revert MalformedCalldata();

        bytes4 selector = bytes4(payload[:4]);

        if (selector == TRANSFER_SEL) {
            // Manually decode (address, uint256) from payload[4:]
            if (payload.length != 4 + 32 + 32) revert MalformedCalldata();
            (address to, uint256 amount) = abi.decode(payload[4:], (address, uint256));
            return _handleTransfer(to, amount);
        }

        if (selector == APPROVE_SEL) {
            if (payload.length != 4 + 32 + 32) revert MalformedCalldata();
            (address spender, uint256 amount) = abi.decode(payload[4:], (address, uint256));
            return _handleApprove(spender, amount);
        }

        revert UnknownSelector(selector);
    }

    function _handleTransfer(address to, uint256 amount) internal returns (bool) {
        // ...
        return true;
    }

    function _handleApprove(address spender, uint256 amount) internal returns (bool) {
        // ...
        return true;
    }
}

This pattern lets the contract accept calls that look like ERC-20 calls without inheriting an ERC-20 interface — useful for permissioned proxies, replay-protection layers, and multicall implementations that need to inspect what's being called.

The Critical Pitfalls

Length validation is not optional. abi.decode reverts when the input is too short, but it silently ignores trailing bytes, so without the payload.length != 4 + 32 + 32 check a payload with extra data appended decodes successfully and the surplus goes unnoticed. For a fixed-shape function, the length is deterministic; check it explicitly. For dynamic-shape functions (with bytes, string, or dynamic arrays), the check is more involved: typically you accept any sufficient length and rely on abi.decode to revert on out-of-bounds offsets and lengths. It does, but it also accepts non-canonical encodings (for example, offsets that overlap or leave gaps, or non-zero padding bytes), so distinct byte strings can decode to identical values.
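One way to tighten the dynamic-shape case is to re-encode the decoded values and compare the result with the input. The sketch below assumes a hypothetical handler signature execute(address,bytes); the contract name, error names, and function name are illustrative and do not come from any library.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

contract StrictDecoder {
    error MalformedCalldata();
    error NonCanonicalEncoding();

    // Hypothetical signature used only for this illustration.
    bytes4 private constant EXECUTE_SEL = bytes4(keccak256("execute(address,bytes)"));

    function decodeExecute(bytes calldata payload)
        external
        pure
        returns (address target, bytes memory data)
    {
        // Smallest valid payload: selector (4) + address word (32)
        // + offset word (32) + length word of an empty bytes value (32).
        if (payload.length < 4 + 96) revert MalformedCalldata();
        if (bytes4(payload[:4]) != EXECUTE_SEL) revert MalformedCalldata();

        // Reverts on out-of-bounds offsets or lengths.
        (target, data) = abi.decode(payload[4:], (address, bytes));

        // abi.encode always produces the canonical encoding, so any trailing
        // bytes, unusual offsets, or dirty padding in the input show up as a
        // hash mismatch.
        if (keccak256(abi.encode(target, data)) != keccak256(payload[4:])) {
            revert NonCanonicalEncoding();
        }
    }
}
```

The re-encode costs extra gas proportional to the payload size, so it fits best where the payload is later hashed, signed, or forwarded and two byte strings with the same meaning would be a problem.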

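bytes4(payload[:4]) reads the selector of the embedded payload, not msg.sig, which in this example is the selector of dispatch itself. If payload is shorter than 4 bytes, the slice reverts on its own rather than raising MalformedCalldata, so the length check must come first.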

Selectors can collide. A selector is only 32 bits and function names are freely chosen, so an attacker can brute-force a signature whose hash shares its first four bytes with any target signature. The signature transfer(address,uint256) and some adversarial signature can both produce the same 4-byte selector; this is a known property of the Ethereum ABI, not a bug. Generic dispatchers that route based on selector alone are vulnerable to selector squatting if any handler logic depends on the assumed identity of the caller's intent.

For trusted handlers (you wrote both sides), selector collisions are not a practical concern — you would notice the collision at compile time when two of your functions hash to the same selector. For untrusted callers passing arbitrary calldata, the risk is real and is one reason diamond proxies (which dispatch on selector) require careful facet design.
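Within a single contract the compiler catches colliding selectors, but a dispatcher that registers handlers at runtime has to check for itself. A minimal sketch of that check follows, assuming a hypothetical registry keyed by selector; the names are illustrative, and access control is deliberately left out.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

contract SelectorRegistry {
    error SelectorAlreadyRegistered(bytes4 selector, address existingHandler);
    error ZeroHandler();

    // selector => handler contract (hypothetical layout for illustration)
    mapping(bytes4 => address) public handlerOf;

    // NOTE: no access control here; a real registry would restrict who may register.
    function register(string calldata signature, address handler)
        external
        returns (bytes4 selector)
    {
        if (handler == address(0)) revert ZeroHandler();

        // Same derivation the compiler uses: first 4 bytes of keccak256(signature).
        selector = bytes4(keccak256(bytes(signature)));

        // Refuse to overwrite: a second signature with the same selector is
        // either a duplicate or a collision, and both deserve a human look.
        address existing = handlerOf[selector];
        if (existing != address(0)) {
            revert SelectorAlreadyRegistered(selector, existing);
        }
        handlerOf[selector] = handler;
    }
}
```

Rejecting the second registration turns a silent routing change into a visible failure at setup time.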

Idiomatic Form: Custom Error Decoding

When a contract reverts with a custom error, the revert data is (selector, encoded_args). To extract the args programmatically:

// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

contract ErrorAware {
    error InsufficientBalance(uint256 requested, uint256 available);

    function attemptCallAndExplain(address target, bytes calldata data) external returns (string memory) {
        (bool ok, bytes memory result) = target.call(data);
        if (ok) return "success";

        // Custom error: first 4 bytes are the selector
        if (result.length < 4) return "revert without data";

        bytes4 selector;
        assembly {
            selector := mload(add(result, 32))  // skip the bytes length prefix
        }

        if (selector == InsufficientBalance.selector && result.length == 4 + 32 + 32) {
            // Selector and length both match, so abi.decode cannot revert on truncated data
            (uint256 requested, uint256 available) = abi.decode(
                _slice(result, 4),
                (uint256, uint256)
            );
            return string(abi.encodePacked(
                "Insufficient: requested ",
                _toString(requested),
                ", available ",
                _toString(available)
            ));
        }

        return "unknown error";
    }

    function _slice(bytes memory data, uint256 start) internal pure returns (bytes memory) {
        bytes memory result = new bytes(data.length - start);
        for (uint256 i = 0; i < result.length; ++i) {
            result[i] = data[start + i];
        }
        return result;
    }

    function _toString(uint256 value) internal pure returns (string memory) {
        // ... standard uint-to-string implementation
    }
}

This pattern is essential for protocols that need to handle structured errors from external contracts, e.g., a router that decodes a swap's failure reason and presents it to the user, or a debugging library that classifies common error types. Off-chain libraries perform the same decoding from a contract's ABI (ethers' Interface.parseError and viem's decodeErrorResult, for example).

The safety concern mirrors generic dispatch: the target controls the revert data, so it can return a known selector followed by truncated or misleading arguments. Validate the length before decoding, and always include a fallback for unrecognized selectors rather than assuming every selector is decodable.
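When the caller cannot classify an error, an alternative to returning a generic string is to re-raise the original revert data unchanged so that a layer further up can decode it. The following is a minimal sketch of that fallback; RevertForwarder and CallFailedWithoutData are hypothetical names, and the open forward function is for illustration only.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

contract RevertForwarder {
    error CallFailedWithoutData();

    // NOTE: forwards arbitrary calls with no access control; illustration only.
    function forward(address target, bytes calldata data)
        external
        returns (bytes memory)
    {
        (bool ok, bytes memory result) = target.call(data);
        if (ok) return result;

        // Nothing to bubble up: fail with an error of our own.
        if (result.length == 0) revert CallFailedWithoutData();

        // Re-raise the callee's revert data byte for byte. `result` points at
        // the length word, so the payload starts 32 bytes later.
        assembly ("memory-safe") {
            revert(add(result, 32), mload(result))
        }
    }
}
```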

When to Use This Pattern

  • Diamond proxies and plugin systems where dispatch on selector is the architectural pattern
  • Protocol routers that need to recognize wrapped or proxied function calls
  • Error-handling layers that translate custom errors into user-readable messages
  • Replay protection middleware that inspects what is being called before forwarding

For everyday function dispatch — a contract receiving its own typed function calls — Solidity's compiler-generated dispatcher is already optimal. This pattern is for cases where the calldata structure isn't known at compile time.

Assembly for Performance-Critical Sections

Inline assembly (assembly { ... } blocks containing Yul) gives direct access to EVM opcodes. The optimization wins come from skipping checks Solidity inserts automatically, accessing opcodes Solidity doesn't expose, and manipulating memory in ways that avoid the abstraction overhead. The cost is that every line of assembly is unchecked code: no overflow protection, no bounds checking, no type checking.

Three specific assembly tricks recur often enough to deserve named treatment. Each one bypasses a specific Solidity safety mechanism, and each has produced production exploits when applied without understanding the trade-off.

Trick 1: Efficient Array Length Read

For a standard dynamic storage array, items.length already compiles to a single SLOAD of the array's slot, so reading the slot directly in assembly saves a handful of gas at most. The dominant cost is the SLOAD itself, which is identical in both forms: 2,100 gas when the slot is cold and 100 gas when it is warm (EIP-2929).

function lengthSolidity() external view returns (uint256) {
    return items.length;  // one SLOAD: 2,100 gas cold, 100 gas warm
}

function lengthAssembly() external view returns (uint256 len) {
    assembly {
        len := sload(items.slot)  // the same SLOAD, written by hand
    }
}

Safety surface bypassed: Solidity reads the length through the type's storage representation, which does not always equal the raw slot value. For standard dynamic arrays (T[]) in Solidity 0.8+, the slot value is the length, and the direct read is safe. For bytes and string, the slot packs the length together with a short/long flag (and, for short values, the data itself), so a raw sload does not return the length. For custom storage layouts (ERC-7201 namespaces, hand-computed slots), the slot may not contain what you think.

The judgment call: A few gas per read in exchange for the maintenance risk that someone later refactors the storage layout, or changes the variable's type, without updating the assembly. For a plain length read that trade is rarely worth the readability cost. The larger saving usually needs no assembly at all: read items.length once into a local variable instead of re-reading it in the loop condition on every iteration.

Trick 2: Skipping Overflow Checks for Provably-Safe Arithmetic

Solidity 0.8 inserts overflow checks on every arithmetic operation. For operations that are demonstrably safe — adding numbers that have been bounded by prior checks — the inserted check is wasted gas.

// Solidity 0.8 adds an overflow check here
function sumChecked(uint256[] memory arr) external pure returns (uint256 total) {
    for (uint256 i = 0; i < arr.length; ++i) {
        total += arr[i];
    }
}

// Bypass the check when the values are bounded
function sumUnchecked(uint128[] memory arr) external pure returns (uint256 total) {
    // Each element is at most 2^128 - 1, so overflowing uint256 would take
    // about 2^128 elements. Gas and memory limits keep arr.length far below that.
    unchecked {
        for (uint256 i = 0; i < arr.length; ++i) {
            total += arr[i];
        }
    }
}

Safety surface bypassed: Solidity's automatic overflow revert. For the uint128[] case above, the math is genuinely safe — to overflow uint256 while summing uint128 values, you would need 2^128 elements, which exceeds the block gas limit by many orders of magnitude. The unchecked block is a deliberate, reasoned choice.

The foot-gun: The same code with uint256[] is not safe. Two large uint256 values can overflow. The optimization depends on the input type; changing the parameter type later (e.g., upgrading to uint256 arrays) silently makes the contract vulnerable.

The defensive habit is to write a comment explaining why the unchecked block is safe, and to pin the behavior with a Foundry test. Because unchecked arithmetic wraps silently instead of reverting, the test must assert the exact sum rather than merely check that the call succeeds:

function test_sumUnchecked_cannotOverflow() public {
    uint128[] memory arr = new uint128[](type(uint16).max);  // 65,535 elements
    for (uint256 i = 0; i < arr.length; ++i) {
        arr[i] = type(uint128).max;
    }
    // The sum is 65535 * (2^128 - 1), roughly 2^144 and far below 2^256.
    // The expected value uses checked arithmetic, so a wrapped sum would not match.
    uint256 expected = uint256(type(uint16).max) * type(uint128).max;
    uint256 total = this.sumUnchecked(arr);
    assertEq(total, expected);
}

The classical loop counter case is a special case of this pattern:

for (uint256 i = 0; i < arr.length; ) {
    // ... loop body
    unchecked { ++i; }  // i cannot overflow because i < arr.length and arr.length < 2^256
}

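This is so universal that, since version 0.8.22, the Solidity compiler drops the overflow check on the increment automatically for loops of the plain for (uint256 i = 0; i < n; ++i) shape; the manual unchecked form matters mainly on older compilers. Where it applies, it saves a few dozen gas per iteration with no realistic risk.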

Trick 3: Bypassing ABI Encoding Overhead

Solidity's ABI encoding/decoding has overhead for memory expansion, length encoding, and bounds checking. For known-shape data passed to known interfaces, assembly can construct the calldata directly:

// Solidity: ~150 gas for ABI encoding
function transfer(IERC20 token, address to, uint256 amount) external {
    token.transfer(to, amount);
}

// Assembly: ~50 gas, but no return value check
function transferAsm(address token, address to, uint256 amount) external returns (bool ok) {
    assembly {
        let ptr := mload(0x40)  // free memory pointer
        mstore(ptr, 0xa9059cbb00000000000000000000000000000000000000000000000000000000)
        mstore(add(ptr, 4), to)
        mstore(add(ptr, 36), amount)
        ok := call(gas(), token, 0, ptr, 68, 0, 0)
    }
}

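Safety surface bypassed: Multiple, including:

  • Return data validation (the assembly version ignores what the token returns, so a token that signals failure by returning false looks like a success)
  • Contract existence (a high-level call reverts if the target has no code; a raw call to a codeless address succeeds)
  • Compatibility with non-standard tokens (some tokens, USDT being the best-known, return no data; a high-level call through IERC20 reverts on these because it expects a bool to decode, while a raw call succeeds and leaves the returndatasize() check to you)
  • Forwarding revert reasons (raw call returns false on revert without propagating the reason)
  • Memory discipline (the block writes 68 bytes at the free memory pointer without advancing it, which is safe only while nothing else relies on that region)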

The judgment call: For a payment router making thousands of transfers per transaction, the gas savings compound. For a single transfer in a typical user flow, the safety surface lost far exceeds the gas saved.

A common production compromise is OpenZeppelin's SafeERC20.safeTransfer, which uses assembly for efficiency but reintroduces the safety checks (return value validation, return data size handling) explicitly. The performance is competitive with raw assembly while preserving the safety properties. For any case where you would reach for raw assembly to call an ERC-20, use SafeERC20 instead — it has the assembly written correctly already.
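In practice the SafeERC20 route looks like the sketch below. It assumes OpenZeppelin Contracts is installed under the usual @openzeppelin/contracts import path; PaymentRouter, payout, and LengthMismatch are hypothetical names chosen for the example.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

import {IERC20} from "@openzeppelin/contracts/token/ERC20/IERC20.sol";
import {SafeERC20} from "@openzeppelin/contracts/token/ERC20/utils/SafeERC20.sol";

contract PaymentRouter {
    using SafeERC20 for IERC20;

    error LengthMismatch();

    // NOTE: access control and funding logic are omitted; illustration only.
    function payout(
        IERC20 token,
        address[] calldata recipients,
        uint256[] calldata amounts
    ) external {
        if (recipients.length != amounts.length) revert LengthMismatch();

        for (uint256 i = 0; i < recipients.length; ++i) {
            // Reverts if the call fails or the token returns false, and
            // tolerates tokens that return no data at all.
            token.safeTransfer(recipients[i], amounts[i]);
        }
    }
}
```

The call site stays one line per transfer, and the return-value handling lives in code that is already reviewed.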

Reading and Auditing Assembly

When assembly is used, several practices reduce the chance of latent bugs:

  1. Comment every line. What opcode it maps to, what stack effect it has, what memory it touches. Assembly is write-once-read-never if it isn't documented.

  2. Use named variables for memory locations. Yul's let binds a name to any value, including a pointer:

    assembly {
        let dest_ptr := mload(0x40)
        mstore(dest_ptr, selector)
    }

    This is much easier to follow than bare hex offsets.

  3. Match every memory write with a documented length. Memory is a contiguous region; overwriting bytes you didn't intend to is a class of bug that does not exist in high-level Solidity but absolutely exists in assembly.

  4. Test with fuzzing. Section 4.8 covers fuzzing in depth. Assembly code is exactly the kind of code where edge cases hide — fuzzing finds the inputs that high-level reasoning misses.

  5. Run Slither. Its assembly detector flags inline assembly automatically. This is a flag, not a finding; the question for the reviewer is whether the use is justified.
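The fuzzing practice is most effective as a differential test: keep a plain Solidity reference implementation and assert that the assembly version agrees with it on random inputs. The sketch below assumes Foundry with forge-std available; the contract and function names are hypothetical.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

import {Test} from "forge-std/Test.sol";

contract SumDifferentialTest is Test {
    // Reference: checked, high-level Solidity.
    function sumReference(uint128[] memory arr) internal pure returns (uint256 total) {
        for (uint256 i = 0; i < arr.length; ++i) {
            total += arr[i];
        }
    }

    // Candidate: hand-written Yul with no overflow or bounds checks.
    function sumOptimized(uint128[] memory arr) internal pure returns (uint256 total) {
        assembly ("memory-safe") {
            let len := mload(arr)             // length word of the memory array
            let ptr := add(arr, 32)           // first element
            let end := add(ptr, shl(5, len))  // ptr + len * 32
            for {} lt(ptr, end) { ptr := add(ptr, 32) } {
                // each uint128 element occupies a full 32-byte word in memory
                total := add(total, mload(ptr))
            }
        }
    }

    // Foundry generates random arrays; any disagreement fails the test.
    function testFuzz_sumMatchesReference(uint128[] memory arr) public {
        assertEq(sumOptimized(arr), sumReference(arr));
    }
}
```

If the element type is later widened to uint256, the reference starts reverting on overflow while the assembly wraps, and the two no longer agree under fuzzing.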

eth_call Tricks for Off-Chain Computation

The eth_call RPC method executes a transaction as if it were submitted, returning the result without paying gas or modifying state. This makes it a powerful tool for off-chain inspection: you can execute arbitrary code against the live chain state, observe the result, and use that result in your application without ever sending a transaction.

The optimization pattern uses this property in reverse: deploy a "view-only" contract that performs expensive computation, then call it via eth_call to get the result. Because the call never lands on-chain, the gas cost doesn't matter — you can run computations that would be prohibitively expensive in a real transaction.

Idiomatic Form: View-Only Aggregator

// SPDX-License-Identifier: MIT
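pragma solidity ^0.8.20;

// Minimal pool interface assumed by this example; adapt the getters to the real pool ABI.
interface IPool {
    function reserve0() external view returns (uint256);
    function reserve1() external view returns (uint256);
    function totalSupply() external view returns (uint256);
    function lastUpdateBlock() external view returns (uint256);
}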

contract PoolStateAggregator {
    struct PoolSnapshot {
        address pool;
        uint256 reserve0;
        uint256 reserve1;
        uint256 totalSupply;
        uint256 lastBlock;
    }

    function snapshotAll(address[] calldata pools) external view returns (PoolSnapshot[] memory) {
        PoolSnapshot[] memory out = new PoolSnapshot[](pools.length);

        for (uint256 i = 0; i < pools.length; ++i) {
            IPool p = IPool(pools[i]);
            out[i] = PoolSnapshot({
                pool: pools[i],
                reserve0: p.reserve0(),
                reserve1: p.reserve1(),
                totalSupply: p.totalSupply(),
                lastBlock: p.lastUpdateBlock()
            });
        }

        return out;
    }
}

A dApp wanting to display the state of 100 pools would naively make 400 RPC calls. With this aggregator, one eth_call returns all 400 values in a single response — orders of magnitude faster, and no contract is actually deployed beyond the aggregator.

Multicall3 (the deployed contract at 0xcA11bde05977b3631167028862bE2a173976CA11 on most EVM chains) is the universal version of this pattern. Almost every modern dApp uses it for batch RPC calls.

The Code-on-the-Fly Pattern

A more advanced variant: run the aggregator's creation code inside the eth_call itself, without ever deploying it. This works because contract creation is just code execution: the constructor runs, and whatever it returns (normally the runtime bytecode) comes back as the call result. Since eth_call never commits state, nothing is deployed, and the constructor is free to return ABI-encoded data instead of bytecode.

// Off-chain JavaScript
const aggregatorBytecode = "0x608060...";  // bytecode of an aggregator that returns data in its constructor
const result = await provider.call({
    data: aggregatorBytecode + encodedArgs.slice(2)
});
// result contains whatever the constructor returned via assembly { return(...) }

The pattern works because eth_call accepts a transaction object with arbitrary call data and an optional recipient. When the to field is omitted and creation bytecode is passed as the data, the EVM treats the call as a contract creation; the constructor runs; the constructor can use assembly { return(...) } to return data; that data comes back as the call result.
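On the Solidity side, such an aggregator can be as small as the sketch below. BalanceProbe is a hypothetical contract that reports ETH balances; it is meant to be executed through eth_call only and never actually deployed.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

contract BalanceProbe {
    // Constructor arguments are ABI-encoded and appended to the creation
    // bytecode by the off-chain caller.
    constructor(address[] memory accounts) {
        uint256[] memory balances = new uint256[](accounts.length);
        for (uint256 i = 0; i < accounts.length; ++i) {
            balances[i] = accounts[i].balance;
        }

        bytes memory encoded = abi.encode(balances);

        // Return the encoded result instead of runtime bytecode. Inside an
        // eth_call these bytes become the call result; in a real deployment
        // they would be stored as (nonsensical) contract code.
        assembly ("memory-safe") {
            return(add(encoded, 32), mload(encoded))
        }
    }
}
```

Off-chain, the caller decodes the result as a uint256[] with the same ABI library it used to encode the arguments.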

Uniswap V3's QuoterV2 relies on a related revert-based trick. Its quoteExactInputSingle function calls the pool's swap; inside the swap callback the quoter reverts with the computed amounts encoded in the revert data, which rolls back every state change the swap made. The quoter catches that revert, decodes the amounts, and returns them as an ordinary return value. Because the function is not marked view and consumes far more gas than a read should, it is meant to be invoked via eth_call rather than in a transaction. The pattern lets the quoter price a swap using the pool's own swap logic without any state change persisting.

The Critical Trade-off: Don't Trust eth_call Results On-Chain

The single most dangerous misunderstanding of this pattern is using its output as on-chain input.

eth_call runs against a single snapshot of state: the block the caller specifies (most libraries default to latest), as seen by the node that answers. By the time a transaction based on that result is mined, the state may have changed, possibly because an attacker moved the price on purpose. Using eth_call to compute "the price I should accept for this swap" and then submitting a transaction that trusts that price is the canonical sandwich-attack setup.

Defenses:

  • Slippage parameters. The user computes the expected output via eth_call, then submits a transaction with min_out set to expected_out * (1 - slippage_tolerance). The on-chain transaction reverts if the actual output falls below min_out.
  • TWAP or block-bound prices. The on-chain logic does its own price check, not trusting any off-chain computation.
  • Commit-reveal. Section 3.7.4 covers this pattern, which fully removes the off-chain price from on-chain decision-making.
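The slippage defense can be sketched on-chain as follows. ToyPool is a hypothetical fee-less constant-product pool with token transfers omitted; only the minOut and deadline checks are the point of the example.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

contract ToyPool {
    error Expired(uint256 deadline);
    error SlippageExceeded(uint256 amountOut, uint256 minOut);

    // Illustrative starting reserves; a real pool tracks token balances.
    uint256 public reserveIn = 1_000_000e18;
    uint256 public reserveOut = 1_000_000e18;

    // Called off-chain via eth_call to preview the swap (display only).
    function quote(uint256 amountIn) public view returns (uint256) {
        return (reserveOut * amountIn) / (reserveIn + amountIn);
    }

    // minOut is chosen off-chain from the quote and the user's tolerance,
    // e.g. quoted * (10_000 - toleranceBps) / 10_000.
    function swapExactIn(uint256 amountIn, uint256 minOut, uint256 deadline)
        external
        returns (uint256 amountOut)
    {
        if (block.timestamp > deadline) revert Expired(deadline);

        // Recompute against current state; the off-chain quote is never trusted.
        amountOut = quote(amountIn);
        if (amountOut < minOut) revert SlippageExceeded(amountOut, minOut);

        reserveIn += amountIn;
        reserveOut -= amountOut;
        // Token transfers omitted in this sketch.
    }
}
```

The off-chain quote only sets the bound; the contract decides from its own state whether the bound is met.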

eth_call is correctly used for display (what should I show the user?), probing (does my transaction succeed before I pay for it?), and aggregation (give me 100 state values in one call). It is incorrectly used for authority (the price eth_call returned is the price I'm going to accept on-chain).

When to Use This Pattern

  • Multi-call aggregation for dApp UIs: use Multicall3 directly; don't reinvent.
  • Simulating transactions to provide UX previews (gas estimates, expected outcomes)
  • Quoters and routers that need to compute prices via complex state access
  • Block explorers and indexing tools that need rich state queries

Avoid when:

  • The result will be used as input to another transaction without slippage protection
  • The RPC node is untrusted or may be lagging (the result reflects that node's view of the chain and can be stale or forged)

Composition and Avoidance

Unlike the prior sections, these patterns do not compose into a unified example. They are independent optimization tools, each with its own justification. The composition rule is the opposite: avoid stacking them. A function that uses assembly to skip overflow checks and a selector dispatcher and depends on an eth_call-computed input has compounded its security surface in three independent ways. Each optimization should be justified independently and applied only where the gas savings clearly exceed the risk.

The mature approach to optimization in production smart contracts:

  1. Measure first. Use forge test --gas-report or forge snapshot to identify the actually expensive operations. Most contracts have one or two hotspots that dominate gas costs; everything else is irrelevant.

  2. Try high-level optimizations first. Caching storage reads in memory, reducing the number of SSTOREs, restructuring loops to avoid repeated calculations — these get you 80% of the gas savings with none of the safety risk.

  3. Use audited libraries before raw assembly. SafeERC20, OpenZeppelin's Math, BitMaps, Strings — these have the assembly written by people whose full-time job is writing safe assembly. Inherit their work.

  4. Reach for raw assembly only when (3) doesn't cover the case. And when you do, document the safety reasoning, write fuzz tests, and have it specifically reviewed.

Quick Reference

| Pattern | Bypasses | Use when | Common foot-gun |
| --- | --- | --- | --- |
| Selector + manual decode | Solidity's automatic dispatcher and type checking | Diamond proxies, generic routers, custom error decoders | Skipping length validation; assuming selector uniqueness |
| Assembly: efficient SLOAD | Storage representation abstraction | Frequently-read view functions on stable storage layouts | Storage layout refactored without updating assembly |
| Assembly: unchecked arithmetic | Overflow/underflow reverts | Provably-bounded arithmetic (loop counters, packed-type sums) | Input type changed; bound no longer holds |
| Assembly: raw call construction | ABI encoding overhead, return-value checking | Performance-critical token transfer paths in batch processors | Return data not validated; non-standard tokens not handled |
| eth_call aggregation | Per-call RPC overhead | dApp UIs displaying many state values; transaction simulation | Trusting the result on-chain without slippage protection |

Cross-References

  • Low-level mechanics — Section 4.10 (Master the EVM and Low-Level Programming) covers Yul, assembly, calldata structure, and the EVM in depth
  • Gas optimization in context — Section 3.6 covers gas optimization techniques and trade-offs at a broader level
  • Auditing assembly — Section 4.10.3 covers the auditor's perspective on inline assembly
  • Fuzzing for assembly correctness — Section 4.8 covers stateful and stateless fuzzing, the right testing strategy for assembly-heavy code
  • Slippage and price protection — Section 3.11.3 (MEV mitigation) covers slippage parameters and other defenses against the eth_call trust pitfall
  • SafeERC20 — OpenZeppelin's reference implementation of "assembly that's been done correctly"; used throughout the book's code examples