(Scope 1) From 2024-04-05
To 2024-04-10
(Scope 2) From 2024-04-22
To 2024-04-26
(Scope 3) From 2024-04-22
To 2024-04-24
For Scope 1, we audited the ZKsync-Association/zk-governance repository at commit a3e361d. The following files were in scope:
zk-governance/l1-contracts/src
├─ Guardians.sol
├─ Multisig.sol
├─ ProtocolUpgradeHandler.sol
├─ SecurityCouncil.sol
└─ interfaces
├─ IGuardians.sol
├─ IProtocolUpgradeHandler.sol
├─ ISecurityCouncil.sol
└─ IZkSyncEra.sol
As for Scope 2, following the scope above, another diff audit of the ZKsync-Association/zk-governance repository at HEAD commit c66fc0b and BASE commit 3e20e0c was performed. The following files were in scope:
zk-governance/l1-contracts/src
├─ EmergencyUpgradeBoard.sol
├─ Guardians.sol
├─ Multisig.sol
├─ ProtocolUpgradeHandler.sol
├─ SecurityCouncil.sol
└─ interfaces
├─ IPausable.sol
├─ IProtocolUpgradeHandler.sol
└─ IStateTransitionManager.sol
Further, in Scope 2, we diff audited the matter-labs/era-contracts repository at HEAD commit 189a7fe and BASE commit a86cf76 regarding relevant changes to the governance process
era-contracts/l1-contracts/contracts/
└─ state-transition
├─ IStateTransitionManager.sol
├─ StateTransitionManager.sol
├─ ValidatorTimelock.sol
└─ chain-deps/facets/
├─ Admin.sol
└─ Mailbox.sol
Lastly, in Scope 3, we diff audited the following files with the same commits as in Scope 2 above:
era-contracts/l1-contracts/contracts
├── bridge
│ ├── L1ERC20Bridge.sol
│ ├── L1SharedBridge.sol
│ └── interfaces
│ ├── IL1ERC20Bridge.sol
│ └── IL1SharedBridge.sol
└── bridgehub
└── Bridgehub.sol
Update: All the resolutions mentioned in this report are contained at commit d2acc8f of ZKsync-Association/zk-governance repository.
This upgrade aims to decentralize the protocol upgrades of zkSync Era and introduce incident response measures. The details of the new contracts and changes are outlined in the following sections.
ProtocolUpgradeHandler
The proposal life-cycle starts with a Governance token vote on L2. On successful vote, the proposal is carried over to the ProtocolUpgradeHander
on L1. Here, the upgrade proposal needs to be initialized by checking whether the proposal was indeed successful on L2. This is performed through a log inclusion check in the Mailbox
facet. After initializing the proposal to the Waiting
state, the proposal is now progressing under the influence of the trusted parties Security Council (SC) and Guardians (G). The following state transitions can happen:
Waiting
state
VetoPeriod
Waiting
and after 90 days the proposal moves to ExecutionPending
Canceled
VetoPeriod
state
Canceled
ExecutionPending
immediatelyExecutionPending
ExecutionPending
state
Ready
Ready
state
Done
The Security Council is a 6/12 multisig and the Guardians is a 5/8 multisig.
In the event of suspicious activity, the security council can perform a soft or hard freeze through the ProtocolUpgradeHandler
. Either freeze will pause the BridgeHub
, shared bridge, and all hyperchains of the StateTransitionManager
(which is owned by the ProtocolUpgradeHandler
). The freeze modes differentiate by pause duration and cooldown time:
The security council can unfreeze everything anytime. Alternatively, anyone can unfreeze once the freeze period ends.
In case of a critical security issue which needs to be addressed promptly, an emergency upgrade can be executed. This upgrade follows the usual proposal structure, but bypasses the whole proposal process outlined above for direct execution. It can only be executed by the Emergency Upgrade Board, which requires approval of the 9/12 security council, 5/8 Guardians, and ZK Association multisig. An emergency upgrade can be executed at all times (frozen or not), unfreezes the contracts right away, and resets the freeze timers. Thus, a hard freeze and emergency upgrade could be followed by another hard freeze.
Parts of the bridge architecture are described in our Fee Model and Token Bridge audit report. The following is a summarized overview of the contracts:
L1ERC20Bridge
Previously, the L1ERC20Bridge
did not support ETH transfers. However, it was possible to send an arbitrary amount of ETH along with any ERC20
token deposit. This contract was deprecated in favor of the L1SharedBridge
. The L1ERC20Bridge
is still needed, for backward compatibility with already integrated projects.
L1SharedBridge
The L1SharedBridge
is an upgradeable contract that helps bridging assets between L1 and hyperchains, supporting both ETH and ERC20
tokens.
BridgeHub
The BridgeHub
acts as a hub for bridges and hyperchains, allowing them to communicate with all hyperchain contracts from a single point and gathering all L1 assets to the L1SharedBridge
.
The following changes were introduced within the era-contracts scopes:
L1SharedBridge
and Bridgehub
contracts.{un}freezeDiamond
functions were set to be for the StateTransitionManager
only.StateTransitionManager
admin may now also set validators. Only the owner was previously allowed to do so.StateTransitionManager
was changed to internally handle the mapping of chain-ID to addresses using OpenZeppelin's EnumerableMap
library. This includes extending the interface with functions to query the IDs and addresses.During the audit, the auditors have identified the following considerations:
Multisig
contracts do not implement a replayability measure themselves (despite for the (un)freezing). Instead, they are dependent on the ProtocolUpgradeHandler
implementation they interact with, which prevents replayability. However, if these interactions get extended in the future the replayability counter-measure needs to be re-evaluated.Ready
state, while their order of execution is not enforced. Thus, it is expected by the Guardians and Security Council that the order of execution is as intended. Note that the feature of a predecessor upgrade was removed compared to the previous Governance implementation.L1SharedBridge
and Bridgehub
is callable by the owner of those contracts, which is set as the ProtocolUpgradeHandler
contract. Upon a soft or a hard freeze, the ProtocolUpgradeHandler
would pause the L1SharedBridge
, Bridgehub
, and all the hyperchains.L1SharedBridge
, the owner can pause bridging/depositing, claiming, and withdrawing functions. While pausing was not added to the L1ERC20Bridge
, the functions deposit
, claimFailedDeposit
, and finalizeWithdrawal
rely on their respective counterparts from the L1SharedBridge
. This means that when L1SharedBridge
is paused, the L1ERC20Bridge
contract is also paused.Bridgehub
, the functions createNewChain
, requestL2TransactionDirect
, and requestL2TransactionTwoBridges
are now pausable. This means no new chains can be added, and no cross-chain communication can happen through the BridgeHub
.ProtocolUpgradeHander
, the hyperchains, and bridges.Update: Please note that the term 'hyperchain' has been updated in later releases to 'ZKchain'.
As of Solidity 0.8.21, immutable variables do not need to be explicitly assigned a value in the constructor of the contract. This could lead to cases where contracts are deployed with immutable variables unintentionally set to their default values.
Within the ProtocolUpgradeHandler
contract, the immutable state variable ZKSYNC_ERA
was not initialized in the constructor. This basically makes the whole contract inoperable as no protocol upgrades can be processed through the startUpgrade
function, which is dependent on this address.
Consider setting the correct address during contract construction.
Update: Resolved at commit 348a26a.
Multisig
Is Incompatible with ERC-4337EIP-4337 is a standard that allows smart contracts to behave like user accounts, thereby extending the user account landscape of Externally Owned Accounts (EOA) with smart contract accounts.
To verify the signatures, the Multisig
contract uses the OpenZeppelin ECDSA
library that makes a call to the ecrecover
precompile contract, which is incompatible with smart contract accounts.
Hence, to enable EIP-1271 smart contract account signature checks, consider using the SignatureChecker
library instead.
Update: Resolved at commit fef52f6. Consider using an EnumerableSet
for the multisig members, to check the signers against the members and give a meaningful revert message otherwise. The ZKsync Association team stated:
The proposal makes sense to me, but I am in favour of keeping dependency list as small as possible, especially when there is no strong reasons to add new one. I will acknowledge this issue for now.
ProtocolUpgradeHandler
Design FlawsThe ProtocolUpgradeHandler
is the state machine of processing the Security Council's and Guardian's say on a protocol upgrade proposal. The Council can approve an upgrade and the Guardians can approve as well as veto an upgrade. These calls are dependent on different states that an upgrade can process through.
The following was identified as design flaws when it comes to handling the state of an upgrade:
approveUpgradeGuardians
and refrainFromVeto
indicate the same intention of the Guardians approving the upgrade. However, approveUpgradeGuardians
can only be called in Waiting
state and refrainFromVeto
only in VetoPeriod
state, thereby making the multisig operations more complex to handle.Waiting
state), but then with the Security Council's approval the VetoPeriod
state is entered which allows the Guardians to veto the previously approved upgrade, hence, making the upgrade process ambiguous.approveUpgradeGuardians
, but the Security Council calls approveUpgradeSecurityCouncil
just before them, then the state changes from Waiting
to VetoPeriod
, thus, making the Guardians' call fail that requires the state to be Waiting
.veto
overrules them. They can only veto in the VetoPeriod
state however, which is only entered after the Council's approval. This dependency does not make sense given the prior statement of the Guardians' last say, i.e., the Guardians' should be able to cancel (veto) an upgrade in Waiting
state.Consider re-evaluating the design choices of this contract. For example, instead of setting the state directly in the UpgradeStatus
, set the party's choices that the state will be evaluated from. This could make the state evaluation more streamline and remove the redundancy outlined above.
Update: Acknowledged, will resolve. The ZKsync Association team stated:
Acknowledged. The current design was chosen to eliminate additional complexity related to the proposed approach. We may reevaluate the decision if the flow will change.
EnumerableMap
The StateTransitionManager
was refactored to handle the chain ID to contract address mapping of the hyperchains using OpenZeppelin's EnumerableMap
library.
This has introduced an issue in the getAllHyperchains
function, where the chain IDs are saved in the keys
array to query the addresses from the EnumerableMap
using the get
function. However, the get
function expects the input key[i]
instead of the given index i
, which effectively leads to a revert and renders the function unusable. Consider using either the at
function to query the addresses by index, or using the proper key from the array with get
.
Furthermore, while not posing an issue, the library usage around the getHyperchain
function might not fully comply with the libraries intention. To query a hyperchain address by ID, tryGet
is used. While ignoring the success value, this means that a chain ID outside of the mapping will return the zero address. This leads to two considerations:
ValidatorTimelock
and BridgeHub
) make use of the getHyperchain
function to perform calls to the chain ID's address. For a non-existent ID a call to the zero-address would lead to an inexplicit revert. Consider using the get
function with a custom error message within getHyperchain
to fail more explicitly in these cases.createNewChain
function execution is skipped when a chain ID is already registered. This is done by checking that the getHyperchain
function output is not zero. Following the previous suggestion, a revert would not be desired here. Hence, consider using the contains
function to see whether a chain ID is part of the map.When applying above suggestions it is crucial to have an extensive test suite in place that confirms the functional correctness of these changes. Most importantly, evaluate other instances where a call to getHyperchain
must not revert.
Update: Partially resolved in pull request #403 at commit 1995a97. The ZKsync Association team stated:
Partially fixed. We fixed the issue with
getAllHyperchains
, but decided to keepgetHyperchain
the same as a convention we use across all other zkSync contracts.
Throughout the codebase, there are multiple code instances that do not have docstrings:
Bridgehub
contract in Bridgehub.sol
.receiveEth
function in L1SharedBridge.sol
.Consider thoroughly documenting all contracts and functions (and their parameters) that are part of any contract's public API. Functions implementing sensitive functionality, even if not public, should be clearly documented as well. When writing docstrings, consider following the Ethereum Natural Specification Format (NatSpec).
Update: Resolved in pull request #432 at commit ca5f7e1.
Throughout the codebase, there are several parameters which are not self-explanatory and could benefit from additional docstrings. For instance:
_batchNumber
, _index
, _message
, and _proof
parameters of the proveL2MessageInclusion
function in Bridgehub.sol
._batchNumber
, _index
, _log
, and _proof
parameters of the proveL2LogInclusion
function in Bridgehub.sol
._l2TxHash
, _l2BatchNumber
, _l2MessageIndex
, _l2TxNumberInBatch
, _merkleProof
, and _status
parameters of the proveL1ToL2TransactionStatus
function in Bridgehub.sol
._target
and _targetChainId
parameters of the transferFundsFromLegacy
function in L1SharedBridge.sol
._prevMsgSender
, _l1Token
and _amount
parameters of the bridgehubDepositBaseToken
function in L1SharedBridge.sol
._prevMsgSender
, _l2Value
and _data
parameters of the bridgehubDeposit
function in L1SharedBridge.sol
._txDataHash
and _txHash
parameters of the bridgehubConfirmL2Transaction
function in L1SharedBridge.sol
._prevMsgSender
parameter of the depositLegacyErc20Bridge
function in L1SharedBridge.sol
.Consider documenting to help with legibility and understanding of the code.
Update: Resolved in pull request #433 at commit f1d23df.
The codebase under review has no tests. Insufficient testing, while not a specific vulnerability, implies the high probability of additional hidden vulnerabilities and bugs. Testing provides a full implicit specification along with the exact expected behaviors of the codebase, especially important when adding in novel functionality. A lack thereof increases the chances that correctness issues will be missed (like Immutable Variable Not Set In Constructor). It also results in more effort to establish basic correctness and reduces effort spent exploring edge cases, thereby increasing the chances of missing complex issues.
Consider implementing unit and invariant tests for the codebase to extend the security guarantees of the protocol.
Update: Acknowledged, will resolve. The ZKsync Association team stated:
Acknowledged. The code in scope wasn't finalized fully and expect another iteration. We would deliver the complete coverage for the next audit stage.
The following instances of incorrect documentation were identified:
Guardians
contract has the Natspec title of "Security Council", thereby indicating the wrong contract.ProtocolUpgradeHandler
code is sectioned with comments. One of these sections says "FALLBACK", although the following function is the receive
function.The following instances of incomplete documentation were identified:
Guardians
contract could be mentioning the veto aspect.guardiansApproval
field of the UpgradeStatus
struct is not documented.proveL2MessageInclusion
function of the IZkSyncEra
interface could be documented.Consider addressing above issues to improve code clarity.
Update: Resolved at commit d701a29. The ZKsync Association team stated:
Fixed. The "fallback" section in the code is intentional, even though the only
receive
function is implemented. The section provided the overall description of functions under the common name.
When checking signatures in Multisig.sol
, a while
loop is used to efficiently match a member
to a recovered signer, otherwise, the counter currentMember
increases until one is found. Since the address array members
is sorted and has a fixed length (8 for guardian members and 12 for security council members), an array out-of-bounds
error will occur in the following scenarios:
Consider expressing more meaningful error message for each above scenario for better revert diagnosis.
Update: Acknowledged, not resolved. The ZKsync Association team stated:
Acknowledged. The proposed change requires adding complexity to the small but very error-prone logic.
When executing an upgrade proposal, a low-level call is performed on an arbitrary target with arbitrary data. If the call is not successful, it intends to revert with an error message. However, when the target is an account without code, the success
variable returned by the call is true
, thus resulting in risks of silent failures.
Consider checking if the target has code when the data is non-empty.
Update: Resolved at commit 5341815.
There are two functions, namely unfreeze()
and reinforceUnfreeze()
, that anyone can call to unfreeze all hyperchains after the block timestamp passes the protocolFrozenUntil
value.
However, depending on which of these two functions are called, there could be confusion in freezeStatus
as well as events emitted.
unfreeze()
is called, the freezeStatus
is reset to None
, the protocolFrozenUntil
is reset to 0
, and an Unfreeze()
event is emitted.reinforceUnfreeze()
is called, the freezeStatus
remains as it is, which could be Soft
or Hard
despite it has been unfrozen, and protocolFrozenUntil
remains as it is with a ReinforceFreeze()
event emitted.Thus there are two paths for performing the same functionality, but with different state variable updates and event emissions. Consider only allowing one path for anyone to unfreeze after the expiration timestamp for operational clarity and consistency.
Update: Partially resolved at commit 15831c6. The ZKsync Association team stated:
Fixed. We decided to keep two functions but restrict unfreeze to happen only once.
When initiating a softFreeze
from SecurityCouncil.sol
, the number of signatures required is specified by the softFreezeThreshold
variable that can be set to 0 through the setSoftFreezeThreshold
function. In such a scenario, no signature is required to initiate a softFreeze
on all hyperchains, thereby exposing this privilege to everyone.
Consider setting a non-zero minimum number of signatures to initiate a softFreeze
or explicitly documenting the intention of allowing no signatures.
Update: Resolved at commit da33bf6.
Multisig
The constructor
of Multisig.sol
does not check if the _members.length
is greater than or equal to the _eip1271Threshold
. Thus it is possible to instantiate a Multisig
that sets a higher threshold for EIP1271_THRESHOLD
than available members. This would render the isValidSignature
function unusable, thus its external integration would also revert.
Although both SecurityCouncil
and Guardians
have instantiated Multisig
correctly, consider checking that the EIP-1271 threshold is not greater than the member count to avoid any potential mis-instantiation in the future.
Update: Resolved at commit fd86a0d.
The emergency upgrade process unfreezes the bridges and hyperchains through the ProtocolUpgradeHandler
once the upgrade is executed. This is performed by invoking the internal _unfreeze
function which does not fire either of the {Reinforce}Unfreeze
events. Thus, the contracts would resume to be operable without having emitted the respective event, besides emitting EmergencyUpgradeExecuted
.
Consider emitting the Unfreeze
event by adding it to the emergency upgrade function. Alternatively, consider making the code more self-contained by having the emergency board calling the unfreeze
function separately after the upgrade execution.
Update: Resolved at commit 8b62b1c.
initialCutHash
May Go out of Sync With protocolVersion
When creating a new chain through the StateTransitionManager
, its protocol version is initialized as the StateTransitionManager
's latest protocolVersion
. Further, the DiamondCutData
to initialize all facets is determined by checking its hash against the initialCutHash
, which can be updated by the owner. Over the course of protocol upgrades, various facets might get added and removed. Thus, it is important that initialCutHash
is in sync with the initialized protocol version to ensure that a new chain is initialized appropriately.
However, this requires the owner to update the initialCutHash
separately at each upgrade. This creates an opportunity to introduce errors that lead to a mismatch between a chain's version and implementation. Consider having a way to automatically update and validate the initialCutHash
with each upgrade so that the initialCutHash
is always consistent with the protocolVersion
. Alternatively, consider adding the protocol version to the InitializeDataNewChain
struct such that the version will be provided through the _diamondCut
input and hence checked to match the initialize diamond cut according to the initialCutHash
. Note that this would mean to reorder the InitializeData
struct.
Also consider changing the name of initialCutHash
to, for example, cumulativeCutHash
to explicitly emphasize that this is an accumulative quantity that takes into account of all the upgradeCutHash
(which represents diffs) up to the current version.
Update: Acknowledged, will resolve. The ZKsync Association team stated:
Acknowledged. This is a good finding we would address in one of the next protocol upgrades.
Within L1SharedBridge.sol
, there are state variables that lack an explicitly declared visibility. For instance:
For clarity, consider always explicitly declaring the visibility of variables, even when the default visibility matches the intended visibility.
Update: Resolved in pull request #430 at commit d0e46f2. Consider making the non-constant variables public for easier querying of the values, while also extending the interface accordingly.
Providing a specific security contact (such as an email or ENS name) within a smart contract significantly simplifies the process for individuals to communicate if they identify a vulnerability in the code. This practice is quite beneficial as it permits the code owners to dictate the communication channel for vulnerability disclosure, eliminating the risk of miscommunication or failure to report due to a lack of knowledge on how to do so. In addition, if the contract incorporates third-party libraries and a bug surfaces in those, it becomes easier for their maintainers to contact the appropriate person about the problem and provide mitigation instructions.
The Bridgehub
contract does not have a security contact.
Consider adding a NatSpec comment containing a security contact above each contract definition. Using the @custom:security-contact
convention is recommended as it has been adopted by the OpenZeppelin Wizard and the ethereum-lists.
Update: Resolved in pull request #431 at commit 5458700.
Throughout the codebase, there are various functions with unnecessarily permissive visibility. For instance:
L1ERC20Bridge
contract, the _depositFundsToSharedBridge
function with internal
visibility could be limited to private
.L1SharedBridge
contract, the _depositFunds
, _getDepositL2Calldata
, _getERC20Getters
, _claimFailedDeposit
, _isEraLegacyEthWithdrawal
, _isEraLegacyTokenWithdrawal
, _isEraLegacyDeposit
, _finalizeWithdrawal
, _checkWithdrawal
and _parseL2WithdrawalMessage
functions with internal
visibility could be limited to private
.To better convey the intended use of functions and to potentially realize some additional gas savings, consider changing a function's visibility to be only as permissive as required.
Update: Acknowledged, not resolved. The ZKsync Association team stated:
Acknowledged. The contract is not designed to be inherited.
To deploy the ProtocolUpgradeHandler
contract, it requires the addresses of the SecurityCouncil
and Guardians
contracts. However, to instantiate the SecurityCouncil
and Guardians
contract, it requires an instance of the ProtocolUpgradeHandler
. This circular instantiation causes inconvenience during deployment and testing.
Consider adding documentation for a deployment plan with deterministic addresses.
Update: Acknowledged, will resolve. The ZKsync Association team stated:
Acknowledged. The address for each contract will be precalculated. Documentation and scripts planned for the next audit.
Within ProtocolUpgradeHandler.sol
, there are state variables that lack an explicitly declared visibility. For instance:
GUARDIANS_VETO_PERIOD
state variable.UPGRADE_DELAY_PERIOD
state variable.UPGRADE_WAIT_OR_EXPIRE_PERIOD
state variable.L2_PROTOCOL_GOVERNOR
state variable.ZKSYNC_ERA
state variable.For clarity, consider always explicitly declaring the visibility of variables, even when the default visibility matches the intended visibility.
Update: Resolved at commit 3ed956a and at commit 481dcf4.
Throughout the codebase, the following typographical errors were identified:
Consider addressing the issues above to ease the readability of the codebase.
Update: Resolved at commit 4fe0ac7.
The ProtocolUpgradeHander
implements the unrestricted public functions getUpgradeStatusNow
and updateUpgradeStatus
, which are not given in the IProtocolUpgradeHandler
interface.
For completeness, consider adding these functions to the interface.
Update: Resolved at commit 02d88c3.
The reinforceUnfreeze
function is missing from the interface IProtocolUpgradeHandler
. Consider adding it for code completeness.
Update: Resolved at commit 02d88c3.
In for-loops it is more gas efficient to increment the index with ++i
rather than i++
. Consider applying this change in the for-loop in the _freeze
and _unfreeze
function.
Update: Resolved at commit 60f2af5.
Throughout the codebase, the following typographical errors were identified:
Consider correcting the above mistakes to improve the code readability.
Update: Resolved at commit 41a7df1.
Throughout the codebase, there are state variables that lack an explicitly declared visibility. For instance:
EIP1271_THRESHOLD
state variable in Multisig.sol
.SOFT_FREEZE_PERIOD
state variable in ProtocolUpgradeHandler.sol
.HARD_FREEZE_PERIOD
state variable in ProtocolUpgradeHandler.sol
.SOFT_FREEZE_COOLDOWN_PERIOD
state variable in ProtocolUpgradeHandler.sol
.HARD_FREEZE_COOLDOWN_PERIOD
state variable in ProtocolUpgradeHandler.sol
.STATE_TRANSITION_MANAGER
state variable in ProtocolUpgradeHandler.sol
.BRIDGE_HUB
state variable in ProtocolUpgradeHandler.sol
.SHARED_BRIDGE
state variable in ProtocolUpgradeHandler.sol
.protocolFrozenUntil
state variable in ProtocolUpgradeHandler.sol
.softFreezeCooldownUntil
state variable in ProtocolUpgradeHandler.sol
.hardFreezeCooldownUntil
state variable in ProtocolUpgradeHandler.sol
.softFreezeNonce
state variable in SecurityCouncil.sol
.hardFreezeNonce
state variable in SecurityCouncil.sol
.softFreezeThresholdSettingNonce
state variable in SecurityCouncil.sol
.unfreezeNonce
state variable in SecurityCouncil.sol
.softFreezeThreshold
state variable in SecurityCouncil.sol
.For clarity, consider always explicitly declaring the visibility of variables, even when the default visibility matches the intended visibility.
Update: Resolved at commit 3ed956a and at commit 481dcf4.
The reinforceFreeze
and reinforceUnfreeze
functions have a loop over all hyperchain IDs to perform their respective action. In the rare case where the execution would get stuck at a particular ID for some unforeseen reason, it would be helpful to have reinforcement functions that can target a particular hyperchain ID.
Update: Acknowledged, will resolve. The ZKsync Association team stated:
Acknowledged. This will be resolved with the next audit together with using manual call error handling instead of try/catch.