Dojo Security Review

Written by OpenZeppelin Security | November 12, 2024

Summary

Type: Gaming
Timeline: From 2024-09-02; To 2024-09-12
Languages: Cairo

Total Issues: 11 (9 resolved)
Critical Severity Issues: 1 (1 resolved)
High Severity Issues: 1 (1 resolved)
Medium Severity Issues: 1 (1 resolved)
Low Severity Issues: 4 (4 resolved)
Notes & Additional Information: 4 (2 resolved)

Scope

We reviewed the dojoengine/dojo repository at commit e1a349b.

In scope were the following files:

 crates/dojo-core/src
├── contract
│   ├── base_contract.cairo
│   ├── contract.cairo
│   └── upgradeable.cairo
├── lib.cairo
├── model
│   ├── introspect.cairo
│   ├── layout.cairo
│   ├── metadata.cairo
│   └── model.cairo
├── storage
│   ├── database.cairo
│   ├── layout.cairo
│   ├── packing.cairo
│   └── storage.cairo
├── utils
│   └── utils.cairo
└── world
    ├── config.cairo
    ├── errors.cairo
    ├── update.cairo
    └── world_contract.cairo

System Overview

Dojo abstracts away the complexities of on-chain development to allow developers to focus on other aspects of their applications. This is achieved through a toolchain for building, migrating, deploying, proving, and settling extensible application state, known as "worlds", in production. An important feature of Dojo is that it provides a way to create provable game states that depend on off-chain transitions through zero-knowledge (ZK) proofs. This helps solve an issue with on-chain games having state and logic on public networks with all state transitions being verified by node providers, which means that valid transitions are bounded by the cost and computational limits of the network. Another important feature is Dojo's set of macros that turn into comprehensive queries at compile time. This framework provides an abstraction over the contract storage architecture to dynamically create arbitrarily complex tables, as well as an interface to read and write records according to arbitrary business logic. This review focussed on the associated smart contracts.

Dojo is made up of three parts:

World: Acts as the database maintaining all models, systems, and permissions. These permissions are further described in the Privileged Roles section. Upon state changes, the world emits an event to notify the Torii indexer tool.
Models: Stores structured data of the world defined with #[dojo::model]. Essentially, these are Cairo structs that automatically implement the Introspect trait which outlines the data structure for the model so that changes can be easily tracked.
Systems: Functions for updating the game state and interacting with the world contract defined with #[dojo::contract]. These hold the game logic and are able to change the world state, provided they have explicit permission from the relevant model owner. Permissions are defined at the contract level, which means that all the functions inside the same contract will inherit the same permissions.

The basic insight is that any complex data structure can be described recursively as a collection (i.e., array, struct, tuple, or enum) of either sub-collections or fixed-sized data. The fixed-size data is also divided into chunks of at most 256 words. In this way, a table record can be described with a tree structure, where a child node in the tree represents a single element in a collection (which itself could be a collection). The leaves are fixed-size data chunks and are saved at a pseudorandom storage location derived from the corresponding path in the tree. This makes it possible to read or write any member of an arbitrarily complex collection without interfering with the other members. Anyone can define the data structure of a table using a Model contract, and the Dojo framework handles the corresponding mapping to storage locations.

Security Model and Trust Assumptions

The World database is intended to be open and extensible with minimal restrictions. The overall principle is that an address with write access to a storage area (detailed below) can modify the storage arbitrarily. The database does not enforce table structures, input validation rules, reactive hooks or any sort of data consistency checks. The only rule is that different storage areas are isolated so they do not interfere with each other.

This means that users should trust that all addresses with write access will only publish valid data and will respect the intended table structures to avoid corrupting the storage area. In practice, applications can achieve this by granting access to Systems (smart contracts) that enforce the application-level logic and consistency requirements.

Privileged Roles

The system defines the following resources:

The World contract is a top-level container for all other resources.
There is an arbitrary number of Namespace partitions inside the World.
Database tables are represented by Model smart contracts inside a namespace.
Business logic is implemented by Contract smart contracts deployed from a namespace.

Each resource can have any number of owner and writer addresses, which follow the same hierarchy. This means a world owner implicitly owns all the other resources, and a namespace owner or writer implicitly has the same role for models and contracts inside the namespace.

The generic powers of an owner address include:

The ability to both grant and revoke the owner and writer roles for the resources.
The ability to set and update an arbitrary metadata_uri for each resource, which can be used to provide additional information associated with the resource.

The specific roles and powers are distributed as follows:

Whichever address deploys the World becomes its first owner.
Any address can claim an unused Namespace and become its owner.
A namespace owner or writer can:
- Register a new model in the namespace.
- Replace the contract associated with a model in the namespace.
- Deploy a new system contract in the namespace.
A system contract owner can:
- Initialize the contract with arbitrary calldata.
- Upgrade the contract with a new implementation.
A model owner or writer can:
- Set a record (or component of a record) in the model.
- Delete a record (or component of a record) from the model.
A world owner can:
- Upgrade the whole World contract implementation.

There is also a mechanism for a world owner to update arbitrary storage locations using a mechanism configured by the original world creator. The expectation is that this could be used by the L3 architecture to directly prove and update the world state analogously to how L2 ZK rollups publish periodic state updates to L1.

Critical Severity

Resources Can Be Overwritten

The deploy_contract function registers the system under a selector chosen by the contract. This means that an attacker can deploy a malicious contract in order to overwrite an existing resource record from any namespace.

When registering or upgrading a model, the selector is validated to not be registered under another resource. However, this check is missing for the deploy_contract function. There are several potential consequences but the most notable is that the attacker could become an owner of the WORLD resource, allowing them to update any model.

Consider ensuring that the selector is unused before registering the new contract.

Update: Resolved in pull request #2.

High Severity

Inconsistent namespace separation

When a new namespace is created, its identifier is the hash of its name. However, when registering or updating a model, the namespace and its hash are provided by the model contract and are not validated to be consistent with each other. Similarly, when deploying a new contract, the namespace and its hash are provided by the contract and are not checked for consistency. Since the hash is used for access control while the name is used in the corresponding events, any discrepancy between the two would produce an incorrect transcript of the ModelRegistered, ModelUpgraded, or ContractDeployed events.

Moreover, when creating or upgrading a model, the selector is provided by the new model and is not inherently dependent on the namespace. This means that if the caller has the appropriate permissions, the ModelUpgraded event can change the name and namespace of the model, which would create a nonsensical transcript. Although these examples do not corrupt the database directly, one of the design principles of the system is that the Torii indexer should be able to reconstruct and provide the state from events. Thus, any inconsistency would mislead users of the indexer.

Another consequence of this design is that well-behaved models and contracts derive their selectors from their other properties (as demonstrated by the metadata model). This means that they are determined at compile-time and are resistant to change. However, attackers can preempt model and contract deployments in order to claim those selectors, thereby preventing (or at least complicating) valid deployments. This could occur in real time with front-running attacks, or by simply reviewing or predicting selectors from open source code. A robust namespace mechanism should prevent this possibility.

To this end, consider enforcing consistency between all provided namespaces and their hashes, ensuring that the selectors are derived from the relevant name and namespace. The hashes could be cached when first derived to avoid recomputing them multiple times. It is worth noting that a model provides its namespace hash every time it is updated, which means it can arbitrarily change its namespace for the purpose of access control. A similar observation applies to contract and model tags which do not have to match their names or namespaces and can change over time. These would not affect event transcripts, but they might be worth considering in any potential updates to the namespace consistency requirements.

Update: Resolved in pull request #2.

Medium Severity

Imprecise Permissions

The codebase uses a hierarchy of permissions for most access control. In particular, a WORLD owner indirectly owns all resources while a namespace owner indirectly owns all models inside the namespace. However, there are some deviations from this model:

Only a direct contract owner can initialize the contract. A namespace owner cannot call this function whereas a world owner must first grant themselves direct ownership.
To upgrade a model, the caller must have namespace access and must also be a direct owner. Ideally, either permission would be sufficient. In addition, a world owner must first grant themselves direct ownership. This pattern is necessary as long as the selector does not depend on the namespace but may be simplified when the Inconsistent namespace separation issue is resolved.

Consider strictly adhering to the hierarchical permissions by fixing these deviations from the model.

Update: Resolved in pull request #4.

Low Severity

Incorrect Error Message

When upgrading a contract, if the selector does not correspond to a known contract, it panics with an invalid_resource_selector error. However, the code already establishes that the selector is known. This may confuse the user as to why the execution failed due to the invalid_resource_selector error.

Consider using the resource_conflict error in this panic case, providing more clarity on why the program failed to execute.

Update: Resolved in pull request #5.

Missing Validation

In contrast to most storage layout functions, write_enum_layout reads a value without first ensuring that it is within the span.

Consider adding a validation in the write_enum_layout function which ensures that the value being read is within the span. This will help improve code clarity and readability.

Update: Resolved in pull request #6.

Incorrect Introspection Sizes

The Introspect implementations of the signed integers all describe the layout as requiring 251 bits. All models automatically implement the Introspect trait which outlines the data structure for the model. If a user is defining some built-in Cairo types, they just have to derive Introspect, but if the type is an unsupported type, then they have to manually implement the Introspect trait. Each built-in Cairo type describes the layout and has a specific bit size corresponding to the data type. However, this is not the case for the signed integers which, again, always require 251 bits.

Consider defining the layout using the data size of the signed integers.

Update: Resolved. Not an issue.

In Cairo, all signed integers are represented using the full felt252 size.

Misleading Comments

Throughout the codebase, multiple instances of misleading comments were identified:

The set_differ_program_hash function comment describes a non-existent config_hash parameter.
The deploy_contract function comment describes a non-existent init_calldata parameter.
The set function comments are missing the offset parameter.
The revoke_writer function comments incorrectly mention "model" (instead of "resource") multiple times.

Consider updating the comments to align them with the code implementation. This will help improve the readability and maintainability of the codebase.

Update: Resolved in pull request #7.

Notes & Additional Information

Code Simplifications

Throughout the codebase, multiple opportunities for code simplification were identified:

This code unnecessarily assigns a new variable y. Instead, x can be used.
The MIN_BYTE_ARRAY_SIZE constant is redundantly cast into a u32 type.
There are multiple test files (1, 2, 3) interspersed with production code. They could be moved to separate files.
There are unused errors (1, 2) that could be removed.
The calculate_packed_size function assumes at least one element with a non-zero size. Even if this is true in the codebase, for the sake of logical reasoning, it should either start from 0 or this assumption should be asserted.
Since the main if block always exits the function immediately, this else block is unnecessary.

Consider implementing the aforementioned code simplification suggestions to improve the clarity and maintainability of the codebase.

Update: Acknowledged, will resolve.

Naming Suggestions

Throughout the codebase, multiple opportunities for improved function and variable naming were identified:

The fpow function uses the square-and-multiply version of the algorithm. As such, the double variable should be called square.
base_address has type StorageAddress (not StorageBaseAddress) so calling it base_address can cause confusion.
The name fill_with_zeroes assumes that it will be passed an empty array. Instead, it should be called append_zeroes to represent its actual behavior.

Consider implementing the above-listed renaming suggestions to improve code clarity.

Update: Acknowledged, will resolve.

Typographical Errors

Throughout the codebase, multiple instances of typographical errors were identified:

alligment should be "alignment".
poped should be "popped".
packingfelt is missing a space.
Multiple instances of "it's" being used incorrectly were identified in the comments of the deploy_and_get_metadata and get_differ_program_hash functions, and in the panic_with_byte_array panic error. All of these should be "its".
The following comment "Panics if the caller has NOT the writer role on the namespace." for the assert_caller_namespace_write_accessfunction should be "Panics if the caller does NOT have the writer role on the namespace." A similar mistake was found in assert_caller_model_write_access.
This comment for the any_none function is missing a word.

Consider correcting the aforementioned typographical errors to improve the readability of the codebase.

Update: Resolved in pull request #8.

Magic Numbers

Using values directly without first storing them in a constant (i.e., "magic numbers") could lead to confusion for future developers and readers of the codebase. Throughout the codebase, multiple instances of unexplained magic numbers were identified.

In the database.cairo file, 0 is used to refer to the address domain multiple times (1, 2, 3, 4).
In layout.cairo, the literal 251 has been used several times to refer to the PACKING_MAX_BITS constant.

To improve the readability of the codebase, consider storing fixed values in constants and then using those in the code.

Update: Resolved in pull request #9.

Conclusion

Dojo is a framework that is used to implement flexible and user-friendly blockchain databases, which can then be used to create provable games and autonomous Worlds. Dojo provides an abstraction layer over the contract storage architecture to dynamically create arbitrarily complex tables, along with interfaces to read and write records according to custom business logic.

The security review yielded one critical-severity and one high-severity issue. In addition, multiple recommendations were made to ensure that the codebase follows best practices, reducing the potential attack surface and enhancing overall code quality. The Cartridge Dojo engine team was very forthcoming and always responded to our questions promptly while providing helpful information.

View full post