Evolution of Smart Contract Security in the Ethereum Ecosystem

by Manuel Araoz

The following is an adaptation of my Devcon3 talk from November 2nd, 2017.

Pre-history: The Dark Age

In May of 2016, Peter Vessenes analyzed a sample of smart contracts published online to assess their complexity and security. His conclusion: “Ethereum contracts are going to be candy for hackers.” This observation was prescient, as just one month later the DAO hack occurred, resulting in the loss of 3.6 million ether.

In the following months, more vulnerabilities were uncovered as the community became increasingly concerned with smart contract security. Yet even by Devcon2—which was held in September of 2016—smart contracts were still riddled with poor programming practices. That year, roughly 50% of projects holding significant amounts of funds were hacked.

A lack of best practices exacerbated frequent programming errors. For example, Rubixi was a Ponzi scheme formalized in a smart contract. The contract was originally named DynamicPyramid, but the developer renamed it to Rubixi without updating the constructor name. In Solidity at the time, a constructor was simply a function sharing the contract’s name, so the mismatched constructor became an ordinary public function: anyone could call it, claim ownership of the contract, and steal its funds.
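A simplified illustration of the bug, in the era-appropriate Solidity style (the function bodies are hypothetical sketches, not Rubixi’s actual code):

```solidity
// The contract was renamed from DynamicPyramid to Rubixi, but the
// constructor kept the old name. Since pre-0.4.22 Solidity treated a
// function named after the contract as the constructor, the mismatch
// turned it into an ordinary public function.
contract Rubixi {
    address private creator;

    // Intended as the constructor -- after the rename, anyone can call
    // this and become the contract's creator.
    function DynamicPyramid() {
        creator = msg.sender;
    }

    function collectAllFees() {
        // Era-appropriate guard style, before require() existed.
        if (msg.sender != creator) throw;
        creator.send(this.balance);
    }
}
```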

The Age of Enlightenment

In September of 2016, we started OpenZeppelin to help standardize best practices for smart contract development. Today, more than $1.5 billion of digital assets are powered by OpenZeppelin smart contracts.

While we helped jumpstart a community effort around smart contract audits and security, the Ethereum platform itself has also matured. We’ve seen the introduction of the ERC20 token standard, which improved interoperability, along with security fixes like EIP150, which removed the possibility of a call depth attack. Solidity gained new keywords such as require, assert, transfer, and revert, all of which made building secure smart contracts easier.
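As a hedged illustration (a toy contract, not from the talk), the newer keywords replace error-prone patterns from early Solidity:

```solidity
contract Vault {
    mapping(address => uint256) public balances;

    function deposit() public payable {
        balances[msg.sender] += msg.value;
    }

    function withdraw(uint256 amount) public {
        // Old style: if (balances[msg.sender] < amount) throw;
        // require() validates inputs and external conditions.
        require(balances[msg.sender] >= amount);

        balances[msg.sender] -= amount;

        // transfer() reverts on failure, unlike the older send(),
        // whose boolean return value was easy to forget to check.
        msg.sender.transfer(amount);
    }
}
```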

We’ve also seen outdated tech phased out. In June of 2017, Zeppelin completed its audit of Serpent, finding eight critical vulnerabilities. This discovery led us to find a critical vulnerability in the REP token (which was written in Serpent) that would have allowed anyone to freeze it forever. Serpent was subsequently deprecated, with work focusing on its successor: Viper.

Most recently, the Byzantium hard fork brought several new tools to improve security. The addition of a precompile for big integer modular exponentiation now allows for RSA signature verification. The new STATICCALL opcode allows for safer calls to untrusted contracts, and the combination of the new return data opcodes and REVERT creates better ways to implement upgradeability proxies.

New Security Patterns & Techniques

While we’ve seen the introduction of new security upgrades and tools, the craft of smart contract development has also made strides.

Adding features safely
Let’s say you’re a developer learning Solidity and you just created your first ERC20 token. If you need to add a new feature to the token, you’ll want to add as little extra code as possible.

For example, you want to allow any token holder to lock tokens for others. To do this, you might:

  1. Add an extra lock function that transfers the tokens and stores when they should be released.
  2. Modify the standard transfer functions to honor these additional restrictions.

Below we have the lock function, which receives the beneficiary, value to be locked, and fund release time. It immediately transfers the tokens and adds a lock object to the lock array for that beneficiary.

Then, we need to modify the standard transfer functions in order to prevent the beneficiary from sending those tokens before the locks have expired. We do so by adding the canTransfer function modifier, which in turn checks that the amount sent is lower than the transferable tokens for that address at that moment.

And to calculate that, we just iterate over the array of locks for an address, add up the amount of tokens still locked, and return the total balance minus the locked tokens.
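The steps above might be sketched as follows (a simplified, pre-0.5 Solidity sketch; `StandardToken` is assumed to be an existing ERC20 implementation, and names like `Lock` and `transferableTokens` are illustrative):

```solidity
contract LockableToken is StandardToken {
    struct Lock { uint256 value; uint256 releaseTime; }

    // Anyone can lock tokens for any beneficiary.
    mapping(address => Lock[]) public locks;

    // Transfers tokens immediately and records when they unlock.
    function lock(address beneficiary, uint256 value, uint256 releaseTime) public {
        transfer(beneficiary, value);
        locks[beneficiary].push(Lock(value, releaseTime));
    }

    // Rejects transfers that would spend still-locked tokens.
    modifier canTransfer(address sender, uint256 value) {
        require(value <= transferableTokens(sender, now));
        _;
    }

    function transfer(address to, uint256 value)
        public canTransfer(msg.sender, value) returns (bool)
    {
        return super.transfer(to, value);
    }

    // Iterates over every lock for the holder -- this loop runs on
    // every transfer attempt.
    function transferableTokens(address holder, uint256 time)
        public constant returns (uint256)
    {
        uint256 lockedAmount = 0;
        for (uint256 i = 0; i < locks[holder].length; i++) {
            if (locks[holder][i].releaseTime > time) {
                lockedAmount += locks[holder][i].value;
            }
        }
        return balanceOf(holder) - lockedAmount;
    }
}
```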

But there’s a problem: we have a gassy array! Because anyone can lock tokens for an address, the length of the array can be controlled by an attacker, making it arbitrarily long. Given that this function is called every time a transfer is attempted, if the array is too long (>5000 elements in this example), the transaction will cost too much gas to fit in a single block.

While this could be fixed by adding a max amount of locks per address, there’s an alternative method that would also work: a modular approach. The idea is to leave the token in its original form and implement the locking functionality as an external contract. When a user wants to lock tokens, they instantiate a version of this token timelock, specifying the beneficiary and release time.
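A minimal timelock along these lines (sketched after the TokenTimelock pattern in OpenZeppelin; details simplified, pre-0.5 style):

```solidity
// Holds ERC20 tokens for a beneficiary until a release time.
// The token contract itself is left completely untouched.
contract TokenTimelock {
    ERC20 public token;
    address public beneficiary;
    uint256 public releaseTime;

    function TokenTimelock(ERC20 _token, address _beneficiary, uint256 _releaseTime) public {
        require(_releaseTime > now);
        token = _token;
        beneficiary = _beneficiary;
        releaseTime = _releaseTime;
    }

    // Anyone may trigger the release once the lock has expired.
    function release() public {
        require(now >= releaseTime);
        uint256 amount = token.balanceOf(this);
        require(amount > 0);
        token.transfer(beneficiary, amount);
    }
}
```

To lock tokens, a user deploys a TokenTimelock and transfers tokens to its address; a bug in the timelock can only affect the tokens sent to it, never the token contract itself.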

Now, you have equivalent functionality without modifying the token code. This means that any bug in the locking functionality will not affect the normal operation of the token.

The point is to be very careful when adding new features. Adding features is never a risk-free process. Every extra line of code expands the attack surface. If you decide to add functionality, consider following the modular approach described above.

We’ve seen this approach in the evolution of crowdsale code. Initially, all crowdsales were implemented as part of the token code. Today, almost all crowdsales are separate contracts. It doesn’t make sense for a token to carry crowdsale-handling code forever when the crowdsale itself takes place over a finite amount of time.

What Can Software Engineering Teach Us About Smart Contract Security?

While there are many new security patterns and techniques we can use to increase smart contract security, there are also some basic software engineering practices that everyone should be adopting.

  1. Clear and simple code is always better for security. All security problems come from the difference between the programmer’s intention and what the code actually does.
  2. Naming is key. The harder it is to read the code, the harder it is to audit and maintain.
  3. Reuse existing code. You can’t imagine how many times we’ve audited reimplementations of the ERC20 standard token.
  4. Don’t copy-paste code. When possible, import it as a dependency instead.
  5. Don’t repeat yourself. If you see repeated logic in your codebase, extract it to generic functions, create function modifiers, or create more generic contracts.
  6. Write tests. This is the best way to see if your code matches your intentions. It also keeps you safer from regression problems when modifying the code.

Security and Trust Reduction

Shifting to the relationship between security and trust reduction, let’s say you’re a freelance smart contract developer and your client asked for a capped crowdsale.

Afterward, your client wants to add some functionality, and asks for the crowdsale to give tokens to the foundation. So you make the contract Ownable and add a function for finalization that is only accessible by the owner.

Now the client comes back and asks you to make it trustless. So, you remove the Ownable dependency and add the finalization logic when the amount raised reaches the cap.

But now, this change makes your contract vulnerable to an attack! Any investor can call the buyTokens function with msg.value equal to 0 and repeat the foundation’s token minting arbitrarily, ruining your crowdsale. And did we really remove trust from the crowdsale by leaving the finalization logic in the hands of the last investor? Usually, leaving critical functionality of a contract in the hands of the public is a bad idea, as it gives attackers more tools.
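A hedged sketch of the vulnerable pattern (`buyTokens` follows the text; everything else, including `MintableToken` and the constants, is illustrative):

```solidity
contract CappedCrowdsale {
    uint256 constant RATE = 1000;              // tokens per wei (illustrative)
    uint256 constant FOUNDATION_AMOUNT = 1e24; // illustrative

    MintableToken public token;
    address public foundation;
    uint256 public cap;
    uint256 public weiRaised;

    function buyTokens() public payable {
        require(weiRaised + msg.value <= cap);
        weiRaised += msg.value;
        token.mint(msg.sender, msg.value * RATE);

        // "Trustless" finalization: mint the foundation's share once the
        // cap is reached. But weiRaised stays equal to cap afterwards, so
        // any investor can call buyTokens() with msg.value == 0 and mint
        // the foundation's tokens over and over.
        if (weiRaised == cap) {
            token.mint(foundation, FOUNDATION_AMOUNT);
        }
    }
}
```

A one-time `finalized` flag would stop the re-minting, but as the text notes, the owner-triggered version avoided the issue entirely.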

The lesson here: Sometimes it’s ok to reduce trustlessness in order to increase security.

Open Problems in Smart Contract Development/Security

Finally, I want to cover some open problems in smart contract development.

The biggest issues in smart contract development today are:

  1. Code duplication—contracts using the same functionality redeploy the same code
  2. Upgradeability—both in terms of improving and bug-fixing deployed contracts
  3. Interoperability—among various smart contracts
  4. Gas costs—the expenses involved in running smart contract operations

We’re tackling these limitations with our upcoming project: zeppelin_os. It’s an open-source effort to improve the smart contract development and management experience.

For code duplication on the blockchain, we’re converting OpenZeppelin to an on-chain library, which we call the Kernel. This way, contracts can link the library dynamically, reducing their bytecode size.

The Kernel will also be upgradeable via a proxy mechanism, and all contracts connecting with zeppelin_os will receive free bugfixes. Upgrades will be controlled by a token-based governance system.

Two additional components of the OS are the scheduler and the marketplace.

Consider two contracts talking to each other, such as a multisig calling a privileged method in a crowdsale. Who pays for the gas cost of that operation? With most implementations of multisig wallets, it’s the last owner of the multisig signing the transaction.

With the scheduler, the multisig can emit an event when it’s ready to execute the operation, and anyone (not necessarily the multisig owners) can trigger that call and pay the gas costs. This same mechanism can be used to enable smart contracts to do asynchronous execution by requesting the network to call them back at a later time.

Moreover, if we want interesting interactions between contracts, we need better tools. The marketplace will be a set of standards and tools to allow smart contracts to charge for their services. We’re standardizing the ways smart contracts can request payment for their services (e.g., per-call, monthly, or one-time fees), and also how other contracts can pay for them.

Finally, we’re creating a full suite of off-chain tools to manage, deploy, and develop smart contracts more easily with greater security.

All this is part of the zeppelin_os project, and at Zeppelin, we aim to build the best infrastructure and tools for the industry.